April 22, 2022, 8:06 a.m. | /u/sanderbaduk

Machine Learning www.reddit.com

I am classifying social media posts (Facebook, Instagram), where emojis can make up as much as 100% of the content. For example, you may want to tag "🤮🤮🤮" as in need of moderation, and "🤔🤔🤔" as prioritized for a response.

Looking for a good model to fine-tune, I found [BERTweet](https://huggingface.co/docs/transformers/model_doc/bertweet), which seems at least somewhat emoji-aware. However, it also produces a ton of out-of-vocabulary results, both for emoji and for semi-common English words, despite its liberal use of emoji.demojize and splitting up more …
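To make the out-of-vocabulary complaint concrete, here is a rough sketch of how one might measure an OOV rate on emoji-heavy posts. It makes several assumptions and is not BERTweet's actual pipeline. `demojize_stdlib` is a hypothetical standard-library stand-in for emoji.demojize that uses Unicode character names, so its tokens will not always match the names the emoji package emits. The vocabulary is a made-up set, and lookup is whole-token rather than subword.

```python
import unicodedata


def demojize_stdlib(text):
    """Hypothetical stand-in for emoji.demojize using only the stdlib.

    Replaces each character in Unicode category "So" (Symbol, other),
    which covers most pictographic emoji, with a :unicode_name: token.
    Modifiers such as variation selectors are passed through untouched.
    """
    out = []
    for ch in text:
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "unknown").lower().replace(" ", "_")
            out.append(f" :{name}: ")
        else:
            out.append(ch)
    return "".join(out)


def oov_rate(texts, vocab):
    """Fraction of whitespace-separated tokens that are not in vocab."""
    tokens = [tok for t in texts for tok in demojize_stdlib(t).split()]
    if not tokens:
        return 0.0
    missing = sum(1 for tok in tokens if tok.lower() not in vocab)
    return missing / len(tokens)


# Toy vocabulary, invented for illustration only.
toy_vocab = {":thinking_face:", "hello"}
posts = ["🤔🤔🤔", "🤮🤮🤮", "hello 🤮"]
print(f"OOV rate: {oov_rate(posts, toy_vocab):.2f}")
```

With a real model you would swap the toy set for the tokenizer's own vocabulary and count unknown-token hits instead, but the same ratio gives a quick way to compare candidate models on your own data.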
