Mixed-Distil-BERT: Code-mixed Language Modeling for Bangla, English, and Hindi
March 15, 2024, 4:48 a.m. | Md Nishat Raihan, Dhiman Goswami, Antara Mahmud
cs.CL updates on arXiv.org
Abstract: One of the most popular downstream tasks in the field of Natural Language Processing is text classification. Text classification becomes more daunting when the texts are code-mixed. Although they are not exposed to such text during pre-training, different BERT models have demonstrated success in tackling Code-Mixed NLP challenges. To enhance their performance, Code-Mixed NLP models have also depended on combining synthetic data with real-world data. It is crucial to understand how …