all AI news
Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation
March 26, 2024, 4:44 a.m. | Kartik, Sanjana Soni, Anoop Kunchukuttan, Tanmoy Chakraborty, Md Shad Akhtar
cs.LG updates on arXiv.org arxiv.org
Abstract: The widespread online communication in a modern multilingual world has provided opportunities to blend more than one language (aka code-mixed language) in a single utterance. This has resulted a formidable challenge for the computational models due to the scarcity of annotated data and presence of noise. A potential solution to mitigate the data scarcity problem in low-resource setup is to leverage existing data in resource-rich language through translation. In this paper, we tackle the problem …
abstract annotated data arxiv blend challenge code communication computational cs.cl cs.lg data language mixed modern multilingual opportunities robust synthetic synthetic data translation type world
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Senior Data Scientist
@ ITE Management | New York City, United States