all AI news
When Does Monolingual Data Help Multilingual Translation: The Role of Domain and Model Scale
April 2, 2024, 7:52 p.m. | Christos Baziotis, Biao Zhang, Alexandra Birch, Barry Haddow
cs.CL updates on arXiv.org arxiv.org
Abstract: Multilingual machine translation (MMT), trained on a mixture of parallel and monolingual data, is key for improving translation in low-resource language pairs. However, the literature offers conflicting results on the performance of different methods of including monolingual data. To resolve this, we examine how denoising autoencoding (DAE) and backtranslation (BT) impact MMT under different data conditions and model scales. Unlike prior studies, we use a realistic dataset of 100 translation directions and consider many domain …
abstract arxiv cs.cl data domain however improving key language literature low machine machine translation multilingual performance results role scale translation type
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Business Data Scientist, gTech Ads
@ Google | Mexico City, CDMX, Mexico
Lead, Data Analytics Operations
@ Zocdoc | Pune, Maharashtra, India