all AI news
Curated Datasets and Neural Models for Machine Translation of Informal Registers between Mayan and Spanish Vernaculars
April 12, 2024, 4:42 a.m. | Andr\'es Lou, Juan Antonio P\'erez-Ortiz, Felipe S\'anchez-Mart\'inez, V\'ictor M. S\'anchez-Cartagena
cs.LG updates on arXiv.org arxiv.org
Abstract: The Mayan languages comprise a language family with an ancient history, millions of speakers, and immense cultural value, that, nevertheless, remains severely underrepresented in terms of resources and global exposure. In this paper we develop, curate, and publicly release a set of corpora in several Mayan languages spoken in Guatemala and Southern Mexico, which we call MayanV. The datasets are parallel with Spanish, the dominant language of the region, and are taken from official native …
arxiv cs.cl cs.lg datasets machine machine translation spanish translation type
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Software Engineer for AI Training Data (School Specific)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Python)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Tier 2)
@ G2i Inc | Remote
Data Engineer
@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania
Artificial Intelligence – Bioinformatic Expert
@ University of Texas Medical Branch | Galveston, TX
Lead Developer (AI)
@ Cere Network | San Francisco, US