April 30, 2024, 4:50 a.m. | Guoliang Dong, Haoyu Wang, Jun Sun, Xinyu Wang

cs.CL updates on arXiv.org

arXiv:2404.18534v1 Announce Type: new
Abstract: Trained on text in various languages, large language models (LLMs) typically offer multilingual support and demonstrate remarkable capabilities in solving tasks described in different languages. However, LLMs can exhibit linguistic discrimination due to the uneven distribution of training data across languages. That is, LLMs struggle to respond consistently when given the same task described in different languages.
In this study, we first explore the consistency in the LLMs' …

