all AI news
TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes
April 8, 2024, 4:47 a.m. | Bibek Upadhayay, Vahid Behzadan
cs.CL updates on arXiv.org arxiv.org
Abstract: Creating multilingual LLMs poses a significant challenge. Pretraining or fine-tuning LLMs to adopt new languages is evidently very costly. Furthermore, there exist limitations concerning benchmark datasets and the metrics used to measure model performance in multilingual settings. This paper proposes cost-effective solutions to both aforementioned challenges. Firstly, we introduce the Multilingual Instruction-Tuning Dataset (MITS), comprised of Alpaca-52K, Dolly-15K, and Vicuna Benchmark translations into 132 languages. Secondly, we propose a new method called \emph{TaCo: Translation-Assisted Cross-Linguality}, …
arxiv cross-lingual cs.ai cs.cl languages llms low processes thought through transfer translation type
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Lead Data Modeler
@ Sherwin-Williams | Cleveland, OH, United States