Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer
April 8, 2024, 4:46 a.m. | Hele-Andra Kuulmets, Taido Purason, Agnes Luhtaru, Mark Fishel
cs.CL updates on arXiv.org
Abstract: This paper explores cost-efficient methods to adapt pretrained Large Language Models (LLMs) to new lower-resource languages, with a specific focus on Estonian. Leveraging the Llama 2 model, we investigate the impact of combining cross-lingual instruction-tuning with additional monolingual pretraining. Our results demonstrate that even a relatively small amount of additional monolingual pretraining followed by cross-lingual instruction-tuning significantly enhances results on Estonian. Furthermore, we showcase cross-lingual knowledge transfer from high-quality English instructions to Estonian, resulting in …
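The paper's recipe, as summarized above, is two-stage: continued monolingual pretraining on Estonian text, followed by cross-lingual instruction tuning. As a rough illustration only (not the authors' code), the sketch below shows stage 1 with Hugging Face transformers. The model id is the public Llama 2 checkpoint (gated; requires access approval), while the corpus file name, sequence length, and hyperparameters are placeholder assumptions; memory-saving details such as LoRA/PEFT and mixed precision are omitted.

```python
# Minimal sketch of stage 1 (continued monolingual pretraining) of the
# two-stage adaptation recipe described in the abstract. Hypothetical
# file names and hyperparameters; not the authors' implementation.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

MODEL = "meta-llama/Llama-2-7b-hf"  # gated checkpoint on the HF Hub
tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Llama 2 has no pad token
model = AutoModelForCausalLM.from_pretrained(MODEL)

# "estonian_corpus.txt" is a placeholder for any plain-text Estonian corpus.
raw = load_dataset("text", data_files={"train": "estonian_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

train = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama2-et-cpt",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=2e-5,
    ),
    train_dataset=train,
    # mlm=False gives the standard causal-LM next-token objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Stage 2 (cross-lingual instruction tuning) would follow the same pattern:
# fine-tune the stage-1 checkpoint on English plus Estonian
# instruction-response pairs rendered into a single prompt template.
```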