March 26, 2024, 4:52 a.m. | Yihong Liu, Peiqin Lin, Mingyang Wang, Hinrich Schütze

cs.CL updates on arXiv.org

arXiv:2311.08849v2 Announce Type: replace
Abstract: Instead of pretraining multilingual language models from scratch, a more efficient method is to adapt existing pretrained language models (PLMs) to new languages via vocabulary extension and continued pretraining. However, this method usually initializes the embeddings of new subwords randomly and adds substantially more embedding parameters to the model, which reduces efficiency. To address these issues, we propose a novel framework: $\textbf{O}$ne $\textbf{F}$or $\textbf{A}$ll ($\textbf{OFA}$), which wisely initializes the embeddings of unseen subwords and …
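The abstract describes initializing the embeddings of unseen subwords from existing knowledge rather than at random. Below is a minimal sketch of one such similarity-based initialization, assuming an external multilingual vector space that covers both old and new subwords; the function and variable names (`init_new_subword_embeddings`, `external_vecs`) are illustrative assumptions, not the paper's released implementation.

```python
# Hedged sketch: initialize new subword embeddings as similarity-weighted
# averages of existing subword embeddings, instead of random initialization.
# All names here are illustrative, not taken from the OFA codebase.
import numpy as np

def init_new_subword_embeddings(
    old_embeddings: np.ndarray,   # (|V_old|, d) embedding matrix of the source PLM
    old_vocab: dict,              # subword -> row index in old_embeddings
    new_subwords: list,           # subwords added by vocabulary extension
    external_vecs: dict,          # subword -> vector in an external multilingual space
    top_k: int = 10,
) -> np.ndarray:
    """Return a (len(new_subwords), d) matrix of initial embeddings."""
    d = old_embeddings.shape[1]

    # Old subwords that also have an external vector serve as anchors.
    anchors = [(w, external_vecs[w]) for w in old_vocab if w in external_vecs]
    anchor_words = [w for w, _ in anchors]
    anchor_mat = np.stack([v for _, v in anchors]).astype(np.float64)
    anchor_mat /= np.linalg.norm(anchor_mat, axis=1, keepdims=True) + 1e-12

    new_emb = np.empty((len(new_subwords), d), dtype=old_embeddings.dtype)
    for i, sw in enumerate(new_subwords):
        if sw not in external_vecs:
            # Fallback: mean of old embeddings when no external vector exists.
            new_emb[i] = old_embeddings.mean(axis=0)
            continue
        q = np.asarray(external_vecs[sw], dtype=np.float64)
        q /= np.linalg.norm(q) + 1e-12
        sims = anchor_mat @ q                      # cosine similarities to anchors
        top = np.argsort(-sims)[:top_k]            # indices of the most similar anchors
        weights = np.maximum(sims[top], 0.0)
        weights /= weights.sum() + 1e-12
        rows = [old_vocab[anchor_words[j]] for j in top]
        # Weighted average of the most similar old subword embeddings.
        new_emb[i] = weights @ old_embeddings[rows]
    return new_emb
```

The intuition is that a new subword whose external vector is close to those of known subwords likely behaves similarly in context, so starting from a blend of their embeddings gives continued pretraining a better starting point than random noise.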

