Feb. 7, 2024, 5:48 a.m. | Haoran Xu, Young Jin Kim, Amr Sharaf, Hany Hassan Awadalla

cs.CL updates on arXiv.org

Generative Large Language Models (LLMs) have achieved remarkable advancements in various NLP tasks. However, these advances have not carried over to the translation task, especially for models of moderate size (i.e., 7B or 13B parameters), which still lag behind conventional supervised encoder-decoder translation models. Previous studies have attempted to improve the translation capabilities of these moderate-sized LLMs, but the gains have been limited. In this study, we propose a novel fine-tuning approach for LLMs that is specifically designed for the …
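The abstract is truncated before it describes the proposed approach, so for context only, here is a minimal sketch of the conventional supervised fine-tuning baseline that such work builds on: training a moderate-sized causal LM on parallel sentence pairs. The model id, prompt template, and hyperparameters are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of conventional supervised fine-tuning of a causal LM on
# parallel translation pairs -- NOT the paper's proposed method (the
# abstract is truncated before describing it). The model id, prompt
# template, and hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumed 7B base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

# Toy parallel corpus; a real run would use WMT-style training data.
pairs = [("Das Haus ist klein.", "The house is small.")]

def format_example(src: str, tgt: str) -> str:
    # The target is appended to the prompt so the LM learns to
    # continue the prompt with its translation.
    return f"Translate German to English:\n{src}\n{tgt}{tokenizer.eos_token}"

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
for src, tgt in pairs:
    batch = tokenizer(format_example(src, tgt), return_tensors="pt")
    # Standard causal-LM loss: labels are the input ids; the model
    # shifts them internally for next-token prediction.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice the prompt tokens are usually masked out of the loss (set to -100 in the labels) so that only the target translation is supervised; the encoder-decoder baselines mentioned in the abstract get this separation architecturally for free.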

