Feb. 7, 2024, 5:48 a.m. | Haoran Xu, Young Jin Kim, Amr Sharaf, Hany Hassan Awadalla

cs.CL updates on arXiv.org

Generative Large Language Models (LLMs) have achieved remarkable advancements in various NLP tasks. However, these advances have not carried over to the translation task, especially for models of moderate size (i.e., 7B or 13B parameters), which still lag behind conventional supervised encoder-decoder translation models. Previous studies have attempted to improve the translation capabilities of these moderate-sized LLMs, but the gains have been limited. In this study, we propose a novel fine-tuning approach for LLMs that is specifically designed for the …
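The abstract is truncated before it describes the proposed approach, so for context only, here is a minimal sketch of the conventional supervised fine-tuning baseline that such work builds on: training a moderate-sized causal LM on parallel sentence pairs. The model id, prompt template, and hyperparameters are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of conventional supervised fine-tuning of a causal LM on
# parallel translation pairs -- NOT the paper's proposed method (the
# abstract is truncated before describing it). The model id, prompt
# template, and hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumed 7B base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

# Toy parallel corpus; a real run would use WMT-style training data.
pairs = [("Das Haus ist klein.", "The house is small.")]

def format_example(src: str, tgt: str) -> str:
    # The target is appended to the prompt so the LM learns to
    # continue the prompt with its translation.
    return f"Translate German to English:\n{src}\n{tgt}{tokenizer.eos_token}"

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
for src, tgt in pairs:
    batch = tokenizer(format_example(src, tgt), return_tensors="pt")
    # Standard causal-LM loss: labels are the input ids; the model
    # shifts them internally for next-token prediction.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice the prompt tokens are usually masked out of the loss (set to -100 in the labels) so that only the target translation is supervised; the encoder-decoder baselines mentioned in the abstract get this separation architecturally for free.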

