March 13, 2024, 4:47 a.m. | Hanxu Hu, Pinzhen Chen, Edoardo M. Ponti

cs.CL updates on arXiv.org

arXiv:2403.07794v1 Announce Type: new
Abstract: Large language models (LLMs) struggle to follow a sequence of instructions in a single query, as they may ignore or misinterpret parts of it. This impairs their performance on complex problems whose solutions require multiple intermediate steps, such as multilingual (translate then answer) and multimodal (caption then answer) tasks. We empirically verify this with open-source LLMs as large as LLaMA-2 70B and Mixtral-8x7B. Targeting the scarcity of sequential instructions in present-day data, we propose sequential …
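To make the idea concrete: the abstract describes fine-tuning data in which a single query chains several instructions whose intermediate outputs feed the next step (e.g., translate, then answer). Below is a minimal Python sketch of how one might construct such a "translate then answer" training example. The QAExample fields, the prompt/target templates, and the make_sequential_example helper are illustrative assumptions, not the authors' actual pipeline.

# Hypothetical sketch of building a sequential-instruction fine-tuning
# example; field names and templates are assumptions, not the paper's code.
from dataclasses import dataclass

@dataclass
class QAExample:
    question: str      # question in a non-English source language
    translation: str   # English translation of the question
    answer: str        # final answer

def make_sequential_example(ex: QAExample) -> dict:
    """Chain two instructions (translate, then answer) into one query;
    the target contains both the intermediate and the final output."""
    prompt = (
        "First, translate the question into English. "
        "Then, answer the translated question.\n"
        f"Question: {ex.question}"
    )
    target = (
        f"Translation: {ex.translation}\n"
        f"Answer: {ex.answer}"
    )
    return {"prompt": prompt, "completion": target}

if __name__ == "__main__":
    ex = QAExample(
        question="¿Cuál es la capital de Francia?",
        translation="What is the capital of France?",
        answer="Paris",
    )
    print(make_sequential_example(ex))

Keeping the intermediate translation in the target supervises the model on each step of the sequence rather than only the final answer, which is the point of training on sequential instructions.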
