Feb. 12, 2024, 5:43 a.m. | Parsa Kavehzadeh, Mojtaba Valipour, Marzieh Tahaei, Ali Ghodsi, Boxing Chen, Mehdi Rezagholizadeh

cs.LG updates on arXiv.org

Large language models (LLMs) have revolutionized natural language processing (NLP) by excelling at understanding and generating human-like text. However, their widespread deployment can be prohibitively expensive. SortedNet is a recent training technique that enables dynamic inference by leveraging the modularity of networks: sub-models are sorted by their computation/accuracy trade-off in a nested manner. We extend SortedNet to generative NLP tasks, making large language models dynamic without any pre-training and by only replacing standard fine-tuning (SFT) with Sorted Fine-Tuning (SoFT). Our approach …
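To make the nested sub-model idea concrete, here is a minimal sketch of a Sorted Fine-Tuning-style training step. This is not the authors' implementation: it assumes a toy transformer stack with a shared output head, early exits at a few intermediate layers, and a simple average of the per-exit losses; names such as `ToyLM`, `soft_loss`, and `exit_layers` are hypothetical.

```python
# Hedged sketch of a SoFT-style objective: each "sub-model" is the first k
# layers of the same network, so deeper sub-models nest the shallower ones
# and one forward pass yields losses for every exit depth.
import torch
import torch.nn as nn

class ToyLM(nn.Module):
    def __init__(self, vocab_size=100, d_model=64, n_layers=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(n_layers)
        )
        self.head = nn.Linear(d_model, vocab_size)  # shared across all exits

    def hidden_states(self, tokens):
        # Collect the hidden state after every layer in a single pass.
        h = self.embed(tokens)
        states = []
        for layer in self.layers:
            h = layer(h)
            states.append(h)
        return states

def soft_loss(model, tokens, targets, exit_layers=(2, 4, 6, 8)):
    """Average the LM loss over nested sub-models; the sub-model exiting
    at layer k reuses all computation of the shallower sub-models."""
    states = model.hidden_states(tokens)
    loss_fn = nn.CrossEntropyLoss()
    total = 0.0
    for k in exit_layers:
        logits = model.head(states[k - 1])  # early exit at layer k
        total = total + loss_fn(logits.flatten(0, 1), targets.flatten())
    return total / len(exit_layers)

# Usage: one SoFT step costs roughly one forward pass through the full model.
model = ToyLM()
tokens = torch.randint(0, 100, (2, 16))
targets = torch.randint(0, 100, (2, 16))
loss = soft_loss(model, tokens, targets)
loss.backward()
```

At inference time the same nesting allows a compute budget to pick an exit depth: a cheap request decodes from an intermediate layer, while a quality-sensitive one runs the full stack, with no separate models to train or store.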

