Feb. 9, 2024, 5:42 a.m. | Junhong Shen, Neil Tenenholtz, James Brian Hall, David Alvarez-Melis, Nicolo Fusi

cs.LG updates on arXiv.org

Large Language Models (LLMs) have demonstrated remarkable proficiency in understanding and generating natural language. However, their capabilities wane in highly specialized domains underrepresented in the pretraining corpus, such as physical and biomedical sciences. This work explores how to repurpose general LLMs into effective task solvers for specialized domains. We introduce a novel, model-agnostic framework for learning custom input tags, which are parameterized as continuous vectors appended to the LLM's embedding layer, to condition the LLM. We design two types of …
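To make the conditioning idea concrete, below is a minimal sketch of how learned input tags could be wired into a frozen causal LM. The class and function names (`InputTag`, `forward_with_tag`), the number of tag vectors, and the assumption that the tag is prepended to the token embeddings are illustrative choices, not the authors' released implementation; the only interface assumed from the model is the Hugging Face-style `get_input_embeddings()` / `inputs_embeds` convention.

```python
import torch
import torch.nn as nn


class InputTag(nn.Module):
    """A learnable tag: a small block of continuous vectors in the model's
    embedding space, used to condition a frozen LLM. (Illustrative sketch.)"""

    def __init__(self, embed_dim: int, num_vectors: int = 4):
        super().__init__()
        # Initialize near zero so training starts close to the base model's behavior.
        self.vectors = nn.Parameter(torch.randn(num_vectors, embed_dim) * 0.01)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, embed_dim)
        batch = token_embeds.size(0)
        tag = self.vectors.unsqueeze(0).expand(batch, -1, -1)
        # Prepend the tag vectors to the input embeddings (one possible placement).
        return torch.cat([tag, token_embeds], dim=1)


def forward_with_tag(llm, tag: InputTag, input_ids: torch.Tensor):
    """Run a frozen causal LM with a learned tag injected at the embedding layer.
    Assumes `llm` exposes `get_input_embeddings()` and accepts `inputs_embeds`."""
    token_embeds = llm.get_input_embeddings()(input_ids)
    conditioned = tag(token_embeds)
    return llm(inputs_embeds=conditioned)
```

In such a setup only the tag vectors would be optimized while the LLM's weights stay frozen, which is what makes this kind of conditioning model-agnostic: the same mechanism applies to any model that exposes its embedding layer.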

