Improving language models fine-tuning with representation consistency targets. (arXiv:2205.11603v1 [cs.CL])
May 25, 2022, 1:11 a.m. | Anastasia Razdaibiedina, Vivek Madan, Zohar Karnin, Ashish Khetan, Vishaal Kapoor
cs.CL updates on arXiv.org
Fine-tuning contextualized representations learned by pre-trained language models has become standard practice in NLP. However, pre-trained representations are prone to degradation (also known as representation collapse) during fine-tuning, which leads to instability, suboptimal performance, and weak generalization. In this paper, we propose a novel fine-tuning method that avoids representation collapse by discouraging undesirable changes in the representations. We show that our approach matches or exceeds the performance of existing regularization-based fine-tuning methods across 13 …
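The truncated abstract does not spell out the exact form of the proposed consistency objective, so the sketch below only illustrates the general family the paper belongs to: regularization-based fine-tuning that penalizes drift of the fine-tuned representations away from the frozen pre-trained ones. The model name, the [CLS]-pooling choice, the MSE penalty, and the weight `lam` are all illustrative assumptions, not the authors' method.

```python
# Minimal sketch of regularization-based fine-tuning that discourages
# representation drift. This is NOT the paper's exact objective; it is
# a generic illustration of the idea the abstract describes.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # hypothetical choice of backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)       # copy being fine-tuned
reference = AutoModel.from_pretrained(model_name)   # frozen pre-trained copy
reference.eval()
for p in reference.parameters():
    p.requires_grad_(False)

classifier = torch.nn.Linear(model.config.hidden_size, 2)  # toy task head
optimizer = torch.optim.AdamW(
    list(model.parameters()) + list(classifier.parameters()), lr=2e-5
)
lam = 0.1  # regularization strength (hypothetical value)

def training_step(texts, labels):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state[:, 0]  # [CLS] representation
    with torch.no_grad():
        ref_hidden = reference(**batch).last_hidden_state[:, 0]
    task_loss = F.cross_entropy(classifier(hidden), labels)
    # Consistency term: keep fine-tuned representations close to the
    # pre-trained ones, mitigating representation collapse.
    consistency_loss = F.mse_loss(hidden, ref_hidden)
    loss = task_loss + lam * consistency_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with a toy batch (hypothetical data):
loss = training_step(["great movie", "terrible movie"], torch.tensor([1, 0]))
```

In this family of methods, the regularization weight trades off task accuracy against representation stability; approaches like the paper's differ mainly in which representations are anchored and how the penalty is defined.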
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Vice President, Data Science, Marketplace
@ Xometry | North Bethesda, Maryland, Lexington, KY, Remote
Field Solutions Developer IV, Generative AI, Google Cloud
@ Google | Toronto, ON, Canada; Atlanta, GA, USA