Web: http://arxiv.org/abs/2206.08446

June 20, 2022, 1:12 a.m. | Michal Štefánik

cs.CL updates on arXiv.org arxiv.org

Despite their outstanding performance, large language models (LLMs) suffer from
notorious flaws related to their preference for simple, surface-level textual
relations over the full semantic complexity of the problem. This proposal
investigates a common denominator of this problem in their weak ability to
generalise outside of the training domain. We survey diverse research
directions providing estimations of model generalisation ability and find that
incorporating some of these measures in the training objectives leads to
enhanced distributional robustness of neural models. Based on …

