Think Twice: Measuring the Efficiency of Eliminating Prediction Shortcuts of Question Answering Models
Feb. 7, 2024, 5:48 a.m. | Lukáš Mikula, Michal Štefánik, Marek Petrovič, Petr Sojka
cs.CL updates on arXiv.org
We propose a simple method for measuring a scale of models' reliance on any identified spurious feature and assess the robustness towards …