Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information. (arXiv:2110.08420v2 [cs.CL] UPDATED)
Web: http://arxiv.org/abs/2110.08420
June 16, 2022, 1:12 a.m. | Kawin Ethayarajh, Yejin Choi, Swabha Swayamdipta
cs.CL updates on arXiv.org
Estimating the difficulty of a dataset typically involves comparing
state-of-the-art models to humans; the bigger the performance gap, the harder
the dataset is said to be. However, this comparison provides little
understanding of how difficult each instance in a given distribution is, or
what attributes make the dataset difficult for a given model. To address these
questions, we frame dataset difficulty -- w.r.t. a model $\mathcal{V}$ -- as
the lack of $\mathcal{V}$-$\textit{usable information}$ (Xu et al., 2019),
where a lower …
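The framing above can be made concrete with a small numerical sketch. It assumes the definition from Xu et al. in which $\mathcal{V}$-usable information is the difference between two entropies estimated on held-out data: the average negative log-probability a model from $\mathcal{V}$ assigns to the gold label without seeing the input, minus the same quantity when the model does see the input. The function names and the probability lists below are hypothetical and purely illustrative; in practice the probabilities would come from two trained models, which this sketch does not train.

```python
import math


def v_entropy(probs_of_gold):
    """Average negative log2-probability assigned to the gold labels (in bits)."""
    if not probs_of_gold:
        raise ValueError("need at least one probability")
    return -sum(math.log2(p) for p in probs_of_gold) / len(probs_of_gold)


def v_usable_information(probs_with_input, probs_null_input):
    """Estimate usable information as H(Y) - H(Y | X), both measured under the model family.

    probs_with_input: gold-label probabilities from a model that sees the input.
    probs_null_input: gold-label probabilities from a model that sees an empty input.
    Both lists are assumed to be aligned over the same held-out instances.
    """
    if len(probs_with_input) != len(probs_null_input):
        raise ValueError("probability lists must cover the same instances")
    return v_entropy(probs_null_input) - v_entropy(probs_with_input)


# Hypothetical held-out probabilities for a balanced binary task.
probs_null_input = [0.5, 0.5, 0.5, 0.5]
probs_with_input = [0.9, 0.8, 0.95, 0.6]

info_bits = v_usable_information(probs_with_input, probs_null_input)
print(f"estimated usable information: {info_bits:.3f} bits")
```

Under this reading, an estimate near zero means the model extracts almost nothing from the input that helps predict the label, while an estimate near the label entropy means the input makes the label almost fully predictable for that model.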