When More Data Hurts: A Troubling Quirk in Developing Broad-Coverage Natural Language Understanding Systems. (arXiv:2205.12228v1 [cs.CL])
May 25, 2022, 1:12 a.m. | Elias Stengel-Eskin, Emmanouil Antonios Platanios, Adam Pauls, Sam Thomson, Hao Fang, Benjamin Van Durme, Jason Eisner, Yu Su
cs.CL updates on arXiv.org
In natural language understanding (NLU) production systems, users' evolving
needs necessitate the addition of new features over time, indexed by new
symbols added to the meaning representation space. This requires additional
training data and results in ever-growing datasets. We present the first
systematic investigation into this incremental symbol learning scenario. Our
analyses reveal a troubling quirk in building (broad-coverage) NLU systems: as
the training dataset grows, more data is needed to learn new symbols, forming a
vicious cycle. We show …