Feb. 26, 2024, 5:42 a.m. | Frédéric Piedboeuf, Philippe Langlais

cs.LG updates on arXiv.org

arXiv:2402.14895v1 Announce Type: cross
Abstract: Textual data augmentation (DA) is a prolific field of study where novel techniques to create artificial data are regularly proposed, and that has demonstrated great efficiency on small data settings, at least for text classification tasks. In this paper, we challenge those results, showing that classical data augmentation is simply a way of performing better fine-tuning, and that spending more time fine-tuning before applying data augmentation negates its effect. This is a significant contribution as …

