all AI news
SDA: Simple Discrete Augmentation for Contrastive Sentence Representation Learning
March 21, 2024, 4:48 a.m. | Dongsheng Zhu, Zhenyu Mao, Jinghui Lu, Rui Zhao, Fei Tan
cs.CL updates on arXiv.org arxiv.org
Abstract: Contrastive learning has recently achieved compelling performance in unsupervised sentence representation. As an essential element, data augmentation protocols, however, have not been well explored. The pioneering work SimCSE resorting to a simple dropout mechanism (viewed as continuous augmentation) surprisingly dominates discrete augmentations such as cropping, word deletion, and synonym replacement as reported. To understand the underlying rationales, we revisit existing approaches and attempt to hypothesize the desiderata of reasonable data augmentation methods: balance of semantic …
abstract arxiv augmentation continuous cs.cl data dropout element however performance representation representation learning simple type unsupervised word work
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne