Web: http://arxiv.org/abs/2205.04686

May 11, 2022, 1:11 a.m. | Chang Jin, Shigui Qiu, Nini Xiao, Hao Jia

cs.CL updates on arXiv.org

In Neural Machine Translation (NMT), data augmentation methods such as
back-translation have proven their effectiveness in improving translation
performance. In this paper, we propose a novel data augmentation approach for
NMT, which is independent of any additional training data. Our approach, AdMix,
consists of two parts: 1) introducing faint discrete noise (word replacement,
word dropping, word swapping) into the original sentence pairs to form
augmented samples; 2) generating new synthetic training data by softly mixing
the augmented samples with their …
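The three discrete-noise operations named in the abstract can be sketched in a few lines. The function below is an illustrative sketch only: the function name, the `noise_prob` parameter, and the uniform, independent application of each operation are assumptions for demonstration, not the paper's exact recipe (which the truncated abstract does not specify).

```python
import random

def add_discrete_noise(tokens, noise_prob=0.1, vocab=None):
    """Sketch of faint discrete noise: replacement, dropping, swapping.

    Hypothetical interface; AdMix's actual noise schedule and
    probabilities are assumptions here, not taken from the paper.
    """
    out = list(tokens)
    # Word replacement: swap a token for a random vocabulary word.
    if vocab:
        out = [random.choice(vocab) if random.random() < noise_prob else t
               for t in out]
    # Word dropping: delete tokens with small probability (keep >= 1 token).
    out = [t for t in out if random.random() >= noise_prob] or out[:1]
    # Word swapping: exchange one random adjacent pair.
    if len(out) > 1 and random.random() < noise_prob:
        i = random.randrange(len(out) - 1)
        out[i], out[i + 1] = out[i + 1], out[i]
    return out
```

With `noise_prob=0.0` the sentence passes through unchanged, so the noise is "faint" and tunable; the augmented samples would then be mixed with their originals as the abstract describes.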

