May 11, 2022, 1:11 a.m. | Chang Jin, Shigui Qiu, Nini Xiao, Hao Jia

cs.CL updates on arXiv.org

In Neural Machine Translation (NMT), data augmentation methods such as
back-translation have proven their effectiveness in improving translation
performance. In this paper, we propose a novel data augmentation approach for
NMT, which is independent of any additional training data. Our approach, AdMix,
consists of two parts: 1) introducing faint discrete noise (word replacement,
word dropping, word swapping) into the original sentence pairs to form
augmented samples; 2) generating new synthetic training data by softly mixing the
augmented samples with their …
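
Below is a minimal Python sketch of the two steps outlined in the abstract, assuming token-level noise operations and a mixup-style interpolation of embeddings for the "soft mixing" step; the exact mixing scheme is not given in the truncated abstract, and all function and variable names here are illustrative rather than taken from the paper's code.

```python
# Sketch of: 1) faint discrete noise on a sentence, 2) soft mixing of
# original and augmented representations (assumed to be mixup-style).
import random
import numpy as np

def add_discrete_noise(tokens, vocab, p=0.1):
    """Apply faint word replacement, dropping, and swapping to a token list."""
    noisy = list(tokens)
    # Word replacement: swap a few tokens for random vocabulary entries.
    for i in range(len(noisy)):
        if random.random() < p:
            noisy[i] = random.choice(vocab)
    # Word dropping: remove a few tokens, but keep at least one.
    kept = [t for t in noisy if random.random() >= p]
    noisy = kept if kept else noisy[:1]
    # Word swapping: exchange one pair of adjacent tokens.
    if len(noisy) > 1 and random.random() < p:
        i = random.randrange(len(noisy) - 1)
        noisy[i], noisy[i + 1] = noisy[i + 1], noisy[i]
    return noisy

def soft_mix(orig_emb, aug_emb, alpha=0.2):
    """Mixup-style interpolation between original and augmented embeddings."""
    lam = np.random.beta(alpha, alpha)
    return lam * orig_emb + (1.0 - lam) * aug_emb

# Toy usage: augment one sentence and mix stand-in embedding matrices.
vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
sentence = ["the", "cat", "sat", "on", "the", "mat"]
augmented = add_discrete_noise(sentence, vocab)
orig_emb = np.random.randn(len(sentence), 8)  # placeholder for real embeddings
aug_emb = np.random.randn(len(sentence), 8)   # placeholder, same shape for mixing
mixed = soft_mix(orig_emb, aug_emb)
```

In practice the mixing would operate on the encoder's embedding layer for aligned original/augmented pairs; here equal-shaped random arrays stand in for those embeddings.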

Tags: arxiv, augmentation, data, machine, machine translation, mixed, neural machine translation, translation
