AdMix: A Mixed Sample Data Augmentation Method for Neural Machine Translation. (arXiv:2205.04686v1 [cs.CL])
cs.CL updates on arXiv.org
In Neural Machine Translation (NMT), data augmentation methods such as
back-translation have proven their effectiveness in improving translation
performance. In this paper, we propose a novel data augmentation approach for
NMT, which is independent of any additional training data. Our approach, AdMix,
consists of two parts: 1) introduce faint discrete noise (word replacement,
word dropping, word swapping) into the original sentence pairs to form
augmented samples; 2) generate new synthetic training data by softly mixing the
augmented samples with their …
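The first part of the approach, the discrete noise step, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the noise probability `p`, the choice of one operation per sentence, and the toy vocabulary are all assumptions made for illustration. The abstract does not say how the noise is distributed across the source and target sides, so the sketch operates on a single token list. The soft mixing step is not sketched.

```python
import random

def replace_words(tokens, vocab, p, rng):
    """Replace each token with a random vocabulary word with probability p."""
    return [rng.choice(vocab) if rng.random() < p else t for t in tokens]

def drop_words(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else tokens[:1]

def swap_words(tokens, p, rng):
    """Swap adjacent token pairs with probability p, scanning left to right."""
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if rng.random() < p:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the swapped pair so a token moves at most once
        else:
            i += 1
    return out

def add_faint_noise(tokens, vocab, p=0.1, seed=None):
    """Apply one randomly chosen noise operation (hypothetical policy).

    A small p keeps the noise "faint"; 0.1 is an assumed default,
    not a value taken from the paper.
    """
    rng = random.Random(seed)
    op = rng.choice(["replace", "drop", "swap"])
    if op == "replace":
        return replace_words(tokens, vocab, p, rng)
    if op == "drop":
        return drop_words(tokens, p, rng)
    return swap_words(tokens, p, rng)

if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    toy_vocab = ["dog", "house", "ran", "under", "a"]
    print(add_faint_noise(sentence, toy_vocab, p=0.2, seed=0))
```

Because the operations are random, the output depends on the seed. Keeping `p` small means most tokens stay unchanged, so the augmented sentence remains close to the original.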