Linguistically-driven Multi-task Pre-training for Low-resource Neural Machine Translation. (arXiv:2201.08070v1 [cs.CL])
cs.CL updates on arXiv.org
In the present study, we propose novel sequence-to-sequence pre-training
objectives for low-resource neural machine translation (NMT): Japanese-specific
sequence to sequence (JASS) for language pairs involving Japanese as the source
or target language, and English-specific sequence to sequence (ENSS) for
language pairs involving English. JASS focuses on masking and reordering
Japanese linguistic units known as bunsetsu, whereas ENSS is based on
phrase-structure masking and reordering tasks. Experiments on the ASPEC
Japanese--English and Japanese--Chinese, Wikipedia Japanese--Chinese, and News
English--Korean corpora demonstrate that JASS …
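The core idea of both objectives is to corrupt a sentence at the level of linguistic units (bunsetsu for JASS, phrases for ENSS) by masking and reordering them, and to train the encoder-decoder to reconstruct the original sentence. A minimal sketch of this kind of corruption, assuming chunk-segmented input and a plain `[MASK]` token (a hypothetical helper, not the paper's implementation):

```python
import random

MASK = "[MASK]"

def make_pretraining_pair(chunks, mask_prob=0.35, seed=0):
    """Build a (source, target) pair for chunk-level masking-and-reordering
    pre-training. `chunks` is a list of linguistic units (e.g. bunsetsu for
    Japanese, phrases for English), each a string.
    Illustrative sketch only; the paper's exact objectives may differ.
    """
    rng = random.Random(seed)
    # Masking: replace a random subset of chunks with a mask token.
    masked = [MASK if rng.random() < mask_prob else c for c in chunks]
    # Reordering: permute the (masked) chunks so the model must also
    # recover the original chunk order, not just the masked content.
    order = list(range(len(masked)))
    rng.shuffle(order)
    source = " ".join(masked[i] for i in order)
    target = " ".join(chunks)  # decoder reconstructs the original sentence
    return source, target

# Example with bunsetsu-like chunks of a Japanese sentence.
src, tgt = make_pretraining_pair(["私は", "昨日", "本を", "読んだ"], seed=1)
```

The corrupted `src` feeds the encoder and the intact `tgt` is the decoder's reconstruction target, analogous to denoising pre-training schemes such as BART/mBART but operating on syntactic chunks rather than subword spans.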