OmniKnight: Multilingual Neural Machine Translation with Language-Specific Self-Distillation. (arXiv:2205.01620v1 [cs.CL])
cs.CL updates on arXiv.org
Although all-in-one-model multilingual neural machine translation (MNMT) has
achieved remarkable progress in recent years, its selected best overall
checkpoint fails to achieve the best performance simultaneously in all language
pairs. This is because the best checkpoints for the individual language pairs
(i.e., the language-specific best checkpoints) are scattered across different epochs. In
this paper, we present a novel training strategy dubbed Language-Specific
Self-Distillation (LSSD) for bridging the gap between language-specific best
checkpoints and the overall best checkpoint. In detail, we regard …
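The abstract is truncated before the method details, but the general idea of self-distillation toward a language-specific best checkpoint can be sketched as a standard knowledge-distillation loss: the student's cross-entropy on the gold target is blended with a KL term toward the teacher checkpoint's softened predictions. The function name, the weighting `alpha`, and the temperature scaling below are illustrative assumptions, not the paper's exact formulation.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(x / temperature) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def lssd_loss(student_logits, teacher_logits, target, alpha=0.5, temperature=2.0):
    """Hypothetical LSSD-style loss for a single token position.

    Blends cross-entropy on the gold target with KL divergence toward the
    language-specific teacher checkpoint's softened distribution. The alpha
    mixing weight and the t^2 scaling (standard in knowledge distillation)
    are assumptions here, not taken from the paper.
    """
    # Cross-entropy against the gold target
    p_student = softmax(student_logits)
    ce = -math.log(p_student[target])
    # KL(teacher || student) at temperature t, scaled by t^2
    t = temperature
    p_t = softmax(teacher_logits, t)
    p_s = softmax(student_logits, t)
    kd = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s)) * t * t
    return (1 - alpha) * ce + alpha * kd

# Toy usage: a 3-word vocabulary, gold target index 2.
loss = lssd_loss([0.1, 0.4, 2.0], [0.0, 0.5, 1.8], target=2)
```

When the student matches its teacher exactly, the KL term vanishes and the loss reduces to the weighted cross-entropy, which is what lets each language pair pull the shared model toward its own best checkpoint without a separate model per pair.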