Improving Scheduled Sampling with Elastic Weight Consolidation for Neural Machine Translation. (arXiv:2109.06308v2 [cs.CL] UPDATED)
cs.CL updates on arXiv.org
Despite strong performance in many sequence-to-sequence tasks, autoregressive
models trained with maximum likelihood estimation suffer from exposure bias,
i.e., the discrepancy between the ground-truth prefixes used during training and
the model-generated prefixes used at inference time. Scheduled sampling is a
simple and empirically successful approach which addresses this issue by
incorporating model-generated prefixes into training. However, it has been
argued that it is an inconsistent training objective leading to models ignoring
the prefixes altogether. In this paper, we conduct systematic …
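The scheduled-sampling idea described above — mixing model-generated tokens into the decoder's input prefix during training — can be sketched as a toy illustration. The function and parameter names (`scheduled_sampling_inputs`, `model_predict`, `sampling_prob`) are hypothetical; real NMT implementations operate on batched tensors and typically anneal the sampling probability over training.

```python
import random

def scheduled_sampling_inputs(gold_tokens, model_predict, sampling_prob, start_token=0):
    """Build decoder inputs for one sequence, mixing gold and model prefixes.

    Toy sketch of scheduled sampling: at each step, with probability
    `sampling_prob` the model's own previous prediction is fed as the next
    input instead of the ground-truth token.
    """
    prev = start_token
    inputs = []
    for gold in gold_tokens:
        inputs.append(prev)
        pred = model_predict(inputs)  # model's prediction given the prefix so far
        # Choose the next-step input: model output vs. ground truth.
        prev = pred if random.random() < sampling_prob else gold
    return inputs
```

With `sampling_prob=0` this reduces to ordinary teacher forcing (inputs are the gold prefix); with `sampling_prob=1` the model always conditions on its own predictions, matching inference-time behavior.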