Feb. 16, 2024, 5:47 a.m. | Xinran Chen, Sufeng Duan, Gongshen Liu

cs.CL updates on arXiv.org arxiv.org

arXiv:2402.09725v1 Announce Type: new
Abstract: Being one of the IR-NAT (Iterative-refinemennt-based NAT) frameworks, the Conditional Masked Language Model (CMLM) adopts the mask-predict paradigm to re-predict the masked low-confidence tokens. However, CMLM suffers from the data distribution discrepancy between training and inference, where the observed tokens are generated differently in the two cases. In this paper, we address this problem with the training approaches of error exposure and consistency regularization (EECR). We construct the mixed sequences based on model prediction during …

abstract arxiv confidence cs.ai cs.cl data distribution error frameworks generated inference iterative language language model low machine machine translation nat paradigm regularization tokens training translation type

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York