June 26, 2024, 4:42 a.m. | Yixuan Wang, Baoxin Wang, Yijun Liu, Qingfu Zhu, Dayong Wu, Wanxiang Che

cs.CL updates on arXiv.org

arXiv:2406.17456v1 Announce Type: new
Abstract: Data augmentation through synthetic data is widely used in Grammatical Error Correction (GEC) to alleviate data scarcity. However, synthetic data is mainly used in the pre-training phase rather than the data-limited fine-tuning phase, because of its inconsistent error distribution and noisy labels. In this paper, we propose a synthetic data construction method based on contextual augmentation, which can ensure an efficient augmentation of the original data with …

