Sept. 16, 2022, 1:16 a.m. | Piji Li

cs.CL updates on arXiv.org arxiv.org

The task of Chinese Spelling Check (CSC) is aiming to detect and correct
spelling errors that can be found in the text. While manually annotating a
high-quality dataset is expensive and time-consuming, thus the scale of the
training dataset is usually very small (e.g., SIGHAN15 only contains 2339
samples for training), therefore supervised-learning based models usually
suffer the data sparsity limitation and over-fitting issue, especially in the
era of big language models. In this paper, we are dedicated to investigating …

arxiv language language models unsupervised

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Data Management Associate

@ EcoVadis | Ebène, Mauritius

Senior Data Engineer

@ Telstra | Telstra ICC Bengaluru