March 4, 2022, 2:12 a.m. | Kaiqi Fu, Shaojun Gao, Kai Wang, Wei Li, Xiaohai Tian, Zejun Ma

cs.LG updates on arXiv.org

Deep learning-based pronunciation scoring models rely heavily on the
availability of annotated non-native data, which is costly to collect and
hard to scale. To deal with this data scarcity problem, data augmentation
is commonly used for model pretraining. In this paper, we propose phone-level
mixup, a simple yet effective data augmentation method, to improve the
performance of word-level pronunciation scoring. Specifically, given a phoneme
sequence from a lexicon, an artificially augmented word sample can be generated
by randomly sampling from the …
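
The abstract is cut off before it says what is sampled, so the following is only a rough sketch of per-phoneme random sampling and concatenation, not the authors' method. It assumes a hypothetical pool (`segment_pool`) that maps each phoneme label to previously collected phone-level feature segments; the function name `sample_word`, the pool layout, and the toy values are all illustrative assumptions.

```python
import random


def sample_word(phonemes, segment_pool, rng=None):
    """Assemble one artificial word sample.

    For each phoneme in the lexicon sequence, pick one stored segment
    at random from the (hypothetical) pool and append its frames.
    """
    rng = rng or random.Random()
    frames = []
    for phoneme in phonemes:
        candidates = segment_pool[phoneme]  # all stored segments for this phoneme
        frames.extend(rng.choice(candidates))  # one random segment per phoneme
    return frames


# Toy pool (assumption): each segment is a list of 2-dimensional frame vectors.
pool = {
    "K": [[[0.1, 0.2]], [[0.3, 0.1], [0.2, 0.2]]],
    "AE": [[[0.5, 0.4], [0.6, 0.4]]],
    "T": [[[0.9, 0.8]], [[0.7, 0.9]]],
}

# Phoneme sequence for a word as it might appear in a lexicon.
word = sample_word(["K", "AE", "T"], pool, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the sampled word reproducible, which can help when comparing augmentation settings.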

