CoCA-MDD: A Coupled Cross-Attention based Framework for Streaming Mispronunciation Detection and Diagnosis. (arXiv:2111.08191v2 [cs.CL] UPDATED)
June 30, 2022, 1:12 a.m. | Nianzu Zheng, Liqun Deng, Wenyong Huang, Yu Ting Yeung, Baohua Xu, Yuanyuan Guo, Yasheng Wang, Xiao Chen, Xin Jiang, Qun Liu
cs.CL updates on arXiv.org arxiv.org
Mispronunciation detection and diagnosis (MDD) is a popular research focus in
computer-aided pronunciation training (CAPT) systems. End-to-end (e2e)
approaches are becoming dominant in MDD. However, an e2e MDD model usually
requires entire speech utterances as input context, which leads to significant
latency, especially for long paragraphs. We propose a streaming e2e MDD
model called CoCA-MDD. We use a conv-transformer structure to encode input
speech in a streaming manner. A coupled cross-attention (CoCA) mechanism is
proposed to integrate frame-level acoustic features …