June 30, 2022, 1:12 a.m. | Nianzu Zheng, Liqun Deng, Wenyong Huang, Yu Ting Yeung, Baohua Xu, Yuanyuan Guo, Yasheng Wang, Xiao Chen, Xin Jiang, Qun Liu

cs.CL updates on arXiv.org arxiv.org

Mispronunciation detection and diagnosis (MDD) is a popular research focus in
computer-aided pronunciation training (CAPT) systems. End-to-end (e2e)
approaches are becoming dominant in MDD. However, an e2e MDD model usually
requires entire speech utterances as input context, which leads to significant
latency, especially for long paragraphs. We propose a streaming e2e MDD
model called CoCA-MDD. We use a conv-transformer structure to encode input
speech in a streaming manner. A coupled cross-attention (CoCA) mechanism is
proposed to integrate frame-level acoustic features …
