Web: http://arxiv.org/abs/2205.02670

May 6, 2022, 1:11 a.m. | Wei Wei, Huang Hengguan, Gu Xiangming, Wang Hao, Wang Ye

cs.LG updates on arXiv.org

Content mismatch usually occurs when data from one modality is translated to
another, e.g. language learners producing mispronunciations (errors in speech)
when reading a sentence (target text) aloud. However, most existing alignment
algorithms assume the content involved in the two modalities is perfectly
matched, which makes it difficult to locate such mismatches between speech
and text. In this work, we develop an unsupervised learning algorithm that can
infer the relationship between content-mismatched cross-modal sequential data,
especially for speech-text sequences. …

