Web: http://arxiv.org/abs/2011.12649

Jan. 27, 2022, 2:10 a.m. | Martijn Bartelds, Wietse de Vries, Faraz Sanal, Caitlin Richter, Mark Liberman, Martijn Wieling

cs.CL updates on arXiv.org

Variation in speech is often quantified by comparing phonetic transcriptions
of the same utterance. However, manually transcribing speech is time-consuming
and error-prone. As an alternative, we therefore investigate the extraction of
acoustic embeddings from several self-supervised neural models. We use these
representations to compute word-based pronunciation differences between
non-native and native speakers of English, and between Norwegian dialect
speakers. For comparison with several earlier studies, we evaluate how well
these differences match human perception by comparing them with available …

