Web: http://arxiv.org/abs/2112.08352

May 6, 2022, 1:12 a.m. | Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Pino, Jiatao Gu, Wei-Ning H

cs.LG updates on arXiv.org arxiv.org

We present a textless speech-to-speech translation (S2ST) system that can
translate speech from one language into another language and can be built
without the need of any text data. Different from existing work in the
literature, we tackle the challenge in modeling multi-speaker target speech and
train the systems with real-world S2ST data. The key to our approach is a
self-supervised unit-based speech normalization technique, which finetunes a
pre-trained speech encoder with paired audios from multiple speakers and a
single …

arxiv data on speech translation

More from arxiv.org / cs.LG updates on arXiv.org

Director, Applied Mathematics & Computational Research Division

@ Lawrence Berkeley National Lab | Berkeley, Ca

Business Data Analyst

@ MainStreet Family Care | Birmingham, AL

Assistant/Associate Professor of the Practice in Business Analytics

@ Georgetown University McDonough School of Business | Washington DC

Senior Data Science Writer

@ NannyML | Remote

Director of AI/ML Engineering

@ Armis Industries | Remote (US only), St. Louis, California

Digital Analytics Manager

@ Patagonia | Ventura, California