Jan. 14, 2022, 2:10 a.m. | Xutai Ma, Hongyu Gong, Danni Liu, Ann Lee, Yun Tang, Peng-Jen Chen, Wei-Ning Hsu, Phillip Koehn, Juan Pino

cs.CL updates on arXiv.org

We present a direct simultaneous speech-to-speech translation (Simul-S2ST)
model; furthermore, the generation of the translation is independent of
intermediate text representations. Our approach leverages recent progress on
direct speech-to-speech translation with discrete units, in which a sequence of
discrete representations learned in an unsupervised manner, rather than
continuous spectrogram features, is predicted by the model and passed directly
to a vocoder for on-the-fly speech synthesis. We also introduce variational
monotonic multihead attention (V-MMA) to handle the challenge of inefficient
policy …
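
As a rough illustration of the pipeline described in the abstract, below is a minimal sketch of a streaming discrete-unit loop: chunks of source speech are mapped to discrete target units, which are passed directly to a vocoder with no intermediate text step. All names here (DummyUnitPredictor, DummyVocoder, NUM_UNITS, CHUNK_FRAMES) are hypothetical placeholders under assumed settings, not the authors' actual components.

# Minimal sketch of a streaming discrete-unit Simul-S2ST loop (assumed, illustrative only).
import numpy as np

NUM_UNITS = 100      # size of the learned discrete-unit vocabulary (assumed)
CHUNK_FRAMES = 40    # source speech frames consumed per simultaneous step (assumed)

class DummyUnitPredictor:
    """Stand-in for the translation model: maps a chunk of source speech
    features to a short sequence of discrete target units."""
    def predict(self, speech_chunk: np.ndarray) -> list:
        # Placeholder policy: emit one unit per 10 input frames.
        n_units = max(1, len(speech_chunk) // 10)
        return list(np.random.randint(0, NUM_UNITS, size=n_units))

class DummyVocoder:
    """Stand-in for a unit-based vocoder: turns discrete units into a
    waveform segment (here, silence with a fixed length per unit)."""
    def synthesize(self, units: list) -> np.ndarray:
        return np.zeros(len(units) * 320, dtype=np.float32)  # 320 samples per unit (assumed)

def simul_s2st_stream(source_frames: np.ndarray):
    """Consume source speech incrementally and yield target audio segments."""
    model, vocoder = DummyUnitPredictor(), DummyVocoder()
    for start in range(0, len(source_frames), CHUNK_FRAMES):
        chunk = source_frames[start:start + CHUNK_FRAMES]
        units = model.predict(chunk)        # discrete units, no text step
        yield vocoder.synthesize(units)     # synthesized on-the-fly

if __name__ == "__main__":
    fake_source = np.random.randn(200, 80).astype(np.float32)  # 200 frames of 80-dim features
    total = sum(len(segment) for segment in simul_s2st_stream(fake_source))
    print(f"synthesized {total} samples incrementally")

The point of the sketch is only the data flow: units, not spectrograms or text, are the interface between the translation model and the vocoder, which is what allows synthesis to proceed chunk by chunk as the source speech arrives.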

arxiv attention speech translation
