Oct. 27, 2022, 1:16 a.m. | Xuan-Phi Nguyen, Sravya Popuri, Changhan Wang, Yun Tang, Ilia Kulikov, Hongyu Gong

cs.CL updates on arXiv.org arxiv.org

Direct speech-to-speech translation (S2ST) is among the most challenging
problems in the translation paradigm due to the significant scarcity of S2ST
data. While efforts have been made to increase the data size from unlabeled
speech by cascading pretrained speech recognition (ASR), machine translation
(MT), and text-to-speech (TTS) models, unlabeled text has remained relatively
under-utilized for improving S2ST. We propose an effective way to utilize the
massive existing unlabeled text from different languages to create a large
amount of S2ST data …

