Nov. 15, 2022, 2:16 a.m. | Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, Yu-An Chung, Paden Tomasello, Paul-Ambroise Duquenne, Holger Schwenk, Hongyu Gong, Hir

cs.CL updates on arXiv.org arxiv.org

We study speech-to-speech translation (S2ST), which translates speech from one
language into another, and focus on building systems to support
languages without standard text writing systems. We use English-Taiwanese
Hokkien as a case study and present an end-to-end solution covering training data
collection, modeling choices, and benchmark dataset release. First, we present
efforts on creating human-annotated data, automatically mining data from large
unlabeled speech datasets, and adopting pseudo-labeling to produce weakly
supervised data. On the modeling, we take …

arxiv language speech speech-to-speech translation translation
