April 7, 2022, 1:11 a.m. | Yogesh Virkar, Marcello Federico, Robert Enyedi, Roberto Barra-Chicote

cs.LG updates on arXiv.org arxiv.org

The goal of automatic dubbing is to perform speech-to-speech translation
while achieving audiovisual coherence. This entails isochrony, i.e.,
translating the original speech by also matching its prosodic structure into
phrases and pauses, especially when the speaker's mouth is visible. In previous
work, we introduced a prosodic alignment model to address isochrone or
on-screen dubbing. In this work, we extend the prosodic alignment model to also
address off-screen dubbing that requires less stringent synchronization
constraints. We conduct experiments on four dubbing …

arxiv dubbing

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

DevOps Engineer (Data Team)

@ Reward Gateway | Sofia/Plovdiv