Feb. 7, 2024, 5:48 a.m. | Liang-Hsuan Tseng En-Pei Hu Cheng-Han Chiang Yuan Tseng Hung-yi Lee Lin-shan Lee Shao-Hua Sun

cs.CL updates on arXiv.org arxiv.org

Unsupervised automatic speech recognition (ASR) aims to learn the mapping between the speech signal and its corresponding textual transcription without the supervision of paired speech-text data. A word/phoneme in the speech signal is represented by a segment of speech signal with variable length and unknown boundary, and this segmental structure makes learning the mapping between speech and text challenging, especially without paired data. In this paper, we propose REBORN, Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR. REBORN alternates …

asr automatic speech recognition cs.cl cs.sd data eess.as iterative learn mapping recognition reinforcement segment segmentation signal speech speech recognition supervision text textual training transcription unsupervised word

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

#13721 - Data Engineer - AI Model Testing

@ Qualitest | Miami, Florida, United States

Elasticsearch Administrator

@ ManTech | 201BF - Customer Site, Chantilly, VA