Web: http://arxiv.org/abs/2209.06581

Sept. 15, 2022, 1:11 a.m. | H.A.Z. Sameen Shahgir, Khondker Salman Sayeed, Tanjeem Azwad Zaman

cs.LG updates on arXiv.org

Speech is inherently continuous: discrete words, phonemes and other units
are not clearly segmented, which is why speech recognition has been an active
research problem for decades. In this work we fine-tune wav2vec 2.0 to
recognize and transcribe Bengali speech, training it on the Bengali Common
Voice Speech Dataset. After training for 71 epochs on a training set of
36,919 mp3 files, we achieved a training loss of 0.3172 and a word error rate
(WER) of 0.2524 on a validation set …
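
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and a model's output, divided by the number of reference words. The sketch below illustrates that standard definition only. The abstract does not say which tool or text normalization the authors used, so the function name `word_error_rate`, the whitespace tokenization, and the sample sentences are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is a plain whitespace split, which is
    an assumption and may differ from the evaluation used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion
    # against a four-word reference.
    print(word_error_rate("the cat sat down", "the dog sat"))  # 0.5
```

Under this definition, a WER of 0.2524 corresponds to roughly one word-level error for every four reference words.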

