Effectiveness of Mining Audio and Text Pairs from Public Data for Improving ASR Systems for Low-Resource Languages. (arXiv:2208.12666v1 [cs.CL])
Aug. 29, 2022, 1:13 a.m. | Kaushal Santosh Bhogale, Abhigyan Raman, Tahir Javed, Sumanth Doddapaneni, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra
cs.CL updates on arXiv.org arxiv.org
End-to-end (E2E) models have become the default choice for state-of-the-art
speech recognition systems. Such models are trained on large amounts of
labelled data, which are often not available for low-resource languages.
Techniques such as self-supervised learning and transfer learning hold promise,
but have not yet been effective in training accurate models. On the other hand,
collecting labelled datasets across a diverse set of domains and speakers is very
expensive. In this work, we demonstrate an inexpensive and effective
alternative to …