July 25, 2022, 1:12 a.m. | Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min

cs.CL updates on arXiv.org

We summarize the results of a host of efforts using giant automatic speech
recognition (ASR) models pre-trained using large, diverse unlabeled datasets
containing approximately a million hours of audio. We find that the combination
of pre-training, self-training and scaling up model size greatly increases data
efficiency, even for extremely large tasks with tens of thousands of hours of
labeled data. In particular, on an ASR task with 34k hours of labeled data, by
fine-tuning an 8 billion parameter pre-trained Conformer …
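The recipe the abstract describes — pre-train a large encoder on unlabeled audio, then fine-tune it on a labeled ASR set — can be illustrated with a minimal sketch. The code below is not the paper's 8-billion-parameter Conformer or its training stack; it uses a hypothetical tiny PyTorch encoder, random stand-in data, and a CTC head purely to show the fine-tuning step (load pre-trained weights, attach a fresh task head, train with a small learning rate).

```python
# Minimal sketch of the pre-train-then-fine-tune recipe, under stated assumptions:
# a tiny stand-in encoder, random "labeled" data, and a CTC head. The actual work
# fine-tunes a giant pre-trained Conformer; none of the names below come from it.
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Stand-in for a pre-trained speech encoder (hypothetical)."""
    def __init__(self, feat_dim=80, hidden=256):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden)
        self.layers = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True),
            num_layers=2,
        )

    def forward(self, x):          # x: (batch, time, feat_dim)
        return self.layers(self.proj(x))

vocab_size = 32                    # e.g. graphemes plus a CTC blank
encoder = TinyEncoder()
# In practice the encoder weights come from unsupervised pre-training, e.g.:
# encoder.load_state_dict(torch.load("pretrained_encoder.pt"))  # hypothetical path
head = nn.Linear(256, vocab_size)  # fresh output layer for the labeled ASR task

optimizer = torch.optim.AdamW(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-5
)
ctc = nn.CTCLoss(blank=0, zero_infinity=True)

# One toy fine-tuning step on random data standing in for labeled utterances.
feats = torch.randn(4, 200, 80)                  # 4 utterances, 200 frames, 80-dim features
targets = torch.randint(1, vocab_size, (4, 20))  # 4 transcripts, 20 tokens each
logits = head(encoder(feats))                    # (batch, time, vocab)
log_probs = logits.log_softmax(-1).transpose(0, 1)  # CTC expects (time, batch, vocab)
loss = ctc(log_probs, targets,
           input_lengths=torch.full((4,), 200),
           target_lengths=torch.full((4,), 20))
loss.backward()
optimizer.step()
print(f"fine-tuning step loss: {loss.item():.3f}")
```

The sketch only captures the supervised fine-tuning stage; the pre-training and self-training stages the abstract credits for the data-efficiency gains are assumed to have already produced the encoder weights.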

arxiv, automatic speech recognition, learning, scale, semi-supervised, semi-supervised learning, speech, speech recognition, supervised learning
