Web: http://arxiv.org/abs/2111.06310

June 20, 2022, 1:12 a.m. | Zijian Yang, Yingbo Gao, Alexander Gerstenberger, Jintao Jiang, Ralf Schlüter, Hermann Ney

cs.CL updates on arXiv.org

To mitigate the problem of having to traverse the full vocabulary in the
softmax normalization of a neural language model, sampling-based training
criteria are proposed and investigated in the context of large-vocabulary
word-based neural language models. These training criteria typically enjoy the
benefit of faster training and testing, at the cost of slightly degraded
performance in terms of perplexity and almost no visible drop in word error
rate. While noise contrastive estimation is one of the most popular …
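As a rough illustration of the idea, noise contrastive estimation (one of the sampling-based criteria mentioned above) trains the model to discriminate the observed word from k samples drawn from a noise distribution q, using only unnormalized scores — so the full-vocabulary softmax sum is never computed. The sketch below is a minimal NumPy version of the general NCE loss for a single target position; the function and argument names are hypothetical and this is not the paper's exact implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(target_score, noise_scores, log_kq_target, log_kq_noise):
    """Binary NCE loss for one target word against k noise samples.

    target_score : unnormalized model score s(w) of the observed word
    noise_scores : array of scores s(w~) for the k sampled noise words
    log_kq_*     : log(k * q(.)) under the noise distribution q

    Only the target and the k samples are scored, avoiding the
    full-vocabulary softmax normalization. (Illustrative sketch only.)
    """
    # Probability the target came from the data rather than the noise
    pos = np.log(sigmoid(target_score - log_kq_target))
    # Probability each noise sample is correctly classified as noise
    neg = np.sum(np.log(sigmoid(-(noise_scores - log_kq_noise))))
    return -(pos + neg)
```

When the model scores the target far above the noise samples, the loss approaches zero; a uniform noise distribution over the vocabulary gives log_kq = log(k / |V|) for every word.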

arxiv language modeling neural sampling
