Web: http://arxiv.org/abs/2205.04810

May 11, 2022, 1:11 a.m. | Lukas Edman, Antonio Toral, Gertjan van Noord

cs.CL updates on arXiv.org

This paper investigates very low-resource language model pretraining, when
fewer than 100 thousand sentences are available. We find that, in very
low-resource scenarios, statistical n-gram language models outperform
state-of-the-art neural models. Our experiments show that this is mainly due to
the former's focus on local context. As such, we introduce three methods
to improve a neural model's performance in the low-resource setting, finding
that limiting the model's self-attention is the most effective one, improving
on …
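The abstract's key idea is restricting a Transformer's self-attention to a local window so the neural model, like an n-gram model, conditions mostly on nearby tokens. Below is a minimal sketch of that idea, not the authors' implementation: the window size, mask construction, and function names are illustrative assumptions.

```python
# Illustrative sketch of windowed (local) self-attention in PyTorch.
# Not the paper's code; window size and helpers are hypothetical.
import torch

def local_attention_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask where entry [i, j] is True iff token i may attend to
    token j, i.e. |i - j| <= window."""
    positions = torch.arange(seq_len)
    return (positions[None, :] - positions[:, None]).abs() <= window

def local_self_attention(q, k, v, window: int):
    """Scaled dot-product attention restricted to a local window.
    q, k, v: (batch, seq_len, d_model)."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5           # (batch, seq, seq)
    mask = local_attention_mask(q.size(1), window).to(q.device)
    scores = scores.masked_fill(~mask, float("-inf"))      # block distant tokens
    return torch.softmax(scores, dim=-1) @ v

# Usage: with window=2, each token only sees 2 neighbours on either side,
# mimicking the short context an n-gram model uses.
q = k = v = torch.randn(1, 10, 64)
out = local_self_attention(q, k, v, window=2)
print(out.shape)  # torch.Size([1, 10, 64])
```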

Tags: arxiv, context, language modeling
