Scaling Language Models: Methods, Analysis & Insights from Training Gopher

Web: http://arxiv.org/abs/2112.11446

Jan. 24, 2022, 2:10 a.m. | Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, et al.

cs.CL updates on arXiv.org

Language modelling provides a step towards intelligent communication systems
by harnessing large repositories of written human knowledge to better predict
and understand the world. In this paper, we present an analysis of
Transformer-based language model performance across a wide range of model
scales -- from models with tens of millions of parameters up to a 280 billion
parameter model called Gopher. These models are evaluated on 152 diverse tasks,
achieving state-of-the-art performance across the majority. Gains from scale
are largest …
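As a back-of-the-envelope illustration of the model scales the abstract mentions, the sketch below estimates parameter counts for standard decoder-only Transformers. The helper `approx_params`, the example configurations, the 32,000-token vocabulary, and the 12·d² per-layer rule of thumb are illustrative assumptions, not hyperparameters taken from the paper.

```python
# A minimal sketch (not from the paper): rough parameter counts for
# decoder-only Transformers at scales from tens of millions of parameters
# up to the hundreds of billions discussed in the abstract.

def approx_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Rough parameter count for a standard decoder-only Transformer.

    Per layer: ~4*d^2 for attention (Q, K, V, output projections) plus
    ~8*d^2 for a 4x-expansion MLP, i.e. ~12*d^2 in total. The input
    embedding table adds vocab_size * d_model on top.
    """
    per_layer = 12 * d_model ** 2
    embedding = vocab_size * d_model
    return n_layers * per_layer + embedding

# Hypothetical configurations spanning the range of scales in the paper
# (the exact hyperparameters appear in the paper's tables, not here).
for name, n_layers, d_model in [
    ("small", 8, 512),
    ("mid", 24, 2048),
    ("Gopher-scale", 80, 16_384),
]:
    total = approx_params(n_layers, d_model, vocab_size=32_000)
    print(f"{name:>12}: ~{total / 1e6:,.0f}M parameters")
```

The rule of thumb lands in the right ballpark (roughly 258B for the largest configuration); the paper's reported 280 billion depends on details this sketch omits, such as the positional-encoding scheme and how embedding parameters are counted.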

Tags: analysis, arxiv, insights, language, language models, models, scaling, training
