April 9, 2022, 1 p.m. | Edan Meyer

Edan Meyer | www.youtube.com

Chinchilla is a massive language model released by DeepMind as part of a recent paper on scaling large language models in a compute-optimal manner. With only 70 billion parameters, it outperforms recent models such as GPT-3, Gopher, and Megatron-Turing NLG, which use hundreds of billions of parameters. DeepMind achieves this by training over 400 language models to find the optimal balance between model size and the amount of training data for a given compute budget.
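To make the idea concrete, here is a minimal sketch of the compute-optimal allocation the paper arrives at: parameters and training tokens should scale roughly equally with compute, which works out to the commonly cited rule of thumb of about 20 training tokens per parameter, with training FLOPs approximated by C ≈ 6·N·D. The function name and the exact constants below are illustrative assumptions, not code from the paper; they are chosen so that a Gopher-scale budget lands near Chinchilla's reported 70B parameters and 1.4T tokens.

```python
def compute_optimal_allocation(flops_budget: float) -> tuple[float, float]:
    """Sketch of the Chinchilla-style split of a FLOPs budget into (params, tokens)."""
    # Assumed heuristic from the paper's findings: ~20 training tokens per parameter.
    tokens_per_param = 20.0
    # Using the standard approximation C = 6 * N * D with D = 20 * N,
    # solve for N: N = sqrt(C / (6 * 20)).
    n_params = (flops_budget / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    # Roughly the Gopher/Chinchilla training budget (~5.76e23 FLOPs, an assumed figure).
    params, tokens = compute_optimal_allocation(5.76e23)
    print(f"params ≈ {params / 1e9:.0f}B, tokens ≈ {tokens / 1e12:.2f}T")
```

Running this sketch with the Gopher-scale budget yields roughly 69B parameters and 1.4T tokens, which is why Chinchilla, though far smaller than Gopher, is trained on several times more data.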

Outline:
0:00 - Overview
1:51 …

