April 26, 2024, 9:21 a.m. | /u/kiockete

Machine Learning www.reddit.com

In the video ["A little guide to building Large Language Models in 2024", at 41:38](https://youtu.be/2-SPH9hIKT8?t=2498), the author discusses the limits on how large the batch size can be.



>Well, if you start to have a very large batch size, the model for each optimization step makes less efficient use of each token, because the batch size is so big that each token is kind of washed out in the optimization step. And roughly, it's a …
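The "washed out" intuition above can be made concrete with plain mini-batch SGD, where per-example gradients are averaged before the update. A minimal sketch (assuming standard gradient averaging; `update_contribution` is a hypothetical helper, not from the video) showing that a single token's gradient carries a weight of `lr / batch_size` in the parameter update, so it shrinks as the batch grows:

```python
def update_contribution(batch_size: int, lr: float = 1e-3) -> float:
    """Weight of one example's gradient in a single averaged SGD step.

    Mini-batch SGD updates theta -= lr * mean(per-example grads),
    so example i contributes lr * g_i / batch_size to the update.
    """
    return lr / batch_size


if __name__ == "__main__":
    for b in (256, 4096, 1_048_576):
        print(f"batch={b:>9}: per-example weight = {update_contribution(b):.2e}")
```

Under this view, quadrupling the batch size quarters each token's influence on the step, which is why very large batches make less efficient use of each individual token unless the learning rate (or the number of tokens seen) is scaled accordingly.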

