Nov. 10, 2023, 3:02 p.m. | /u/eeeehhh

Machine Learning www.reddit.com

I have access to a single 80Gb A100 GPU and would like to train an LLM with GPT-like architecture from scratch. Does anyone know how to calculate the maximum model size.

a100 a100 gpu architecture gpt gpu llm machinelearning memory train

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote