March 17, 2024, 8:38 p.m. | /u/we_are_mammals

Machine Learning www.reddit.com

> We are releasing the base model weights and network architecture of Grok-1, our large language model. Grok-1 is a 314 billion parameter Mixture-of-Experts model trained from scratch by xAI.

> This is the raw base model checkpoint from the Grok-1 pre-training phase, which concluded in October 2023. This means that the model is not fine-tuned for any specific application, such as dialogue.

> We are releasing the weights and the architecture under the Apache 2.0 license.

> To get …
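
For readers unfamiliar with the "Mixture-of-Experts" term in the announcement, below is a minimal sketch of top-k expert routing in JAX. It is purely illustrative and not taken from the Grok-1 repository; the function name `moe_layer`, its parameter shapes, and the choice of `k=2` are assumptions for the example, not details of Grok-1's actual architecture.

```python
import jax
import jax.numpy as jnp

def moe_layer(x, gate_w, expert_w, k=2):
    """Route each token to its top-k experts and combine their outputs.

    x:        (tokens, d_model)          token activations
    gate_w:   (d_model, n_experts)       router weights
    expert_w: (n_experts, d_model, d_ff) one projection per expert
    """
    # Router scores and top-k expert selection per token.
    logits = x @ gate_w                                   # (tokens, n_experts)
    top_vals, top_idx = jax.lax.top_k(logits, k)          # (tokens, k)
    weights = jax.nn.softmax(top_vals, axis=-1)           # normalize over the chosen experts

    # Dense-for-clarity: run every expert, then gather the selected ones.
    # Real MoE systems dispatch tokens sparsely so only k experts run per token.
    all_out = jnp.einsum('td,edf->tef', x, expert_w)      # (tokens, n_experts, d_ff)
    picked = jnp.take_along_axis(all_out, top_idx[:, :, None], axis=1)  # (tokens, k, d_ff)
    return jnp.einsum('tk,tkf->tf', weights, picked)      # weighted sum of expert outputs

# Toy usage with made-up sizes (4 tokens, d_model=8, 4 experts, d_ff=16).
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (4, 8))
gate_w = jax.random.normal(key, (8, 4))
expert_w = jax.random.normal(key, (4, 8, 16))
y = moe_layer(x, gate_w, expert_w, k=2)                   # shape (4, 16)
```

The point of the sparse-routing trick is that parameter count (here, all 314B weights) can grow with the number of experts while per-token compute only scales with the k experts actually selected.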

