June 15, 2024, 12:51 a.m. | Pranav Jadhav

Towards Data Science - Medium towardsdatascience.com

Define and train GPT-2 on your MacBook

Photo by Sergey Zolkin on Unsplash

My goal with this post is to walk you through defining and training GPT-2 from scratch with MLX, Apple’s machine-learning library for Apple silicon. I want to leave no stone unturned, from the tokenizer to sampling. In the spirit of Karpathy’s excellent GPT-from-scratch tutorial, we will train a model on the works of Shakespeare [1]. We will start with a blank Python file and …

