May 8, 2024, 4:41 p.m. | /u/ApartmentEither4838

Machine Learning www.reddit.com

Loss-curve screenshot: https://preview.redd.it/z5wmyi0nb8zc1.png?width=599&format=png&auto=webp&s=97e108bd749f9cf0874759f7ba0b8aafb3260640

Today I was training a small (11.07 million parameter) GPT model on a text dataset and came across this loss curve. Is there any explanation for why the loss first plateaus around 2.4 and then starts to fall exponentially thereafter? Also, why is there a sudden spike at around 1200 steps?

- The dataset I am using is the entire novel "One Hundred Years of Solitude"
- The total token count in the …
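(Not necessarily what happened in this run, but for context: sudden loss spikes in small-model training are often attributed to an occasional oversized gradient step, and the standard mitigation is gradient-norm clipping. A minimal NumPy sketch of the clipping rule, with a hypothetical `clip_grad_norm` helper:)

```python
import numpy as np

def clip_grad_norm(grads, max_norm):
    """Scale a list of gradient arrays so their global L2 norm is at most max_norm.

    Mirrors the usual "clip by global norm" rule: compute the norm over all
    parameter gradients together, then rescale every gradient by the same factor.
    """
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, max_norm / (total_norm + 1e-12))  # no-op when already small
    return [g * scale for g in grads], total_norm

# A spiky gradient (global norm 5.0) gets scaled down to the cap:
grads = [np.array([3.0, 4.0])]
clipped, norm = clip_grad_norm(grads, max_norm=1.0)
print(norm)                        # 5.0
print(np.linalg.norm(clipped[0]))  # ~1.0
```

In a PyTorch training loop the equivalent one-liner is `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)` between `loss.backward()` and `optimizer.step()`.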

