May 8, 2024, 4:41 p.m. | /u/ApartmentEither4838

Machine Learning www.reddit.com

[Image: training loss curve] https://preview.redd.it/z5wmyi0nb8zc1.png?width=599&format=png&auto=webp&s=97e108bd749f9cf0874759f7ba0b8aafb3260640

Today I was training a small (11.07 million parameter) GPT model on a text dataset and came across this loss curve while training. Is there any explanation for why the loss first plateaus around 2.4 and then starts to fall exponentially thereafter? Also, why is there a sudden spike at around 1200 steps?

- The dataset I am using is the entire novel "One Hundred Years of Solitude"
- The total token count in the …
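One way to sanity-check the 2.4 plateau (my own hypothesis, not something stated in the post) is to compare it against the unigram entropy of the tokenized corpus: early in training a language model often matches token frequencies first, which keeps the mean cross-entropy near that entropy until it starts exploiting context. A minimal sketch, assuming the reported loss is mean cross-entropy in nats and that `token_ids` holds the tokenized novel:

```python
import math
from collections import Counter

def unigram_entropy(token_ids):
    """Cross-entropy (in nats) of the best possible unigram model of the corpus.

    A training loss that plateaus near this value is consistent with the model
    having learned token frequencies but not yet any context-dependent structure.
    """
    counts = Counter(token_ids)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Hypothetical usage: `token_ids` is the tokenized training text.
# token_ids = tokenizer.encode(open("one_hundred_years_of_solitude.txt").read())
# print(unigram_entropy(token_ids))  # compare against the ~2.4 plateau
```

If the computed value lands near 2.4, the plateau would be consistent with the model learning only token statistics before the later drop; the spike at ~1200 steps is a separate question (commonly attributed to optimizer or learning-rate effects, but not determinable from the post alone).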

