Sept. 3, 2023, 12:56 p.m. | /u/Pan000

Machine Learning www.reddit.com

I'm the author of [TokenMonster](https://github.com/alasdairforsythe/tokenmonster), a free open-source tokenizer and vocabulary builder. I've posted on here a few times as the project has evolved, and each time I'm asked, "Have you tested it on a language model?"

Well, here it is. I spent $8,000 out of my own pocket and 2 months pretraining from scratch, finetuning, and evaluating 16 language models: 12 small models of 91M to 124M parameters and 4 medium models of 354M parameters.

[Here is the …

