March 19, 2024, 5:23 p.m. | /u/danielhanchen

Machine Learning www.reddit.com

Hey r/MachineLearning! You might have seen my post on [Twitter](https://twitter.com/danielhanchen/status/1765446273661075609), but in case you haven't: I found 8 bugs across multiple implementations of Google's Gemma :) The fixes have already been pushed to HF transformers' main branch, and Keras, PyTorch Gemma, and vLLM should have received the fix too :) [https://github.com/huggingface/transformers/pull/29402](https://github.com/huggingface/transformers/pull/29402)

By comparing 5 implementations, I found the following issues:

1. Must add `<bos>` or else losses will be very high.
2. There’s a typo for model in the …
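The `<bos>` point can be sketched in plain Python. This is a minimal illustration, not the actual fix from the PR: it assumes Gemma's `<bos>` id is 2 (as in the HF tokenizer) and uses made-up token ids for the rest of the sequence.

```python
# Illustrative sketch: Gemma expects every sequence to start with <bos>.
# BOS_ID = 2 matches the HF Gemma tokenizer's bos_token_id (assumption for this demo).
BOS_ID = 2

def ensure_bos(ids, bos_id=BOS_ID):
    """Prepend <bos> if it is missing; training/eval without it inflates the loss."""
    if ids and ids[0] == bos_id:
        return ids
    return [bos_id] + ids

# Token ids below are arbitrary placeholders, not real Gemma vocab entries.
print(ensure_bos([10, 11, 12]))   # -> [2, 10, 11, 12]
print(ensure_bos([2, 10, 11]))    # already has <bos>, left unchanged
```

In practice the HF tokenizer handles this for you when `add_bos_token=True`; the bug class here was pipelines tokenizing without it.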

