March 19, 2024, 5:23 p.m. | /u/danielhanchen


Hey r/MachineLearning! You might have seen my post on [Twitter](https://twitter.com/danielhanchen/status/1765446273661075609), but I'll post it here too in case you missed it: I found 8 bugs across multiple implementations of Google's Gemma :) The fixes should already be merged into HF transformers' main branch, and Keras, PyTorch Gemma, and vLLM should have gotten the fixes too :) [https://github.com/huggingface/transformers/pull/29402](https://github.com/huggingface/transformers/pull/29402)
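If you want the fixes before the next transformers release, here's a rough sketch of how I'd pull them in. The exact model id and the `pip` command are just my example, not part of the PR, and `device_map="auto"` assumes you have accelerate installed:

```python
# Rough sketch: grab transformers from main to pick up the Gemma fixes, e.g.
#   pip install --upgrade git+https://github.com/huggingface/transformers.git
# then load Gemma as usual.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b"  # example checkpoint, swap in the one you use
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Gemma weights are distributed in bfloat16
    device_map="auto",           # requires accelerate
)
```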

By comparing 5 implementations, I found the following issues:

1. Must add a `<bos>` token, or else losses will be very high (see the sketch after this list).
2. There’s a typo for model in the …
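For the first point, here's a minimal sketch of what the missing `<bos>` looks like at the tokenizer level (the model id is just an example):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
text = "The quick brown fox"

# The default (add_special_tokens=True) prepends <bos>, which Gemma expects.
with_bos = tokenizer(text, add_special_tokens=True).input_ids
# Dropping special tokens loses <bos>; training on these ids gives very high losses.
without_bos = tokenizer(text, add_special_tokens=False).input_ids

print(with_bos[0] == tokenizer.bos_token_id)     # True
print(without_bos[0] == tokenizer.bos_token_id)  # False
```

In short: make sure whatever data pipeline you use actually prepends `<bos>` (or leaves `add_special_tokens=True`).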

