May 23, 2024, 2:19 p.m. | /u/dark_surfer

Machine Learning www.reddit.com

https://preview.redd.it/8l04pnfhq62d1.png?width=661&format=png&auto=webp&s=7fe616ca8cd7da974070c86b6b47ffab3ab545e5

---------------------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------------------------------------------------------------

https://preview.redd.it/hr7fr1uiq62d1.png?width=688&format=png&auto=webp&s=bd3de359bfe4c1ed82d092be92ae38c246bdfda2

---------------------------------------------------------------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------------------------------------------------------------

https://preview.redd.it/v6k3v39kq62d1.png?width=450&format=png&auto=webp&s=c0abb0e397a498ef7ccfb35b1b1cb598198f66ad



For anyone looking to compare the Phi-3 benchmarks in one place.

Interesting comparisons for: ANLI, Hellaswag, MedQA, TriviaQA, Language understanding, Factual Knowledge and Robustness.

Note: Phi-3 mini model table have labels in different order.

benchmarks knowledge labels language language understanding machinelearning phi phi-3 robustness table understanding

Senior Data Engineer

@ Displate | Warsaw

Decision Scientist

@ Tesco Bengaluru | Bengaluru, India

Senior Technical Marketing Engineer (AI/ML-powered Cloud Security)

@ Palo Alto Networks | Santa Clara, CA, United States

Associate Director, Technology & Data Lead - Remote

@ Novartis | East Hanover

Product Manager, Generative AI

@ Adobe | San Jose

Associate Director – Data Architect Corporate Functions

@ Novartis | Prague