May 23, 2024, 2:19 p.m. | /u/dark_surfer

Machine Learning www.reddit.com

https://preview.redd.it/8l04pnfhq62d1.png?width=661&format=png&auto=webp&s=7fe616ca8cd7da974070c86b6b47ffab3ab545e5

---------------------------------------------------------------------------------------------------------------------------------------------------


https://preview.redd.it/hr7fr1uiq62d1.png?width=688&format=png&auto=webp&s=bd3de359bfe4c1ed82d092be92ae38c246bdfda2

---------------------------------------------------------------------------------------------------------------------------------------------------


https://preview.redd.it/v6k3v39kq62d1.png?width=450&format=png&auto=webp&s=c0abb0e397a498ef7ccfb35b1b1cb598198f66ad



For anyone looking to compare the Phi-3 benchmarks in one place.

Interesting comparisons: ANLI, HellaSwag, MedQA, TriviaQA, language understanding, factual knowledge, and robustness.

Note: the Phi-3 mini table lists its labels in a different order.

