Quoting Phi-3 Technical Report
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone.
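The headline claim here is that a 3.8B-parameter model is small enough to run on-device. As a rough illustration of that size class (my addition, not from the quoted report), here is a minimal sketch that loads the instruct checkpoint Microsoft published on Hugging Face via the transformers library; the exact model ID and whether trust_remote_code=True is still required depend on your transformers version.

```python
# Minimal sketch: load and prompt phi-3-mini with Hugging Face transformers.
# Assumes the "microsoft/Phi-3-mini-4k-instruct" checkpoint; older transformers
# releases needed trust_remote_code=True to pick up the custom model code.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",        # use the checkpoint's native precision
    trust_remote_code=True,    # may be unnecessary on newer transformers
)

prompt = "Summarize why small language models are useful on phones."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For actual phone deployment you would typically use a quantized build (e.g. 4-bit weights via llama.cpp or ONNX Runtime) rather than full-precision PyTorch; this sketch only demonstrates how accessible a 3.8B model is on commodity hardware.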