April 23, 2024, 1:13 a.m. | /u/yusuf-bengio

Machine Learning www.reddit.com

Heard from two independent sources at MSFT (one close to Sebastien Bubeck) about the upcoming Phi-3 models:

* Three different sized models (up to 14B)
* Again, mostly synthetic and LLM-augmented training data
* Apparently some upscaling techniques on the training side
* No more Apache 2.0, but a more restrictive license (similar to Llama 3)
* Mixtral-level performance with far fewer parameters
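The "upscaling" bullet is vague, but one published technique it could refer to is depth up-scaling (used for SOLAR 10.7B): duplicate a pretrained model's transformer blocks with an overlap dropped at the seam, then continue pretraining the deeper stack. The sketch below is purely illustrative of that recipe, not confirmed Phi-3 detail; the `drop` parameter and the plain-list stand-in for transformer blocks are my assumptions.

```python
import copy

def depth_upscale(layers, drop):
    # Sketch of SOLAR-style depth up-scaling (not a confirmed Phi-3 method):
    # concatenate the first (n - drop) layers with the last (n - drop) layers
    # of the same model, giving 2n - 2*drop layers to continue pretraining.
    n = len(layers)
    assert 0 < drop < n
    front = copy.deepcopy(layers[: n - drop])
    back = copy.deepcopy(layers[drop:])
    return front + back

# e.g. a 32-layer base model with 8 layers dropped at the seam -> 48 layers
base = list(range(32))            # stand-in for the list of transformer blocks
scaled = depth_upscale(base, 8)
print(len(scaled))                # 48
```

The seam drop exists because the top layers of one copy feeding into the bottom layers of the other creates a representation mismatch; removing the overlapping layers reportedly eases the continued-pretraining phase.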

I wanted to see if anyone has more insider information about the models.

