April 23, 2024, 1:13 a.m. | /u/yusuf-bengio

Machine Learning www.reddit.com

Heard from two independent sources at MSFT (one close to Sébastien Bubeck) about the upcoming Phi-3 models:

* Three models at different sizes (up to 14B parameters)
* Again trained mostly on synthetic and LLM-augmented data
* Apparently some upscaling techniques on the training side
* No longer Apache 2.0, but a more restrictive license (similar to Llama 3's)
* Mixtral-level performance with far fewer parameters

I wanted to see if anyone has more insider information about these models.
