May 31, 2023, 10:55 a.m. | /u/shreyansh26

Artificial Intelligence www.reddit.com

I wrote up a blog post on Sophia, a new second-order optimizer that is showing encouraging results on LLM pretraining.

The Sophia paper makes good use of advanced optimization theory; I have included resources on that background in the blog post.

Blog - [https://shreyansh26.github.io/post/2023-05-28_sophia_scalable_second_order_optimizer_llms/](https://shreyansh26.github.io/post/2023-05-28_sophia_scalable_second_order_optimizer_llms/)

Annotated Paper - [Sophia Annotated Paper - GitHub](https://github.com/shreyansh26/Annotated-ML-Papers/blob/main/ML%20Theory/Sophia%20-%20A%20Scalable%20Stochastic%20Second-order%20Optimizer%20for%20Language%20Model%20Pretraining.pdf)
