April 19, 2024, 6:28 a.m. | /u/analyticalmonk

Machine Learning www.reddit.com

The latest releases of models such as GPT-4 and Claude have seen a significant jump in maximum context length (4/8k -> 128k+). The progress in the number of tokens these models can process sounds remarkable in percentage terms.

What has led to this? Is it purely a result of more compute becoming available during training, or are there algorithmic advances that have made it possible?
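To make the "algorithmic advances" part of the question concrete, one widely cited example is position interpolation for rotary position embeddings (RoPE), where positions are rescaled so a model pretrained on a short context can attend over a much longer one after light fine-tuning. The sketch below is illustrative only; function and parameter names (e.g. `rope_angles`, `trained_ctx`, `target_ctx`) are my own and not taken from any specific library or from the models mentioned above.

```python
# Minimal sketch of RoPE position interpolation (an assumed, illustrative
# implementation, not the actual GPT-4 or Claude recipe).
import torch


def rope_angles(positions: torch.Tensor, dim: int, base: float = 10000.0) -> torch.Tensor:
    """Standard RoPE rotation angles for each position and frequency pair."""
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    return positions[:, None].float() * inv_freq[None, :]


def interpolated_rope_angles(
    positions: torch.Tensor,
    dim: int,
    trained_ctx: int = 4096,      # context length seen during pretraining
    target_ctx: int = 131072,     # desired extended context length
    base: float = 10000.0,
) -> torch.Tensor:
    """Position interpolation: squeeze long positions back into the
    position range the model saw during pretraining."""
    scale = trained_ctx / target_ctx  # < 1 when extending the context
    return rope_angles(positions.float() * scale, dim, base)


if __name__ == "__main__":
    pos = torch.arange(131072)
    angles = interpolated_rope_angles(pos, dim=128)
    # The largest interpolated position now maps to the same angles as
    # position 4096 under the original encoding, keeping attention scores
    # closer to the pretraining distribution.
    print(angles.shape)  # torch.Size([131072, 64])
```

Compute still matters (longer sequences make attention more expensive), but tricks like this, together with efficient attention kernels, are the kind of algorithmic change being asked about.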

