April 19, 2024, 6:28 a.m. | /u/analyticalmonk

Machine Learning www.reddit.com

Latest releases of models such as GPT-4 and Claude have brought a significant jump in maximum context length (4/8k -> 128k+ tokens). The progress in the number of tokens these models can process sounds remarkable in percentage terms.
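For scale, here is a minimal back-of-the-envelope sketch of that percentage claim, assuming "4/8k" means 4,096/8,192 tokens and treating "128k" as 128,000 tokens:

```python
# Back-of-the-envelope arithmetic on the context-length jump described above.
# Assumption: "4/8k" = 4,096 / 8,192 tokens, "128k" = 128,000 tokens.
old_windows = [4_096, 8_192]   # earlier maximum context lengths (tokens)
new_window = 128_000           # newer advertised maximum context length (tokens)

for old in old_windows:
    factor = new_window / old
    pct_increase = (factor - 1) * 100
    print(f"{old:>6} -> {new_window}: {factor:.1f}x (~{pct_increase:,.0f}% increase)")
```

That works out to roughly a 16x to 31x increase over the earlier windows.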

What has led to this? Has it happened purely because more compute became available during training, or are there algorithmic advances behind it?

