April 19, 2024, 6:28 a.m. | /u/analyticalmonk

Machine Learning www.reddit.com

The latest releases of models such as GPT-4 and Claude have seen a significant jump in maximum context length (4/8k -> 128k+). The progress in the number of tokens these models can process sounds remarkable in percentage terms.

What has led to this? Is it purely a result of more compute becoming available during training, or are there algorithmic advances that have made it possible?
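To make the "algorithmic advances" part of the question concrete, one widely cited example is position interpolation for rotary position embeddings (RoPE), where positions are rescaled so a model pretrained on a short context can attend over a much longer one after light fine-tuning. The sketch below is illustrative only; function and parameter names (e.g. `rope_angles`, `trained_ctx`, `target_ctx`) are my own and not taken from any specific library or from the models mentioned above.

```python
# Minimal sketch of RoPE position interpolation (an assumed, illustrative
# implementation, not the actual GPT-4 or Claude recipe).
import torch


def rope_angles(positions: torch.Tensor, dim: int, base: float = 10000.0) -> torch.Tensor:
    """Standard RoPE rotation angles for each position and frequency pair."""
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    return positions[:, None].float() * inv_freq[None, :]


def interpolated_rope_angles(
    positions: torch.Tensor,
    dim: int,
    trained_ctx: int = 4096,      # context length seen during pretraining
    target_ctx: int = 131072,     # desired extended context length
    base: float = 10000.0,
) -> torch.Tensor:
    """Position interpolation: squeeze long positions back into the
    position range the model saw during pretraining."""
    scale = trained_ctx / target_ctx  # < 1 when extending the context
    return rope_angles(positions.float() * scale, dim, base)


if __name__ == "__main__":
    pos = torch.arange(131072)
    angles = interpolated_rope_angles(pos, dim=128)
    # The largest interpolated position now maps to the same angles as
    # position 4096 under the original encoding, keeping attention scores
    # closer to the pretraining distribution.
    print(angles.shape)  # torch.Size([131072, 64])
```

Compute still matters (longer sequences make attention more expensive), but tricks like this, together with efficient attention kernels, are the kind of algorithmic change being asked about.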

