[R] LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens - Microsoft 2024
Feb. 23, 2024, 4:23 p.m. | /u/Singularian2501
Machine Learning www.reddit.com
Abstract:
>Large context window is a desirable feature in large language models (LLMs). However, due to high fine-tuning costs, scarcity of long texts, and catastrophic values introduced by new token positions, current extended context windows are limited to around 128k tokens. This paper introduces LongRoPE that, for the first time, extends the context window of pre-trained LLMs to an impressive 2048k tokens, with up to only 1k fine-tuning steps at within 256k training lengths, while maintaining performance at …
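LongRoPE builds on rescaling the rotary position embedding (RoPE) so that positions beyond the training length map into the range the model has seen. The sketch below is a generic illustration of RoPE with per-frequency rescale factors (as in positional-interpolation schemes), not the paper's actual non-uniform search procedure; the function names and the `scales` parameter are assumptions for demonstration.

```python
import math

def rope_angles(pos, dim, base=10000.0, scales=None):
    """Rotation angles for one token position across dim/2 frequency pairs.

    scales: optional per-pair rescale factors; a factor s > 1 divides the
    effective position, stretching the usable context (positional
    interpolation). scales=None reproduces standard RoPE.
    """
    half = dim // 2
    if scales is None:
        scales = [1.0] * half
    angles = []
    for i in range(half):
        freq = base ** (-2 * i / dim)          # standard RoPE frequency schedule
        angles.append(pos * freq / scales[i])  # rescaled position for this pair
    return angles

def apply_rope(vec, pos, base=10000.0, scales=None):
    """Rotate consecutive coordinate pairs of vec by position-dependent angles."""
    out = list(vec)
    for i, theta in enumerate(rope_angles(pos, len(vec), base, scales)):
        x, y = vec[2 * i], vec[2 * i + 1]
        out[2 * i] = x * math.cos(theta) - y * math.sin(theta)
        out[2 * i + 1] = x * math.sin(theta) + y * math.cos(theta)
    return out
```

With a uniform scale factor of 2, position 2048 is rotated by the same angles as position 1024 under plain RoPE; LongRoPE's contribution is searching for non-uniform per-dimension factors rather than one global one.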