CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks
April 24, 2023, 11:17 p.m. | Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Ré, Ce Zhang
Blog Content - TOGETHER www.together.xyz
Distributed training of foundation models, especially large language models
(LLMs), is communication-intensive and so has heavily relied on centralized
data centers with fast interconnects. Can we train on slow networks and
unlock the potential of decentralized infrastructure for foundation models?
In this paper, we propose CocktailSGD, a novel communication-efficient
training framework that combines three distinct compression techniques --
random sparsification, top-K sparsification, and quantization -- to achieve
much greater compression than each individual technique alone. We justify
the benefit of …
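
The abstract's key idea is chaining the three compressors so their compression ratios multiply. As a rough sketch only (the function, fractions, and bit width below are illustrative assumptions, not the paper's implementation), a CocktailSGD-style gradient compressor might look like this in Python:

import numpy as np

def cocktail_compress(grad, random_frac=0.1, topk_frac=0.1, num_bits=4):
    # Illustrative CocktailSGD-style compressor (assumed parameters, not
    # the paper's configuration): random sparsification, then top-K on
    # the survivors, then uniform quantization of the kept values.
    flat = grad.ravel()
    n = flat.size

    # 1) Random sparsification: sample a random subset of coordinates.
    keep = max(1, int(n * random_frac))
    rand_idx = np.random.choice(n, size=keep, replace=False)
    sampled = flat[rand_idx]

    # 2) Top-K sparsification: keep the largest-magnitude survivors.
    k = max(1, int(sampled.size * topk_frac))
    top = np.argpartition(np.abs(sampled), -k)[-k:]
    idx, vals = rand_idx[top], sampled[top]

    # 3) Quantization: map kept values onto 2**num_bits uniform levels.
    levels = 2 ** num_bits - 1
    lo, hi = vals.min(), vals.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.round((vals - lo) / scale).astype(np.uint8)
    return idx, q, lo, scale  # enough to rebuild a sparse update

def decompress(idx, q, lo, scale, shape):
    # Reconstruct a dense (mostly zero) gradient from the compressed form.
    out = np.zeros(int(np.prod(shape)), dtype=np.float32)
    out[idx] = q.astype(np.float32) * scale + lo
    return out.reshape(shape)

With the illustrative defaults above, only about 1% of the coordinates survive the two sparsification stages, and each survivor is sent as a 4-bit code plus an index, which is why stacking the techniques compresses far more than any one of them alone.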