Flash-Decoding for long-context inference
Oct. 12, 2023, 5:59 p.m. | Tri Dao, Daniel Haziza, Francisco Massa, Grigory Sizov
Blog Content - TOGETHER www.together.xyz
Flash-Decoding significantly speeds up attention during inference, bringing up to 8x faster generation for very long sequences.
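The speedup comes from parallelizing attention over the length of the KV cache during decoding: the keys and values are split into chunks, each chunk is attended to independently, and the partial results are merged exactly using their log-sum-exp statistics. Below is a minimal NumPy sketch of that split-and-merge reduction for a single query vector; the function names and chunk count are illustrative, not part of the actual kernel implementation.

```python
import numpy as np

def attention_ref(q, K, V):
    """Standard softmax attention for a single query vector (reference)."""
    s = K @ q / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max())
    return (p / p.sum()) @ V

def flash_decoding_sketch(q, K, V, num_splits=4):
    """Split the KV cache along sequence length, attend to each chunk
    independently (parallel on a GPU), then merge the partial outputs
    with a log-sum-exp rescaling. Produces the exact same result as
    attention_ref, not an approximation."""
    d = q.shape[-1]
    outs, lses = [], []
    for idx in np.array_split(np.arange(len(K)), num_splits):
        s = K[idx] @ q / np.sqrt(d)           # scores for this chunk
        m = s.max()
        p = np.exp(s - m)
        outs.append((p / p.sum()) @ V[idx])   # chunk-normalized output
        lses.append(m + np.log(p.sum()))      # chunk log-sum-exp
    lses = np.array(lses)
    w = np.exp(lses - lses.max())
    w /= w.sum()                              # softmax over chunk LSEs
    return sum(wi * oi for wi, oi in zip(w, outs))
```

Because each chunk's weight in the merge is `exp(lse_i - logsumexp(lse))`, the recombined output matches full attention exactly; the benefit is that the per-chunk work can occupy the GPU even when the batch size and query length are 1, which is the long-context decoding regime the post targets.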