AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference
March 5, 2024, 2:45 p.m. | Xuanlei Zhao, Shenggan Cheng, Guangyang Lu, Jiarui Fang, Haotian Zhou, Bin Jia, Ziming Liu, Yang You
cs.LG updates on arXiv.org arxiv.org
Abstract: Large deep learning models have achieved impressive performance across a range of applications. However, their large memory requirements, including parameter memory and activation memory, have become a significant challenge for their practical serving. While existing methods mainly address parameter memory, the importance of activation memory has been overlooked. Especially for long input sequences, activation memory is expected to experience a significant exponential growth as the length of sequences increases. In this approach, we propose AutoChunk, …
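The general idea of chunking activations along the sequence dimension can be illustrated with a minimal sketch. This is not AutoChunk's algorithm, which the truncated abstract above does not describe. The toy position-wise layer, the function names, and the sizes are all hypothetical, and peak memory is approximated by counting the intermediate values held at once.

```python
def expand(seq, width):
    # Hypothetical position-wise layer: each input value produces `width`
    # intermediate activations (a toy stand-in for a wide hidden layer).
    return [[max(0.0, x * (j + 1) - 1.0) for j in range(width)] for x in seq]


def reduce_rows(hidden):
    # Collapse each position's intermediate activations back to one value.
    return [sum(row) for row in hidden]


def run_full(seq, width):
    # Materializes intermediates for the whole sequence at once.
    hidden = expand(seq, width)
    return reduce_rows(hidden), len(hidden) * width


def run_chunked(seq, width, chunk_size):
    # Processes `chunk_size` positions at a time, so only one chunk's
    # intermediates are alive at any moment.
    out, peak = [], 0
    for start in range(0, len(seq), chunk_size):
        hidden = expand(seq[start:start + chunk_size], width)
        peak = max(peak, len(hidden) * width)
        out.extend(reduce_rows(hidden))
    return out, peak


seq = [0.5 * i for i in range(10)]
full_out, full_peak = run_full(seq, width=4)
chunk_out, chunk_peak = run_chunked(seq, width=4, chunk_size=3)
print(full_out == chunk_out, full_peak, chunk_peak)
```

Because the toy layer treats each position independently, both paths return the same outputs, while the chunked path holds 12 intermediate values at its peak instead of 40. Operations that mix positions would need more careful handling than this sketch shows.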