E-Sparse: Boosting the Large Language Model Inference through Entropy-based N:M Sparsity
March 25, 2024, 4:42 a.m. | Yun Li, Lin Niu, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang
cs.LG updates on arXiv.org arxiv.org
Abstract: Traditional pruning methods are known to be difficult to apply to Large Language Models (LLMs) for Generative AI because of their unaffordable training cost and large computational demands. For the first time, we introduce the information entropy of hidden-state features into the design of a pruning metric, named E-Sparse, to improve the accuracy of N:M sparsity on LLMs. E-Sparse uses information richness to weight channel importance, and further incorporates several novel techniques to put …
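To make the abstract concrete, here is a minimal sketch of the general idea: score each weight by its magnitude scaled by the Shannon entropy of its input channel's hidden-state activations, then keep the top n scores in every group of m. The entropy-times-magnitude score and the function names are illustrative assumptions, not the paper's actual metric or code.

```python
import numpy as np

def channel_entropy(hidden, bins=64):
    """Shannon entropy per channel of hidden-state activations.
    `hidden` has shape (num_tokens, num_channels). Hypothetical helper,
    not from the E-Sparse paper."""
    H = np.empty(hidden.shape[1])
    for c in range(hidden.shape[1]):
        hist, _ = np.histogram(hidden[:, c], bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]
        H[c] = -(p * np.log(p)).sum()
    return H

def nm_sparsify(W, entropy, n=2, m=4):
    """Apply N:M sparsity to W (shape: out_features x in_features): in every
    group of m consecutive input weights, keep the n with the largest score
    and zero the rest. Score = |w| * input-channel entropy (an illustrative
    stand-in for the paper's entropy-based metric)."""
    score = np.abs(W) * entropy[None, :]
    out, cin = W.shape
    assert cin % m == 0, "input dimension must be divisible by m"
    groups = score.reshape(out, cin // m, m)
    # Indices of the (m - n) smallest-scoring weights in each group get pruned.
    drop = np.argsort(groups, axis=-1)[..., : m - n]
    mask = np.ones_like(groups, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=-1)
    return W * mask.reshape(out, cin)
```

For 2:4 sparsity (the pattern accelerated by recent GPU sparse tensor cores), each group of four input weights retains exactly two nonzeros, halving the stored weights without retraining.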