Structural Pruning of Pre-trained Language Models via Neural Architecture Search
May 6, 2024, 4:42 a.m. | Aaron Klein, Jacek Golebiowski, Xingchen Ma, Valerio Perrone, Cedric Archambeau
cs.LG updates on arXiv.org
Abstract: Pre-trained language models (PLMs) such as BERT or RoBERTa mark the state of the art for natural language understanding tasks when fine-tuned on labeled data. However, their large size poses challenges when deploying them for inference in real-world applications, due to significant GPU memory requirements and high inference latency. This paper explores neural architecture search (NAS) for structural pruning to find sub-parts of the fine-tuned network that optimally trade off efficiency, for example in terms of model size or …
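The truncated abstract does not specify the paper's search strategy, but the general idea it describes can be illustrated with a minimal sketch: sample structurally pruned sub-networks of a fine-tuned BERT and compare their sizes. The random search, the checkpoint name, and the search budget below are assumptions for illustration only, not the authors' method; the head-pruning call is Hugging Face Transformers' `prune_heads`.

```python
# Minimal sketch: random search over structurally pruned sub-networks of a
# BERT classifier. Assumptions (not from the paper): random search in place
# of the actual NAS strategy, "bert-base-uncased" as a stand-in for a
# fine-tuned checkpoint, and a budget of 8 candidates.
import copy
import random

from transformers import AutoModelForSequenceClassification

def param_count(model):
    return sum(p.numel() for p in model.parameters())

def sample_heads_to_prune(num_layers=12, num_heads=12, prune_prob=0.3):
    # For each encoder layer, pick a random subset of attention heads to drop.
    cfg = {}
    for layer in range(num_layers):
        heads = [h for h in range(num_heads) if random.random() < prune_prob]
        if len(heads) == num_heads:  # keep at least one head per layer
            heads.pop()
        cfg[layer] = heads
    return cfg

# Stand-in for the user's fine-tuned model (classification head is untrained here).
base = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

results = []
for _ in range(8):  # tiny search budget, purely for illustration
    candidate = copy.deepcopy(base)
    cfg = sample_heads_to_prune()
    candidate.prune_heads(cfg)  # structural pruning of attention heads
    # A real multi-objective search would also score task accuracy on
    # held-out labeled data and keep the Pareto front of (size, accuracy).
    results.append((param_count(candidate), cfg))

for size, cfg in sorted(results):
    print(f"{size:,} params, heads pruned per layer:",
          {layer: len(heads) for layer, heads in cfg.items()})
```

In practice the candidates would be ranked on both model size (or latency) and validation accuracy on the fine-tuning task, which is the efficiency/accuracy trade-off the abstract refers to.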