Accurate Retraining-free Pruning for Pretrained Encoder-based Language Models
March 18, 2024, 4:48 a.m. | Seungcheol Park, Hojun Choi, U Kang
cs.CL updates on arXiv.org
Abstract: Given a pretrained encoder-based language model, how can we accurately compress it without retraining? Retraining-free structured pruning algorithms are crucial in pretrained language model compression due to their significantly reduced pruning cost and capability to prune large language models. However, existing retraining-free algorithms encounter severe accuracy degradation, as they fail to handle pruning errors, especially at high compression rates. In this paper, we propose K-prune (Knowledge-preserving pruning), an accurate retraining-free structured pruning algorithm for pretrained …
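The abstract gives no implementation details of K-prune itself, but a minimal sketch may help make "retraining-free structured pruning" concrete. The sketch below assumes a BERT-style encoder and uses a simple magnitude-based head-importance score (a common baseline, not the paper's knowledge-preserving criterion); heads are then removed structurally with Hugging Face transformers' built-in prune_heads(), with no further training.

```python
# Minimal sketch of retraining-free structured pruning for an
# encoder-based language model. This is NOT the K-prune algorithm:
# head importance here is approximated by the weight magnitude of
# each head's slice of the attention output projection, a common
# baseline; K-prune's knowledge-preserving criterion is in the paper.
import torch
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")
cfg = model.config
head_dim = cfg.hidden_size // cfg.num_attention_heads

# Score each attention head by the L2 norm of its columns in the
# attention output projection (magnitude-based importance).
scores = {}
for layer_idx, layer in enumerate(model.encoder.layer):
    w = layer.attention.output.dense.weight  # (hidden, hidden)
    for h in range(cfg.num_attention_heads):
        cols = w[:, h * head_dim : (h + 1) * head_dim]
        scores[(layer_idx, h)] = cols.norm().item()

# Remove the globally least important 30% of heads. No retraining:
# the heads are structurally deleted from the network.
k = int(0.3 * len(scores))
heads_per_layer = {}
for layer_idx, h in sorted(scores, key=scores.get)[:k]:
    heads_per_layer.setdefault(layer_idx, []).append(h)
model.prune_heads(heads_per_layer)  # model is now smaller and ready for inference
```

As the abstract notes, this kind of one-shot pruning is cheap but accumulates pruning error, which is what degrades accuracy at high compression rates and what K-prune is designed to handle.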