Feb. 15, 2024, 5:43 a.m. | Max Zimmer, Christoph Spiegel, Sebastian Pokutta

cs.LG updates on arXiv.org

arXiv:2205.11921v2 Announce Type: replace
Abstract: Many existing Neural Network pruning approaches rely on either retraining or inducing a strong bias in order to converge to a sparse solution throughout training. A third paradigm, 'compression-aware' training, aims to obtain state-of-the-art dense models that are robust to a wide range of compression ratios using a single dense training run while also avoiding retraining. We propose a framework centered around a versatile family of norm constraints and the Stochastic Frank-Wolfe (SFW) algorithm that …
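The abstract names the Stochastic Frank-Wolfe (SFW) algorithm combined with norm constraints. As a rough illustration only, below is a minimal sketch of a single SFW update over a norm-ball constraint; the L-infinity ball, the linear minimization oracle, and the step size are illustrative assumptions and not the constraint family or hyperparameters used in the paper.

```python
# Minimal sketch of one Stochastic Frank-Wolfe (SFW) step under a norm-ball
# constraint. The L-infinity ball and the fixed step size are illustrative
# assumptions, not the paper's exact constraint family or schedule.
import numpy as np

def linf_ball_lmo(grad, radius):
    """Linear minimization oracle over an L-infinity ball:
    argmin_{||v||_inf <= radius} <grad, v> = -radius * sign(grad)."""
    return -radius * np.sign(grad)

def sfw_step(weights, stochastic_grad, radius, step_size):
    """One SFW update: move a fraction of the way toward the LMO vertex,
    so the iterate remains inside the constraint set."""
    vertex = linf_ball_lmo(stochastic_grad, radius)
    return weights + step_size * (vertex - weights)

# Toy usage: one layer's weights and a stand-in for a minibatch gradient.
rng = np.random.default_rng(0)
w = rng.normal(size=100) * 0.01
g = rng.normal(size=100)
w = sfw_step(w, g, radius=1.0, step_size=0.1)
```

Because each iterate is a convex combination of points inside the constraint set, the weights stay feasible throughout training without a separate projection step, which is the basic appeal of Frank-Wolfe-style updates in this setting.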
