Instance-Aware Group Quantization for Vision Transformers
April 2, 2024, 7:43 p.m. | Jaehyeon Moon, Dohyung Kim, Junyong Cheon, Bumsub Ham
cs.LG updates on arXiv.org
Abstract: Post-training quantization (PTQ) is an efficient model compression technique that quantizes a pretrained full-precision model using only a small calibration set of unlabeled samples, without retraining. PTQ methods for convolutional neural networks (CNNs) produce quantized models whose accuracy is comparable to their full-precision counterparts. Directly applying these methods to vision transformers (ViTs), however, incurs severe performance degradation, mainly due to the architectural differences between CNNs and ViTs. In particular, the distribution of activations for each channel varies drastically according …
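To illustrate the calibration step the abstract describes, here is a minimal sketch of basic uniform PTQ: the quantizer's scale and zero-point are fitted from a small set of unlabeled calibration samples, and the tensor is then quantized and dequantized without any retraining. This is a generic uniform quantizer for illustration only, not the instance-aware group method proposed in the paper; all function names here are hypothetical.

```python
import numpy as np

def calibrate(x, n_bits=8):
    # Fit scale and zero-point to the min/max range of the calibration data.
    qmin, qmax = 0, 2 ** n_bits - 1
    xmin, xmax = float(x.min()), float(x.max())
    scale = (xmax - xmin) / (qmax - qmin)
    zero_point = int(round(qmin - xmin / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, n_bits=8):
    # Map floats to integer codes, clipping to the representable range.
    q = np.round(x / scale + zero_point)
    return np.clip(q, 0, 2 ** n_bits - 1).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # Recover an approximate float tensor from the integer codes.
    return (q.astype(np.float32) - zero_point) * scale

# A small unlabeled sample stands in for the PTQ calibration set.
rng = np.random.default_rng(0)
calib = rng.standard_normal(1024).astype(np.float32)
scale, zp = calibrate(calib)
q = quantize(calib, scale, zp)
recon = dequantize(q, scale, zp)
# Per-element reconstruction error is bounded by roughly one quantization step.
max_err = float(np.abs(recon - calib).max())
```

PTQ methods refine this basic recipe; the channel-wise activation variation in ViTs that the abstract highlights is precisely what a single shared (scale, zero-point) pair handles poorly, motivating finer-grained (e.g. group-wise) quantizers.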