May 19, 2022, 1:10 a.m. | Yang Lin, Tianyu Zhang, Peiqin Sun, Zheng Li, Shuchang Zhou

cs.CV updates on arXiv.org

Network quantization significantly reduces model inference complexity and has
been widely used in real-world deployments. However, most existing quantization
methods have been developed mainly for Convolutional Neural Networks (CNNs) and
suffer severe degradation when applied to fully quantized vision transformers.
In this work, we demonstrate that many of these difficulties arise from
serious inter-channel variation in LayerNorm inputs, and present Power-of-Two
Factor (PTF), a systematic method to reduce the performance degradation and
inference complexity of fully quantized vision transformers. …
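The abstract does not spell out how PTF works; based on its description, a minimal sketch might look like the following. It assumes a single layer-wise base scale s and per-channel integer exponents alpha chosen at calibration time so that each channel's effective step size is 2**alpha * s; the function name ptf_quantize and the calibration heuristic are illustrative, not the paper's exact procedure.

import torch

def ptf_quantize(x: torch.Tensor, num_bits: int = 8, K: int = 3):
    # Hypothetical sketch of Power-of-Two Factor (PTF) quantization for
    # LayerNorm inputs of shape (..., C). Each channel c gets an integer
    # exponent alpha_c in [0, K]; its effective step size is 2**alpha_c * s
    # for a single layer-wise scale s, so per-channel rescaling at inference
    # reduces to hardware-friendly bit-shifts.
    qmax = 2 ** (num_bits - 1) - 1
    flat = x.reshape(-1, x.shape[-1])                  # (N, C)

    # Layer-wise base scale s from the smallest per-channel range, so the
    # power-of-two factors only ever scale up (a choice of this sketch).
    ch_absmax = flat.abs().amax(dim=0)                 # (C,)
    s = ch_absmax.min().clamp(min=1e-8) / qmax

    # Smallest exponent whose scale covers each channel's range, capped at K.
    ratio = (ch_absmax / (s * qmax)).clamp(min=1.0)
    alpha = torch.clamp(torch.ceil(torch.log2(ratio)), max=K)

    step = s * torch.pow(2.0, alpha)                   # per-channel step size
    q = torch.clamp(torch.round(flat / step), -qmax - 1, qmax)
    return q.reshape(x.shape), s, alpha

Dequantization would multiply q by 2**alpha * s per channel; on integer hardware the power-of-two part can be folded into a left shift, which is what makes per-channel handling of the inter-channel variation cheap at inference.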

