June 17, 2024, 4:45 a.m. | Shivam Aggarwal, Hans Jakob Damsgaard, Alessandro Pappalardo, Giuseppe Franco, Thomas B. Preußer, Michaela Blott, Tulika Mitra

cs.LG updates on arXiv.org

arXiv:2311.12359v2 Announce Type: replace-cross
Abstract: Post-training quantization (PTQ) is a powerful technique for model compression, reducing the numerical precision in neural networks without additional training overhead. Recent works have investigated adopting 8-bit floating-point formats (FP8) in the context of PTQ for model inference. However, floating-point formats smaller than 8 bits, and their comparison against integers in terms of accuracy and hardware cost, remain unexplored on FPGAs. In this work, we present minifloats, which are reduced-precision floating-point formats capable of further reducing the …
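For context, the sketch below shows what quantizing weights to a generic minifloat grid looks like. It assumes an IEEE-style layout (sign bit, a few exponent bits, a few mantissa bits, saturation instead of infinities) and is only an illustration of the idea; it is not the paper's exact formats, rounding scheme, or toolflow.

```python
import numpy as np

def quantize_minifloat(x, exp_bits=3, man_bits=2, exp_bias=None):
    """Round an array onto a generic minifloat grid (illustrative sketch only).

    A normal minifloat value is sign * 2**e * (1 + m / 2**man_bits), with
    `exp_bits` exponent bits and `man_bits` mantissa bits. Special encodings
    (inf/NaN) are ignored; out-of-range magnitudes simply saturate.
    """
    if exp_bias is None:
        exp_bias = 2 ** (exp_bits - 1) - 1      # IEEE-style exponent bias

    x = np.asarray(x, dtype=np.float64)
    sign = np.sign(x)
    mag = np.abs(x)

    max_exp = 2 ** exp_bits - 1 - exp_bias      # largest unbiased exponent
    min_exp = 1 - exp_bias                      # smallest normal exponent
    max_val = 2.0 ** max_exp * (2.0 - 2.0 ** -man_bits)

    # Exponent of each input, clamped to the representable range; values
    # below the smallest normal land on the subnormal grid at min_exp.
    e = np.floor(np.log2(np.where(mag > 0, mag, 1.0)))
    e = np.clip(e, min_exp, max_exp)

    # Round the mantissa onto 2**man_bits steps per binade, then saturate.
    step = 2.0 ** (e - man_bits)
    q = np.minimum(np.round(mag / step) * step, max_val)
    return sign * q

# Example: a 6-bit E3M2 minifloat (1 sign, 3 exponent, 2 mantissa bits)
w = np.array([0.07, -0.5, 1.3, 9.0])
print(quantize_minifloat(w, exp_bits=3, man_bits=2))
```

Shrinking `exp_bits`/`man_bits` trades dynamic range and resolution for narrower datapaths, which is the accuracy-versus-hardware-cost question the abstract raises for sub-8-bit formats on FPGAs.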
