Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs
June 17, 2024, 4:45 a.m. | Shivam Aggarwal, Hans Jakob Damsgaard, Alessandro Pappalardo, Giuseppe Franco, Thomas B. Preußer, Michaela Blott, Tulika Mitra
cs.LG updates on arXiv.org arxiv.org
Abstract: Post-training quantization (PTQ) is a powerful technique for model compression, reducing the numerical precision in neural networks without additional training overhead. Recent works have investigated adopting 8-bit floating-point formats (FP8) in the context of PTQ for model inference. However, floating-point formats smaller than 8 bits, and how they compare with integer formats in terms of accuracy and hardware cost, remain unexplored on FPGAs. In this work, we present minifloats, which are reduced-precision floating-point formats capable of further reducing the …
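To make the idea of a minifloat concrete, below is a minimal sketch of rounding values to a generic reduced-precision floating-point format with a configurable number of exponent and mantissa bits. This is an illustrative assumption, not the paper's actual quantization pipeline or hardware cost model; the function name `minifloat_quantize` and its IEEE-like bias and saturation behavior are hypothetical choices.

```python
import numpy as np

def minifloat_quantize(x, exp_bits=4, man_bits=3):
    """Round values to the nearest number representable in a generic
    minifloat format with `exp_bits` exponent bits and `man_bits`
    mantissa bits (hypothetical format: IEEE-like bias, saturating,
    no special handling of infinities or NaNs)."""
    bias = 2 ** (exp_bits - 1) - 1
    max_exp = 2 ** exp_bits - 1 - bias   # largest normal exponent
    min_exp = 1 - bias                   # smallest normal exponent

    sign = np.sign(x)
    mag = np.abs(x)

    # Exponent of each value, clipped to the representable range
    exp = np.floor(np.log2(np.maximum(mag, np.finfo(np.float32).tiny)))
    exp = np.clip(exp, min_exp, max_exp)

    # Spacing between representable values at this exponent
    step = 2.0 ** (exp - man_bits)
    q = np.round(mag / step) * step

    # Saturate to the largest representable magnitude
    max_val = (2 - 2.0 ** (-man_bits)) * 2.0 ** max_exp
    q = np.minimum(q, max_val)
    return sign * q

# Example: quantize a weight tensor to an FP8-like 1-4-3 minifloat
w = np.random.randn(4, 4).astype(np.float32)
w_q = minifloat_quantize(w, exp_bits=4, man_bits=3)
```

Shrinking `exp_bits` and `man_bits` below the FP8 budget (e.g. to a 6- or 5-bit format) coarsens the representable grid, which is the accuracy-versus-hardware-cost trade-off the paper studies on FPGAs.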