Exploring Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators
April 9, 2024, 4:43 a.m. | Jan Klhufek, Miroslav Safar, Vojtech Mrazek, Zdenek Vasicek, Lukas Sekanina
cs.LG updates on arXiv.org
Abstract: Energy efficiency and memory footprint of a convolutional neural network (CNN) implemented on a CNN inference accelerator depend on many factors, including a weight quantization strategy (i.e., data types and bit-widths) and mapping (i.e., placement and scheduling of DNN elementary operations on hardware units of the accelerator). We show that enabling rich mixed quantization schemes during the implementation can open a previously hidden space of mappings that utilize the hardware resources more effectively. CNNs utilizing …
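To make the dependence of memory footprint on bit-widths concrete, here is a minimal sketch that compares the weight storage of a network under uniform and mixed per-layer quantization. It is an illustration only, not the authors' method: the function name `weight_footprint_bytes`, the three-layer weight counts, and the chosen bit-widths are all hypothetical, and the sketch assumes weights are densely packed with no alignment or metadata overhead. It does not model mapping or energy.

```python
from math import ceil


def weight_footprint_bytes(weight_counts, bit_widths):
    """Total weight storage in bytes when each layer uses its own bit-width.

    Assumes dense bit-packing with no padding or per-layer metadata.
    """
    if len(weight_counts) != len(bit_widths):
        raise ValueError("one bit-width per layer is required")
    total_bits = sum(n * b for n, b in zip(weight_counts, bit_widths))
    return ceil(total_bits / 8)


# Hypothetical 3-layer CNN; the weight counts are illustrative, not from the paper.
layers = [864, 18432, 5120]

uniform = weight_footprint_bytes(layers, [8, 8, 8])  # every layer at 8 bits
mixed = weight_footprint_bytes(layers, [8, 4, 2])    # narrower later layers

print(f"uniform 8-bit: {uniform} bytes")
print(f"mixed 8/4/2-bit: {mixed} bytes")
```

Under these assumed numbers the mixed scheme needs less than half the storage of the uniform one, which changes how much of each layer can sit in a given on-chip buffer and is the kind of effect a mapping search could exploit.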