PQA: Exploring the Potential of Product Quantization in DNN Hardware Acceleration
April 1, 2024, 4:43 a.m. | Ahmed F. AbouElhamayed, Angela Cui, Javier Fernandez-Marques, Nicholas D. Lane, Mohamed S. Abdelfattah
cs.LG updates on arXiv.org arxiv.org
Abstract: Conventional multiply-accumulate (MAC) operations have long dominated computation time for deep neural networks (DNNs), especially convolutional neural networks (CNNs). Recently, product quantization (PQ) has been applied to these workloads, replacing MACs with memory lookups to pre-computed dot products. To better understand the efficiency tradeoffs of product-quantized DNNs (PQ-DNNs), we create a custom hardware accelerator to parallelize and accelerate nearest-neighbor search and dot-product lookups. Additionally, we perform an empirical study to investigate the efficiency-accuracy tradeoffs of …
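As a rough illustration of the idea the abstract describes, the sketch below approximates a dot product with product quantization: the input is split into subvectors, each subvector is matched to its nearest prototype (the nearest-neighbor search step), and the result is summed from a table of pre-computed prototype-weight dot products (the lookup step). This is a minimal pure-Python sketch under stated assumptions, not the paper's accelerator or training procedure. The function names, the hand-picked prototypes, and the toy weights are all hypothetical; in practice prototypes are typically learned (for example with k-means), which is omitted here.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain multiply-accumulate dot product."""
    return sum(x * y for x, y in zip(a, b))


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance, used for nearest-prototype search."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(v: Vector, num_subspaces: int) -> List[List[float]]:
    """Split v into equally sized contiguous subvectors (assumes even division)."""
    size = len(v) // num_subspaces
    return [list(v[i * size:(i + 1) * size]) for i in range(num_subspaces)]


def build_lookup_table(prototypes, weights: Vector) -> List[List[float]]:
    """Pre-compute dot(prototype, weight subvector) for every subspace and prototype."""
    w_parts = split(weights, len(prototypes))
    return [[dot(p, w_parts[c]) for p in protos]
            for c, protos in enumerate(prototypes)]


def encode(x: Vector, prototypes) -> List[int]:
    """For each subspace, return the index of the prototype nearest to x's subvector."""
    x_parts = split(x, len(prototypes))
    codes = []
    for c, protos in enumerate(prototypes):
        distances = [squared_distance(x_parts[c], p) for p in protos]
        codes.append(distances.index(min(distances)))
    return codes


def pq_dot(x: Vector, prototypes, table) -> float:
    """Approximate dot(x, weights) using only nearest-neighbor search and table lookups."""
    codes = encode(x, prototypes)
    return sum(table[c][k] for c, k in enumerate(codes))


# Hypothetical toy setup: 4-dimensional input, 2 subspaces, 2 prototypes each.
prototypes = [
    [[0.0, 0.0], [1.0, 1.0]],  # subspace 0
    [[0.0, 1.0], [1.0, 0.0]],  # subspace 1
]
weights = [1.0, 2.0, 3.0, 4.0]
table = build_lookup_table(prototypes, weights)

x = [0.9, 1.1, 0.1, 0.8]
approx = pq_dot(x, prototypes, table)  # lookups only, no MACs on x at inference
exact = dot(x, weights)                # conventional MAC-based reference
```

In this toy case `approx` differs slightly from `exact` because each subvector of `x` is replaced by its nearest prototype. That gap is one simple view of the efficiency-accuracy tradeoff the abstract mentions: more prototypes or more subspaces generally reduce the error but enlarge the table and the search.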