APack: Off-Chip, Lossless Data Compression for Efficient Deep Learning Inference. (arXiv:2201.08830v1 [cs.AR])
cs.LG updates on arXiv.org
Data accesses between on- and off-chip memories account for a large fraction
of overall energy consumption during inference with deep learning networks. We
present APack, a simple and effective lossless off-chip memory compression
technique for fixed-point quantized models. APack reduces data widths by
exploiting the non-uniform value distribution in deep learning applications.
APack can be used to increase the effective memory capacity, to reduce off-chip
traffic, and/or to achieve the desired performance/energy targets while using
smaller off-chip memories. APack builds …