Feb. 9, 2024, 5:44 a.m. | Ruiqi Sun Siwei Ye Jie Zhao Xin He Yiran Li An Zou

cs.LG updates on arXiv.org arxiv.org

The inherent diversity of computation types within individual Deep Neural Network (DNN) models demands a correspondingly varied set of computation units in hardware processors, which significantly constrains computation efficiency when executing different neural networks. In this study, we present NeuralMatrix, a framework that transforms the computation of entire DNNs into linear matrix operations. This transformation enables various DNN models to be executed on a single General-Purpose Matrix Multiplication (GEMM) accelerator. …
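The abstract does not give the transformation details, but one common way to map a nonlinear DNN operation onto a GEMM-only unit is a piecewise-linear approximation: each input falls into a segment with a precomputed slope and intercept, so the operation reduces to a multiply-accumulate. The sketch below (an illustrative assumption, not NeuralMatrix's actual method) approximates GELU this way with NumPy:

```python
import numpy as np

def gelu(x):
    # Reference GELU (tanh approximation).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

# Sample GELU on a grid and precompute per-segment slope/intercept.
# Evaluating any input then needs only a lookup plus one multiply-add,
# the primitive a GEMM accelerator already provides.
knots = np.linspace(-4.0, 4.0, 65)              # 64 linear segments
slopes = np.diff(gelu(knots)) / np.diff(knots)
intercepts = gelu(knots[:-1]) - slopes * knots[:-1]

def gelu_pwl(x):
    x = np.clip(x, knots[0], knots[-1] - 1e-9)
    idx = np.searchsorted(knots, x, side="right") - 1
    return slopes[idx] * x + intercepts[idx]    # multiply-accumulate only

x = np.linspace(-3.0, 3.0, 1001)
max_err = np.max(np.abs(gelu_pwl(x) - gelu(x)))
```

With 64 segments over [-4, 4], the piecewise-linear curve stays within roughly 2e-3 of GELU, illustrating how nonlinear layers can be folded into the same matrix pipeline as the linear ones.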

