Feb. 9, 2024, 5:44 a.m. | Ruiqi Sun, Siwei Ye, Jie Zhao, Xin He, Yiran Li, An Zou

cs.LG updates on arXiv.org arxiv.org

The inherent diversity of computation types within individual Deep Neural Network (DNN) models imposes a corresponding need for a varied set of computation units within hardware processors. This diversity poses a significant constraint on computation efficiency during the execution of different neural networks. In this study, we present NeuralMatrix, a framework that transforms the computation of entire DNNs into linear matrix operations. This transformation seamlessly enables the execution of various DNN models using a single General-Purpose Matrix Multiplication (GEMM) accelerator. …
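The abstract as excerpted here does not say how a nonlinear operation becomes a linear matrix operation. One plausible route, offered only as an assumption for illustration, is piecewise-linear approximation: each input is mapped to a segment, and the nonlinear function is replaced by a multiply-add y = k*x + b with precomputed parameters. The sketch below uses tanh as the example function; the names `build_segments`, `apply_piecewise`, and `max_abs_error` are hypothetical and are not taken from NeuralMatrix.

```python
import math

def build_segments(fn, lo, hi, n):
    """Precompute (slope, intercept) pairs by interpolating fn on n equal segments of [lo, hi]."""
    width = (hi - lo) / n
    table = []
    for i in range(n):
        x0 = lo + i * width
        x1 = x0 + width
        k = (fn(x1) - fn(x0)) / width   # slope of the chord over this segment
        b = fn(x0) - k * x0             # intercept so the chord passes through (x0, fn(x0))
        table.append((k, b))
    return table

def apply_piecewise(x, table, lo, hi):
    """Evaluate the approximation: look up the segment, then do one multiply-add."""
    n = len(table)
    width = (hi - lo) / n
    idx = int((x - lo) / width)
    idx = max(0, min(n - 1, idx))       # clamp so out-of-range inputs reuse the edge segments
    k, b = table[idx]
    return k * x + b                    # the only arithmetic left is linear

# Illustrative settings (assumptions, not values from the paper).
LO, HI, N = -4.0, 4.0, 64
TABLE = build_segments(math.tanh, LO, HI, N)

def max_abs_error(samples=1000):
    """Largest gap between the approximation and math.tanh on a uniform grid over [LO, HI]."""
    worst = 0.0
    for j in range(samples + 1):
        x = LO + (HI - LO) * j / samples
        worst = max(worst, abs(apply_piecewise(x, TABLE, LO, HI) - math.tanh(x)))
    return worst
```

Applied elementwise across a tensor, the per-element multiply-add is the kind of operation a GEMM-style unit can execute. The segment count trades table size against accuracy, and a real design would need to choose it per function and input range.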

