Web: http://arxiv.org/abs/2206.11062

June 23, 2022, 1:10 a.m. | Ibrahim Ahmed, Sahil Parmar, Matthew Boyd, Michael Beidler, Kris Kang, Bill Liu, Kyle Roach, John Kim, Dennis Abts

cs.LG updates on arXiv.org

Transformers have become a predominant machine learning workload: they are not
only the de facto standard for natural language processing tasks, but they are
also being deployed in other domains such as vision and speech recognition.
Many transformer-based applications are real-time systems such as machine
translation and web search, and these real-time systems often come with strict
end-to-end inference latency requirements. Unfortunately, while the majority of
the transformer computation comes from matrix multiplications, transformers
also include several non-linear components …
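The abstract's point — matrix multiplications dominate, but non-linear components sit between them — can be sketched with a minimal single-head attention in NumPy. This is an illustrative sketch only (function names, shapes, and the choice of softmax as the example non-linearity are our own, not taken from the paper):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: a non-linear step between two matmuls.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: two large matrix multiplications
    # separated by a non-linear softmax.
    scores = q @ k.T / np.sqrt(q.shape[-1])  # matmul 1
    weights = softmax(scores)                # non-linear component
    return weights @ v                       # matmul 2

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k = rng.standard_normal((4, 8))
v = rng.standard_normal((4, 8))
out = attention(q, k, v)
print(out.shape)  # (4, 8)
```

Other non-linear components in a transformer layer (layer normalization, GELU activations) follow the same pattern: cheap in FLOPs relative to the matmuls, but on the latency-critical path.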

