The Feasibility of Implementing Large-Scale Transformers on Multi-FPGA Platforms
April 26, 2024, 4:42 a.m. | Yu Gao, Juan Camilo Vega, Paul Chow
Source: cs.LG updates on arXiv.org (arxiv.org)
Abstract: FPGAs are rarely mentioned when discussing the implementation of large machine learning applications, such as Large Language Models (LLMs), in the data center. There has been much evidence showing that single FPGAs can be competitive with GPUs in performance for some computations, especially for low latency, and often much more efficient when power is considered. This suggests that there is merit to exploring the use of multiple FPGAs for large machine learning applications. The challenge …