A Real Time 1280x720 Object Detection Chip With 585MB/s Memory Traffic. (arXiv:2205.01571v1 [cs.AR])
May 4, 2022, 1:11 a.m. | Kuo-Wei Chang, Hsu-Tung Shih, Tian-Sheuan Chang, Shang-Hong Tsai, Chih-Chyau Yang, Chien-Ming Wu, Chun-Ming Huang
cs.LG updates on arXiv.org
Memory bandwidth has become the real-time bottleneck of current deep learning
accelerators (DLAs), particularly for high-definition (HD) object detection.
Under resource constraints, this paper proposes a low-memory-traffic DLA chip
with joint hardware and software optimization. To maximize hardware utilization
under limited memory bandwidth, we morph and fuse the object detection model into a
group fusion-ready model to reduce intermediate data access. This reduces
YOLOv2's feature memory traffic from 2.9 GB/s to 0.15 GB/s. To support group
fusion, …
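
To make the idea of reducing intermediate data access concrete, here is a minimal Python sketch of how fusing layers into groups can lower feature-map traffic to external memory. It is an illustration under stated assumptions, not the paper's method: the layer shapes, the 8-bit activation size, the `Layer` class, and the `feature_traffic` helper are all hypothetical, and the model ignores weight traffic and any tile-overlap overhead. Only the 2.9 GB/s and 0.15 GB/s figures come from the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    """Output shape of one layer (hypothetical, not taken from the paper)."""
    out_h: int
    out_w: int
    out_c: int

    @property
    def out_bytes(self) -> int:
        # Assumption: one byte per activation (8-bit features).
        return self.out_h * self.out_w * self.out_c


def feature_traffic(input_bytes: int, layers: List[Layer], groups: List[int]) -> int:
    """Feature-map bytes moved to and from external memory per frame.

    `groups` lists how many consecutive layers are fused together.
    Within a fused group, intermediate feature maps are assumed to stay
    on chip, so only the group's input is read and its final output written.
    """
    if sum(groups) != len(layers):
        raise ValueError("group sizes must cover every layer exactly once")
    traffic = 0
    idx = 0
    prev_bytes = input_bytes
    for size in groups:
        last = layers[idx + size - 1]
        traffic += prev_bytes       # read the group's input from external memory
        traffic += last.out_bytes   # write the group's output to external memory
        prev_bytes = last.out_bytes
        idx += size
    return traffic


# Hypothetical four-layer stack on a 1280x720 RGB frame.
frame_bytes = 1280 * 720 * 3
stack = [
    Layer(720, 1280, 16),
    Layer(360, 640, 32),
    Layer(180, 320, 64),
    Layer(90, 160, 128),
]

layer_by_layer = feature_traffic(frame_bytes, stack, [1, 1, 1, 1])
two_groups = feature_traffic(frame_bytes, stack, [2, 2])
fully_fused = feature_traffic(frame_bytes, stack, [4])

print(f"layer by layer: {layer_by_layer / 1e6:.1f} MB/frame")
print(f"two groups:     {two_groups / 1e6:.1f} MB/frame")
print(f"fully fused:    {fully_fused / 1e6:.1f} MB/frame")

# Reduction factor implied by the figures quoted in the abstract.
reported_reduction = 2.9 / 0.15
print(f"reported feature-traffic reduction: about {reported_reduction:.1f}x")
```

In this toy model, larger fusion groups trade external memory traffic for on-chip buffering, which is why the group sizes a real chip can support depend on its on-chip memory budget.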