What Dense Graph Do You Need for Self-Attention? (arXiv:2205.14014v4 [cs.LG] UPDATED)
Web: http://arxiv.org/abs/2205.14014
June 16, 2022, 1:11 a.m. | Yuxing Wang, Chu-Tak Lee, Qipeng Guo, Zhangyue Yin, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu
cs.LG updates on arXiv.org
Transformers have made progress on a wide variety of tasks, but suffer from
quadratic computational and memory complexity. Recent works propose sparse
Transformers that restrict attention to sparse graphs, reducing complexity
while retaining strong performance. While effective, how dense a graph needs
to be to perform well is not fully explored. In this paper, we propose
Normalized Information Payload (NIP), a graph scoring function that measures
information transfer on a graph and provides an analysis tool for trade-offs
between performance and …
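The abstract is truncated above; as a rough illustration of the "attention on sparse graphs" idea it describes, the sketch below masks standard scaled dot-product attention with a graph adjacency matrix. The choice of graph (a simple sliding-window band) and all names here are illustrative assumptions, not the paper's NIP method.

```python
import torch
import torch.nn.functional as F

def sliding_window_mask(n: int, window: int) -> torch.Tensor:
    """Adjacency of a banded graph: token i attends to j iff |i - j| <= window.
    (Illustrative sparse graph; the paper studies graph density in general.)"""
    idx = torch.arange(n)
    return (idx[None, :] - idx[:, None]).abs() <= window

def sparse_graph_attention(q, k, v, adj):
    """Scaled dot-product attention restricted to the edges of a graph.

    q, k, v: (n, d) tensors; adj: (n, n) boolean adjacency matrix.
    Non-edges are set to -inf before the softmax, so each token aggregates
    values only from its graph neighbors -- the sparse-Transformer pattern
    the abstract refers to. (The math here is dense for clarity; real
    implementations exploit the sparsity for sub-quadratic cost.)
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5        # (n, n) attention logits
    scores = scores.masked_fill(~adj, float("-inf"))   # keep only graph edges
    return F.softmax(scores, dim=-1) @ v               # (n, d) outputs

# Toy usage: 8 tokens, 4-dim heads, window-1 banded graph.
n, d = 8, 4
q, k, v = (torch.randn(n, d) for _ in range(3))
out = sparse_graph_attention(q, k, v, sliding_window_mask(n, window=1))
print(out.shape)  # torch.Size([8, 4])
```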