Web: http://arxiv.org/abs/2205.14014

June 16, 2022, 1:12 a.m. | Yuxing Wang, Chu-Tak Lee, Qipeng Guo, Zhangyue Yin, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu

cs.CL updates on arXiv.org arxiv.org

Transformers have made progress on a wide range of tasks, but suffer from
quadratic computational and memory complexity. Recent works propose sparse
Transformers that compute attention over sparse graphs to reduce complexity
while retaining strong performance. While effective, the crucial question of
how dense a graph needs to be to perform well is not fully explored. In this
paper, we propose Normalized Information Payload (NIP), a graph scoring
function measuring information transfer on graphs, which provides an analysis
tool for trade-offs between performance and …
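To make the idea of "attention on a sparse graph" concrete, below is a minimal sketch of single-head attention masked by a graph adjacency matrix. This is an illustration of the general pattern the abstract refers to, not the paper's NIP method or an efficient kernel (a dense score matrix is still computed here and then masked); the function name and the banded example graph are assumptions for demonstration.

```python
import torch
import torch.nn.functional as F

def graph_masked_attention(q, k, v, adj):
    """Single-head attention restricted to the edges of a graph.

    q, k, v: (seq_len, d) query/key/value matrices.
    adj:     (seq_len, seq_len) boolean adjacency; token i may only
             attend to token j where adj[i, j] is True.
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5       # (seq_len, seq_len)
    scores = scores.masked_fill(~adj, float("-inf"))  # zero out non-edges
    weights = F.softmax(scores, dim=-1)               # normalize over neighbors
    return weights @ v

# Example: a banded (sliding-window) graph, one common sparse pattern.
seq_len, d, window = 8, 16, 2
idx = torch.arange(seq_len)
adj = (idx[:, None] - idx[None, :]).abs() <= window   # local attention band
q = k = v = torch.randn(seq_len, d)
out = graph_masked_attention(q, k, v, adj)
print(out.shape)  # torch.Size([8, 16])
```

Denser graphs let more information flow between tokens per layer but cost more computation; the paper's NIP score is proposed as a way to analyze that trade-off.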

arxiv attention graph lg self-attention
