Web: http://arxiv.org/abs/2110.07732

May 6, 2022, 1:12 a.m. | Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber

cs.LG updates on arXiv.org

Despite progress across a broad range of applications, Transformers have
limited success in systematic generalization. The situation is especially
frustrating in the case of algorithmic tasks, where they often fail to find
intuitive solutions that route relevant information to the right node/operation
at the right time in the grid represented by Transformer columns. To facilitate
the learning of useful control flow, we propose two modifications to the
Transformer architecture: a copy gate and geometric attention. Our novel Neural
Data Router (NDR) …
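
The abstract names the two modifications but is cut off before describing them, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a copy gate is a per-element sigmoid gate that lets a column either keep its previous state or accept a newly computed one, and that geometric attention weights keys, ordered nearest first, so that the closest matching key takes most of the weight. The function names and the plain-list representation are hypothetical and chosen for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def copy_gate_update(state, candidate, gate_logits):
    """Blend `candidate` into `state` element-wise (assumed copy-gate form).

    A gate near 0 copies the old state unchanged; a gate near 1 replaces it
    with the candidate. All three arguments are equal-length lists of floats.
    """
    if not (len(state) == len(candidate) == len(gate_logits)):
        raise ValueError("state, candidate and gate_logits must have equal length")
    out = []
    for h, u, z in zip(state, candidate, gate_logits):
        g = sigmoid(z)
        out.append(g * u + (1.0 - g) * h)
    return out


def geometric_weights(match_probs):
    """Turn per-key match probabilities into attention weights (assumed form).

    `match_probs` must be ordered from the nearest key to the farthest.
    Key j receives p_j times the probability that no nearer key matched,
    so the weights may sum to less than 1 if nothing matches strongly.
    """
    weights = []
    none_matched_so_far = 1.0
    for p in match_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("match probabilities must lie in [0, 1]")
        weights.append(p * none_matched_so_far)
        none_matched_so_far *= 1.0 - p
    return weights
```

For example, `copy_gate_update([1.0, 2.0], [5.0, 6.0], [-50.0, 50.0])` leaves the first element essentially at 1.0 and moves the second essentially to 6.0, which is the "wait until the inputs are ready" behaviour that such a gate is meant to allow. In `geometric_weights`, a confident match at a nearby key suppresses every farther key, which biases the routing toward the closest relevant position.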
