Feb. 13, 2024, 5:43 a.m. | Armin Gerami, Monte Hoover, Pranav S. Dulepet, Ramani Duraiswami

cs.LG updates on arXiv.org

Motivated by the factorization inherent in the original fast multipole method and the improved fast Gauss transform, we introduce a factorable form of attention that operates efficiently in high dimensions. This approach reduces the computational and memory complexity of the attention mechanism in transformers from $O(N^2)$ to $O(N)$. In comparison to previous attempts, our work presents a linearly scaling attention mechanism that maintains the full representation of the attention matrix without resorting to sparsification and incorporates the all-to-all relationship between …
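The abstract's central claim is that factoring the attention computation drops the cost from $O(N^2)$ to $O(N)$. The sketch below does not reproduce the paper's fast-multipole/fast-Gauss-transform factorization; it only illustrates the general principle with a generic, assumed feature map `phi`: once the query-key similarity factors as $\phi(Q)\phi(K)^\top$, associativity lets $\phi(K)^\top V$ be contracted first, so the $N \times N$ attention matrix is never materialized and the cost is $O(N d^2)$ instead of $O(N^2 d)$.

```python
import numpy as np

def phi(x):
    # ELU(x) + 1: a simple positive feature map (illustrative assumption,
    # not the factorization derived in the paper).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def softmax_attention(Q, K, V):
    # Standard attention: materializes an N x N matrix -> O(N^2 d) time/memory.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    A = np.exp(S - S.max(axis=-1, keepdims=True))
    A /= A.sum(axis=-1, keepdims=True)
    return A @ V

def factorized_attention(Q, K, V):
    # Factorized form: out = phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1).
    # Contracting K with V first means the N x N attention matrix is never
    # formed: O(N d^2) time, O(d^2) extra memory.
    Qf, Kf = phi(Q), phi(K)           # (N, d)
    KV = Kf.T @ V                     # (d, d)
    Z = Qf @ Kf.sum(axis=0)           # (N,) row normalizers
    return (Qf @ KV) / Z[:, None]     # (N, d)

N, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))
print(factorized_attention(Q, K, V).shape)   # (1024, 64)
```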

attention comparison complexity computational cs.ai cs.lg cs.na dimensions factorization form gauss math.na memory transformers
