TransformerFAM: Feedback attention is working memory

April 28, 2024, 9:19 p.m. | Yannic Kilcher

Yannic Kilcher www.youtube.com

Paper: https://arxiv.org/abs/2404.09173

Abstract:
While Transformers have revolutionized deep learning, their quadratic attention complexity hinders their ability to process infinitely long inputs. We propose Feedback Attention Memory (FAM), a novel Transformer architecture that leverages a feedback loop to enable the network to attend to its own latent representations. This design fosters the emergence of working memory within the Transformer, allowing it to process indefinitely long sequences. TransformerFAM requires no additional weights, enabling seamless integration with pre-trained models. Our experiments show that …
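The feedback idea in the abstract can be illustrated with a toy sketch. It is not the paper's implementation: the function names (`attend`, `process_with_feedback`), the identity projections, the single layer, the non-causal attention inside a block, and the choice to update the memory from the block inputs are all assumptions made for illustration. The input is processed in fixed-size blocks, each block attends to a small carried memory plus itself, and the memory is then refreshed by attending to the block and to its own previous state, so the cost per block does not grow with the total sequence length.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def attend(queries, keys_values):
    """Scaled dot-product attention with identity projections.

    Assumption: no learned weights at all; keys and values are the same vectors.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * k[j] for w, k in zip(weights, keys_values))
                    for j in range(d)])
    return out


def process_with_feedback(tokens, block_size, memory):
    """Process `tokens` block by block, carrying `memory` between blocks.

    Hypothetical helper: each block sees only itself plus the memory, so the
    number of keys per attention call is at most block_size + len(memory).
    """
    outputs = []
    for start in range(0, len(tokens), block_size):
        block = tokens[start:start + block_size]
        # Block tokens attend to the carried memory and to the current block.
        outputs.extend(attend(block, memory + block))
        # Feedback step: the memory attends to the block and to its old state.
        memory = attend(memory, block + memory)
    return outputs, memory


# Usage: five 2-dimensional token vectors, blocks of two, one memory slot.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
outputs, final_memory = process_with_feedback(tokens, block_size=2,
                                              memory=[[0.0, 0.0]])
print(len(outputs), final_memory)
```

Because the memory has a fixed number of slots, the work per block stays constant however long the input is, which is the property the abstract points to when it contrasts the approach with quadratic attention.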


