xLSTM: Extended Long Short-Term Memory | allainews.com

June 1, 2024, 10:23 p.m. | Yannic Kilcher

xLSTM is an architecture that combines the recurrence and constant memory requirements of LSTMs with the large-scale training of transformers, and it achieves impressive results.

Paper: https://arxiv.org/abs/2405.04517

Abstract:
In the 1990s, the constant error carousel and gating were introduced as the central ideas of the Long Short-Term Memory (LSTM). Since then, LSTMs have stood the test of time and contributed to numerous deep learning success stories, in particular they constituted the first Large Language Models (LLMs). However, the advent of the …
