Next-Gen AI: RecurrentGemma (Long Context Length)
April 21, 2024, 2 p.m. | code_your_own_AI
Google has developed RecurrentGemma-2B and compares this new LM architecture with the classical transformer-based Gemma 2B, whose self-attention scales quadratically with sequence length. The new model's throughput is about 6,000 tokens per second.
Introduction and Model Architecture:
The original paper by Google introduces "RecurrentGemma-2B," leveraging the Griffin architecture, which moves away from traditional global attention mechanisms in favor of a combination of linear recurrences and local attention. This design enables the …
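The key ingredients named above (linear recurrences replacing global attention, plus a local sliding-window attention) can be sketched as follows. This is a minimal illustrative NumPy sketch, not the actual Griffin/RecurrentGemma implementation; the real model's gating, block layout, and parameterization are simplified away.

```python
import numpy as np

def linear_recurrence(x, a, b):
    """Elementwise linear recurrence: h_t = a * h_{t-1} + b * x_t.

    Cost is O(T) in sequence length and the state h has a fixed size,
    so memory does not grow with context length (unlike a KV cache).
    """
    T, d = x.shape
    h = np.zeros(d)
    out = np.empty_like(x)
    for t in range(T):
        h = a * h + b * x[t]
        out[t] = h
    return out

def local_attention(x, window):
    """Sliding-window (local) attention: each position attends only to
    the previous `window` positions, giving O(T * window) cost instead
    of the O(T^2) of global self-attention.
    """
    T, d = x.shape
    out = np.empty_like(x)
    for t in range(T):
        lo = max(0, t - window + 1)
        keys = x[lo:t + 1]                   # (w, d) local context
        scores = keys @ x[t] / np.sqrt(d)    # (w,) similarity scores
        w = np.exp(scores - scores.max())    # stable softmax
        w /= w.sum()
        out[t] = w @ keys                    # weighted sum of values
    return out
```

In Griffin, blocks of this recurrent kind are interleaved with local-attention blocks; both have per-step cost independent of total context length, which is what enables the higher throughput on long sequences.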