DeepMind introduces Hawk and Griffin [R]
March 1, 2024, 4:28 a.m. | /u/we_are_mammals
Machine Learning www.reddit.com
**Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models**
Recurrent neural networks (RNNs) have fast inference and scale efficiently on long sequences, but they are difficult to train and hard to scale. We propose Hawk, an RNN with gated linear recurrences, and Griffin, a hybrid model that mixes gated linear recurrences with local attention. Hawk exceeds the reported performance of Mamba on downstream tasks, while Griffin matches the performance of Llama-2 despite being trained on over …
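The key ingredient named in the abstract, a gated linear recurrence, can be illustrated with a minimal NumPy sketch. This is a generic form (h_t = a_t * h_{t-1} + (1 - a_t) * x_t), not the paper's exact RG-LRU layer; the function name and gating scheme here are illustrative assumptions. The point is that the update is linear in the hidden state, which is what makes parallel-scan training and fast inference possible.

```python
import numpy as np

def gated_linear_recurrence(x, gates):
    """Illustrative gated linear recurrence (not the paper's exact layer):
    h_t = a_t * h_{t-1} + (1 - a_t) * x_t, element-wise per channel.

    Because the update is linear in h_{t-1} (no tanh/sigmoid wrapping the
    state), the whole sequence can in principle be computed with a parallel
    scan; the sequential loop below is just the clearest reference version.
    """
    T, d = x.shape
    h = np.zeros(d)
    out = np.zeros_like(x)
    for t in range(T):
        a = gates[t]                  # per-channel gate values in (0, 1)
        h = a * h + (1.0 - a) * x[t]  # linear in h: scan-friendly update
        out[t] = h
    return out

# Toy usage: a constant gate of 0.9 behaves like an exponential moving average.
rng = np.random.default_rng(0)
x = rng.standard_normal((16, 4))
gates = np.full((16, 4), 0.9)
y = gated_linear_recurrence(x, gates)
```

In the hybrid Griffin design described above, layers like this are interleaved with local (sliding-window) attention, so the recurrence carries long-range state cheaply while attention handles precise short-range token interactions.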