March 1, 2024, 4:28 a.m. | /u/we_are_mammals

Machine Learning www.reddit.com

[https://arxiv.org/abs/2402.19427](https://arxiv.org/abs/2402.19427)

**Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models**

Recurrent neural networks (RNNs) have fast inference and scale efficiently on long sequences, but they are difficult to train and hard to scale. We propose Hawk, an RNN with gated linear recurrences, and Griffin, a hybrid model that mixes gated linear recurrences with local attention. Hawk exceeds the reported performance of Mamba on downstream tasks, while Griffin matches the performance of Llama-2 despite being trained on over …
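
The abstract's core idea is a gated *linear* recurrence: the hidden state is updated elementwise by learned gates, so each step is cheap and the memory footprint stays constant in sequence length. The paper's actual recurrence (the RG-LRU block inside Hawk/Griffin) has its own specific gating; the snippet below is only a minimal generic sketch of a gated linear recurrence, with hypothetical weight names `W_a` and `W_i`, to make the mechanism concrete.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_linear_recurrence(x, W_a, W_i, initial_state=None):
    """
    Generic gated linear recurrence over a sequence x of shape (T, d):
        a_t = sigmoid(x_t @ W_a)                 # per-channel forget gate in (0, 1)
        i_t = sigmoid(x_t @ W_i)                 # per-channel input gate
        h_t = a_t * h_{t-1} + (1 - a_t) * (i_t * x_t)
    The recurrence is elementwise (diagonal), so each step costs O(d)
    and the state is a single d-vector -- this is what gives RNN-style
    models fast inference and constant memory in sequence length.
    """
    T, d = x.shape
    h = np.zeros(d) if initial_state is None else initial_state
    out = np.empty_like(x)
    for t in range(T):
        a = sigmoid(x[t] @ W_a)
        i = sigmoid(x[t] @ W_i)
        h = a * h + (1.0 - a) * (i * x[t])
        out[t] = h
    return out

# Tiny usage example with random weights (illustrative values only).
rng = np.random.default_rng(0)
T, d = 16, 8
x = rng.standard_normal((T, d))
W_a = rng.standard_normal((d, d)) * 0.1
W_i = rng.standard_normal((d, d)) * 0.1
y = gated_linear_recurrence(x, W_a, W_i)
print(y.shape)  # (16, 8)
```

In the hybrid Griffin setup described in the abstract, layers like this are interleaved with local (sliding-window) attention layers, so the attention cost is bounded by the window size rather than the full sequence length.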

