[R] HGRN2: Gated Linear RNNs with State Expansion
May 3, 2024, 9:47 a.m. | /u/SeawaterFlows
Machine Learning www.reddit.com
**Code**: [https://github.com/OpenNLPLab/HGRN2](https://github.com/OpenNLPLab/HGRN2)
**Standalone code** (1): [https://github.com/Doraemonzzz/hgru2-pytorch](https://github.com/Doraemonzzz/hgru2-pytorch)
**Standalone code** (2): [https://github.com/sustcsonglin/flash-linear-attention/tree/main/fla/models/hgrn2](https://github.com/sustcsonglin/flash-linear-attention/tree/main/fla/models/hgrn2)
**Abstract**:
>**Hierarchically gated linear RNN** (**HGRN**, Qin et al. 2023) has demonstrated competitive training speed and performance in language modeling, while offering efficient inference. However, the recurrent state size of HGRN remains relatively small, which limits its expressiveness. To address this issue, inspired by linear attention, we introduce a simple outer-product-based state expansion mechanism so that the recurrent state size can be significantly enlarged without introducing any additional …
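The abstract's "outer-product-based state expansion" follows the linear-attention pattern: instead of a vector-valued recurrent state, each step adds an outer product of a key and a value into a matrix state, so state size grows from d to d_k × d_v. A minimal NumPy sketch of that idea, assuming a generic gated recurrence S_t = diag(f_t) S_{t-1} + k_t v_tᵀ with readout o_t = S_tᵀ q_t (an illustrative form, not the exact HGRN2 parameterization or its hierarchical gating):

```python
import numpy as np

def gated_outer_product_rnn(q, k, v, f):
    """Linear-attention-style gated recurrence with outer-product
    state expansion (illustrative, not the exact HGRN2 formulation).

    q, k, f: (T, d_k) arrays; v: (T, d_v) array; f in (0, 1).
    Recurrence:  S_t = diag(f_t) @ S_{t-1} + outer(k_t, v_t)
    Readout:     o_t = S_t.T @ q_t
    Returns outputs of shape (T, d_v).
    """
    T, d_k = k.shape
    d_v = v.shape[1]
    S = np.zeros((d_k, d_v))          # expanded matrix state, d_k x d_v
    out = np.zeros((T, d_v))
    for t in range(T):
        # per-dimension forget gate, then rank-1 state update
        S = f[t][:, None] * S + np.outer(k[t], v[t])
        out[t] = S.T @ q[t]           # query the accumulated state
    return out
```

With f_t ≡ 1 this reduces to vanilla (unnormalized) linear attention; the gate lets the model decay old state per key dimension while the matrix state carries far more information than HGRN's original vector-sized state.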