Web: https://www.reddit.com/r/MachineLearning/comments/umq908/r_rwkvv2rnn_a_parallelizable_rnn_with/

May 10, 2022, 7:11 p.m. | /u/bo_peng

Machine Learning reddit.com

Hi guys. I am an independent researcher, and you might know me (BlinkDL) from the EleutherAI Discord.

I have built an RNN with transformer-level performance, without using attention. Moreover, it supports both sequential and parallel modes for inference and training, so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embedding.

[https://github.com/BlinkDL/RWKV-LM](https://github.com/BlinkDL/RWKV-LM)
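To give a feel for the dual-mode idea, here is a minimal toy sketch (my simplification, not the exact RWKV-v2 formula in the repo, which adds a bonus term for the current token and numerical-stability tricks) of a per-channel, exponentially time-decayed weighted average. The names `k`, `v`, and `w` follow the AFT/RWKV convention, and `w` is a scalar here for brevity. The same quantity is computed once as an RNN recurrence with O(1) state per step, and once in parallel over the whole sequence, and the two agree:

```python
import numpy as np

T, C = 8, 4                  # sequence length, channels
rng = np.random.default_rng(0)
k = rng.normal(size=(T, C))  # "keys"   (per-token, per-channel)
v = rng.normal(size=(T, C))  # "values"
w = 0.5                      # decay rate; a scalar here for brevity

# Sequential (RNN) mode: carry two accumulators per channel.
num = np.zeros(C)
den = np.zeros(C)
seq_out = np.empty((T, C))
for t in range(T):
    num = np.exp(-w) * num + np.exp(k[t]) * v[t]
    den = np.exp(-w) * den + np.exp(k[t])
    seq_out[t] = num / den   # decayed weighted average up to step t

# Parallel mode: the same result via a T x T causal decay matrix,
# so the whole sequence can be computed at once during training.
i = np.arange(T)
decay = np.where(i[None, :] <= i[:, None],
                 np.exp(-w * (i[:, None] - i[None, :])), 0.0)
par_out = (decay @ (np.exp(k) * v)) / (decay @ np.exp(k))

assert np.allclose(seq_out, par_out)  # both modes agree
```

The recurrence is what makes inference cheap (no growing KV cache, just two accumulators per channel), while the matrix form is what makes training parallelizable like a transformer.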

I am training an L24-D1024 RWKV-v2-RNN LM (430M params) on the Pile …

