[R] RWKV-v2-RNN : A parallelizable RNN with transformer-level LM performance, and without using attention
Web: https://www.reddit.com/r/MachineLearning/comments/umq908/r_rwkvv2rnn_a_parallelizable_rnn_with/
May 10, 2022, 7:11 p.m. | /u/bo_peng
Machine Learning reddit.com
I have built an RNN with transformer-level performance, without using attention. Moreover, it supports both sequential and parallel modes for inference and training, so it combines the best of RNNs and transformers: great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
[https://github.com/BlinkDL/RWKV-LM](https://github.com/BlinkDL/RWKV-LM)
I am training an L24-D1024 RWKV-v2-RNN LM (430M params) on the Pile …
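To make the "both sequential & parallel" claim concrete, here is a rough numpy sketch (not the author's implementation) of the attention-free time-mix recurrence described in the repo's README: a per-channel, exponentially decayed weighted average of past values, gated by a sigmoid "receptance". The names `r`, `k`, `v`, `w` follow the repo's conventions, but the shapes and the brute-force parallel check below are simplifications of mine; the real model uses learned per-channel parameters and a custom CUDA kernel for the parallel form.

```python
# Hypothetical, simplified sketch of the RWKV-v2 time-mix recurrence.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def time_mix_sequential(r, k, v, w):
    """RNN mode: O(1) state per channel, so ctx_len is effectively unbounded.

    r, k, v : (T, C) receptance/key/value projections of the input
    w       : (C,)  positive per-channel time-decay
    """
    T, C = k.shape
    num = np.zeros(C)          # running decayed sum of e^k * v
    den = np.zeros(C)          # running decayed sum of e^k
    out = np.empty((T, C))
    for t in range(T):
        num = np.exp(-w) * num + np.exp(k[t]) * v[t]
        den = np.exp(-w) * den + np.exp(k[t])
        out[t] = sigmoid(r[t]) * num / den   # gated decayed weighted average
    return out

def time_mix_parallel(r, k, v, w):
    """Training mode: the same output computed for all positions at once.
    Because the recurrence is just a position-weighted sum (no attention),
    it parallelizes over t; this quadratic loop is only a correctness check,
    not the efficient kernel the repo actually uses."""
    T, C = k.shape
    out = np.empty((T, C))
    for t in range(T):
        decay = np.exp(-w[None, :] * (t - np.arange(t + 1))[:, None])  # (t+1, C)
        wk = decay * np.exp(k[:t + 1])
        out[t] = sigmoid(r[t]) * (wk * v[:t + 1]).sum(0) / wk.sum(0)
    return out

rng = np.random.default_rng(0)
T, C = 8, 4
r, k, v = (rng.standard_normal((T, C)) for _ in range(3))
w = np.abs(rng.standard_normal(C))
assert np.allclose(time_mix_sequential(r, k, v, w),
                   time_mix_parallel(r, k, v, w))
```

The assertion passing is the whole point: the decayed weighted average admits both an O(1)-state recurrent form (for cheap inference) and a position-weighted-sum form (for parallel training), which is what lets the model skip attention entirely.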