Sept. 22, 2023, 10:37 a.m. | /u/Necessary-Bike-4034

Machine Learning www.reddit.com

I am really excited to share our newest work in deep learning: parallelizing RNNs! [https://arxiv.org/abs/2309.12252](https://arxiv.org/abs/2309.12252)

RNNs are widely thought to be non-parallelizable because of their inherently sequential nature: each state depends on the previous state. As a result, training an RNN on long sequences usually takes much longer than training other architecture classes (like CNNs). The bottleneck is visible in the standard evaluation loop, sketched below.
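For concreteness, here is a minimal sketch of plain sequential evaluation (`f` is a generic RNN cell standing in for any concrete architecture; none of these names come from the paper):

```python
import jax.numpy as jnp

def rnn_sequential(f, s0, xs):
    # Plain sequential evaluation: step t cannot begin until step t-1
    # is done, so wall-clock time is O(T) no matter how many cores you have.
    states, s = [], s0
    for x in xs:
        s = f(s, x)          # s_t depends on s_{t-1}: the serial dependency
        states.append(s)
    return jnp.stack(states)
```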

What we present is an algorithm based on Newton's method to evaluate and train RNNs in parallel. In one of our experiments, we achieve >1000x faster evaluation …
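To give a flavor of the idea, here is my rough sketch of the core mechanism, not the paper's implementation (all names below are hypothetical). Each Newton iteration linearizes the nonlinear recurrence s_t = f(s_{t-1}) around the current guess, yielding an affine recurrence s_t = J_t s_{t-1} + b_t. Affine maps compose associatively, so that linear recurrence can be solved with a parallel prefix scan in O(log T) depth instead of T serial steps:

```python
import jax
import jax.numpy as jnp

def parallel_eval(f, s0, num_steps, num_iters=20):
    # Evaluate s_t = f(s_{t-1}) for t = 1..num_steps via Newton's method.
    d = s0.shape[0]
    states = jnp.zeros((num_steps, d))               # initial guess for s_1..s_T

    def affine_compose(e1, e2):
        # Compose batches of affine maps: apply (J1, b1) first, then (J2, b2).
        J1, b1 = e1
        J2, b2 = e2
        return (jnp.einsum('...ij,...jk->...ik', J2, J1),
                jnp.einsum('...ij,...j->...i', J2, b1) + b2)

    def newton_step(states):
        prev = jnp.concatenate([s0[None], states[:-1]])  # s_0 .. s_{T-1}
        Js = jax.vmap(jax.jacfwd(f))(prev)               # J_t = df/ds at s_{t-1}
        fs = jax.vmap(f)(prev)                           # f(s_{t-1})
        bs = fs - jnp.einsum('tij,tj->ti', Js, prev)     # linearization offsets
        # Solve s_t = J_t s_{t-1} + b_t for all t at once with a parallel scan.
        Jc, bc = jax.lax.associative_scan(affine_compose, (Js, bs))
        return jnp.einsum('tij,j->ti', Jc, s0) + bc      # s_t = Jc_t @ s_0 + bc_t

    # Error decreases with num_iters; num_iters = T recovers the exact
    # sequential result, since each iteration makes >= 1 more step exact.
    for _ in range(num_iters):
        states = newton_step(states)
    return states

# Toy check against the sequential loop (all values hypothetical):
key = jax.random.PRNGKey(0)
d, T = 4, 64
A = jax.random.normal(key, (d, d)) / (4 * jnp.sqrt(d))
f = lambda s: jnp.tanh(A @ s)                 # stand-in RNN cell; per-step
s0 = jnp.ones(d)                              # inputs could be closed over in f

par = parallel_eval(f, s0, T)
seq, s = [], s0
for _ in range(T):
    s = f(s)
    seq.append(s)
print(jnp.max(jnp.abs(jnp.stack(seq) - par))) # small after convergence
```

Each Newton iteration propagates exact information at least one step further along the sequence, so the loop needs at most T iterations, and far fewer in practice for contracting dynamics; meanwhile every iteration runs in logarithmic parallel depth.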
