Sept. 22, 2023, 10:37 a.m. | /u/Necessary-Bike-4034

Machine Learning www.reddit.com

I am really excited to share our newest work in deep learning: parallelizing RNNs! [https://arxiv.org/abs/2309.12252](https://arxiv.org/abs/2309.12252)

RNNs are widely thought to be non-parallelizable because of their inherently sequential nature: each state depends on the previous state. As a result, training an RNN on long sequences usually takes much longer than training other architecture classes (like CNNs). The bottleneck is visible in the standard evaluation loop, sketched below.
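For concreteness, here is a minimal sketch of plain sequential evaluation (`f` is a generic RNN cell standing in for any concrete architecture; none of these names come from the paper):

```python
import jax.numpy as jnp

def rnn_sequential(f, s0, xs):
    # Plain sequential evaluation: step t cannot begin until step t-1
    # is done, so wall-clock time is O(T) no matter how many cores you have.
    states, s = [], s0
    for x in xs:
        s = f(s, x)          # s_t depends on s_{t-1}: the serial dependency
        states.append(s)
    return jnp.stack(states)
```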

What we present is an algorithm based on Newton's method to evaluate and train RNNs in parallel. In one of our experiments, we achieve >1000x faster evaluation …
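To give a flavor of the idea, here is my rough sketch of the core mechanism, not the paper's implementation (all names below are hypothetical). Each Newton iteration linearizes the nonlinear recurrence s_t = f(s_{t-1}) around the current guess, yielding an affine recurrence s_t = J_t s_{t-1} + b_t. Affine maps compose associatively, so that linear recurrence can be solved with a parallel prefix scan in O(log T) depth instead of T serial steps:

```python
import jax
import jax.numpy as jnp

def parallel_eval(f, s0, num_steps, num_iters=20):
    # Evaluate s_t = f(s_{t-1}) for t = 1..num_steps via Newton's method.
    d = s0.shape[0]
    states = jnp.zeros((num_steps, d))               # initial guess for s_1..s_T

    def affine_compose(e1, e2):
        # Compose batches of affine maps: apply (J1, b1) first, then (J2, b2).
        J1, b1 = e1
        J2, b2 = e2
        return (jnp.einsum('...ij,...jk->...ik', J2, J1),
                jnp.einsum('...ij,...j->...i', J2, b1) + b2)

    def newton_step(states):
        prev = jnp.concatenate([s0[None], states[:-1]])  # s_0 .. s_{T-1}
        Js = jax.vmap(jax.jacfwd(f))(prev)               # J_t = df/ds at s_{t-1}
        fs = jax.vmap(f)(prev)                           # f(s_{t-1})
        bs = fs - jnp.einsum('tij,tj->ti', Js, prev)     # linearization offsets
        # Solve s_t = J_t s_{t-1} + b_t for all t at once with a parallel scan.
        Jc, bc = jax.lax.associative_scan(affine_compose, (Js, bs))
        return jnp.einsum('tij,j->ti', Jc, s0) + bc      # s_t = Jc_t @ s_0 + bc_t

    # Error decreases with num_iters; num_iters = T recovers the exact
    # sequential result, since each iteration makes >= 1 more step exact.
    for _ in range(num_iters):
        states = newton_step(states)
    return states

# Toy check against the sequential loop (all values hypothetical):
key = jax.random.PRNGKey(0)
d, T = 4, 64
A = jax.random.normal(key, (d, d)) / (4 * jnp.sqrt(d))
f = lambda s: jnp.tanh(A @ s)                 # stand-in RNN cell; per-step
s0 = jnp.ones(d)                              # inputs could be closed over in f

par = parallel_eval(f, s0, T)
seq, s = [], s0
for _ in range(T):
    s = f(s)
    seq.append(s)
print(jnp.max(jnp.abs(jnp.stack(seq) - par))) # small after convergence
```

Each Newton iteration propagates exact information at least one step further along the sequence, so the loop needs at most T iterations, and far fewer in practice for contracting dynamics; meanwhile every iteration runs in logarithmic parallel depth.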
