March 14, 2024, 6:23 p.m. | /u/Sinestro101

Machine Learning www.reddit.com

I’m trying to gain a deeper understanding of the concept of memory in RNNs (and their variants) and in Transformers.

Aside from the architectural differences between the plain RNN, GRU and LSTM, memory is basically the input sequence processed through some mathematical function and passed along sequentially as input to the next time step (alongside the input X_t), sort of as a prior representation of the data.

From this technical perspective, memory seems constrained to the length of the …
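To make that picture concrete, here is a minimal sketch of the plain-RNN recurrence I have in mind (the weight names W_xh, W_hh and the toy dimensions are my own, purely for illustration): the hidden state h is the only thing carried from one time step to the next, so it is the "memory".

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One plain-RNN step: the new hidden state is a function of the current
    input x_t and the previous hidden state h_prev -- that hidden state is
    the only 'memory' carried forward."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Toy dimensions, illustrative only
input_dim, hidden_dim, seq_len = 4, 8, 5
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(hidden_dim, input_dim))
W_hh = rng.normal(size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                      # initial memory: nothing seen yet
for x_t in rng.normal(size=(seq_len, input_dim)):
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)     # memory is overwritten each step

# After the loop, h is a fixed-size summary of the whole sequence.
```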

