Feb. 11, 2024, 11:35 a.m. | /u/mono1110

Deep Learning www.reddit.com

For example, take the transformer architecture or the attention mechanism. How did the researchers know that combining self-attention with layer normalisation and positional encoding would give models that outperform LSTMs and CNNs?

I am asking this from the perspective of mathematics. Currently I feel like I could never come up with something new, and that there is something missing which AI researchers know and I don't.

So what do I need to know that will allow me to solve problems in new …
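For concreteness, here is a minimal sketch of the pieces the post names, self-attention, layer normalisation, and sinusoidal positional encoding, composed into one encoder-style block. It is written in plain NumPy; the function names, shapes, and random weights are illustrative assumptions, not the reference implementation from the original paper, and the feed-forward sub-layer and multi-head splitting are omitted for brevity.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encoding: even dims use sine, odd dims use cosine.
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(d_model)[None, :]            # (1, d_model)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def layer_norm(x, eps=1e-5):
    # Normalise each token vector to zero mean and unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def self_attention(x, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ v

def encoder_block(x, Wq, Wk, Wv):
    # Residual connection plus layer norm around the attention sub-layer,
    # mirroring the combination the post asks about.
    return layer_norm(x + self_attention(x, Wq, Wk, Wv))

# Toy usage: 4 tokens, model dimension 8, random projection weights.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model)) + positional_encoding(seq_len, d_model)
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(encoder_block(x, Wq, Wk, Wv).shape)  # (4, 8)
```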

