[D] What would you recommend testing new general approaches (architectures/optimisers) on?

April 10, 2024, 8:20 p.m. | /u/LahmacunBear

Machine Learning www.reddit.com

A lot of my work so far has been on optimisers and architectures, but I have only ever tested them on small token-prediction language tasks when publishing findings. What would you need to see to be convinced that a novel general approach was truly superior? Suggestions for specific datasets, model sizes, and relevant benchmarks would be much appreciated.
