April 10, 2024, 8:20 p.m. | /u/LahmacunBear

Machine Learning www.reddit.com

A lot of my work so far has been on optimisers and architectures, but I have only ever tested them on small token-prediction language tasks when publishing findings. What would you need to see to be convinced that a novel general approach is truly superior? Specific datasets, model sizes, and relevant benchmarks would be much appreciated.

