Feb. 14, 2024, 5:43 a.m. | Liam Collins, Hamed Hassani, Mahdi Soltanolkotabi, Aryan Mokhtari, Sanjay Shakkottai

cs.LG updates on arXiv.org

An increasingly popular machine learning paradigm is to pretrain a neural network (NN) on many tasks offline, then adapt it to downstream tasks, often by re-training only the last linear layer of the network. This approach yields strong downstream performance in a variety of contexts, demonstrating that multitask pretraining leads to effective feature learning. Although several recent theoretical studies have shown that shallow NNs learn meaningful features when either (i) they are trained on a single task or (ii) …
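For concreteness, here is a minimal sketch of this pretrain-then-adapt recipe: multitask pretraining of a two-layer ReLU network followed by downstream adaptation that retrains only the last linear layer. It uses PyTorch on synthetic data; the dimensions, task construction, and training loops are illustrative assumptions, not the paper's setup.

```python
# Sketch only: multitask pretraining + last-layer (linear-probe) adaptation.
# All sizes, data, and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
d, k, r, n_tasks, n_per_task = 20, 64, 5, 10, 100

# Synthetic data: every task shares one ReLU feature map but has its own
# linear head on top of it (an assumption made purely for this demo).
shared_proj = torch.randn(d, r)

def make_task():
    head = torch.randn(r, 1)
    X = torch.randn(n_per_task, d)
    y = torch.relu(X @ shared_proj) @ head
    return X, y

class TwoLayerReLU(nn.Module):
    def __init__(self, d, k, n_outputs):
        super().__init__()
        self.features = nn.Linear(d, k)      # first layer: learned representation
        self.head = nn.Linear(k, n_outputs)  # last linear layer: one output per task
    def forward(self, x):
        return self.head(torch.relu(self.features(x)))

# Stage 1: multitask pretraining, all layers trained jointly.
model = TwoLayerReLU(d, k, n_tasks)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
tasks = [make_task() for _ in range(n_tasks)]
for step in range(500):
    loss = sum(
        nn.functional.mse_loss(model(X)[:, t : t + 1], y)
        for t, (X, y) in enumerate(tasks)
    )
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: downstream adaptation by retraining only the last linear layer.
X_new, y_new = make_task()                   # a fresh, unseen task
for p in model.features.parameters():        # freeze the pretrained features
    p.requires_grad_(False)
probe = nn.Linear(k, 1)                      # new last layer for the downstream task
opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
for step in range(300):
    pred = probe(torch.relu(model.features(X_new)))
    loss = nn.functional.mse_loss(pred, y_new)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"downstream MSE after linear-probe adaptation: {loss.item():.4f}")
```

If the pretrained first layer has captured the features shared across tasks, the frozen-feature probe in stage 2 should fit the new task well despite updating only the final linear layer.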

