March 6, 2024, 5:42 a.m. | Manfred K. Warmuth (Google Inc), Wojciech Kotłowski (Institute of Computing Science, Poznan University of Technology, Poznan, Poland), Matt Jones (Universi…)

cs.LG updates on arXiv.org

arXiv:2403.02697v1 Announce Type: cross
Abstract: It is well known that the class of rotation invariant algorithms is suboptimal even for learning sparse linear problems when the number of examples is below the "dimension" of the problem. This class includes any gradient-descent-trained neural net with a fully-connected input layer (initialized with a rotationally symmetric distribution). The simplest sparse problem is learning a single feature out of $d$ features. In that case the classification error or regression loss grows with …
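
The setting in the abstract can be illustrated with a small toy sketch (my own illustration under assumed settings, not the paper's experiments): with a single relevant feature out of d = 100 and only n = 20 examples, plain gradient descent on a linear model from a rotationally symmetric Gaussian initialization fits the training data but spreads weight across many coordinates, so the regression loss on fresh data stays large.

```python
import numpy as np

# Toy illustration (assumed settings, not from the paper): a sparse target
# with one relevant feature out of d, observed through n < d examples.
rng = np.random.default_rng(0)
d, n = 100, 20
X = rng.standard_normal((n, d))
w_star = np.zeros(d)
w_star[0] = 1.0                 # only feature 0 carries signal
y = X @ w_star                  # noiseless labels for simplicity

# Gradient descent on the squared loss from a rotationally symmetric init.
w = 0.01 * rng.standard_normal(d)
lr = 0.01
for _ in range(5000):
    grad = X.T @ (X @ w - y) / n
    w -= lr * grad

# Training loss is driven to ~0, but the iterate stays close to the
# minimum-norm interpolant, which is not sparse, so test loss remains high.
X_test = rng.standard_normal((10000, d))
print("train loss:", np.mean((X @ w - y) ** 2))
print("test loss: ", np.mean((X_test @ w - X_test @ w_star) ** 2))
```

A sparsity-aware method (e.g. lasso or a multiplicative update) can typically recover the single feature from the same handful of examples, which is the contrast the abstract draws for rotation invariant algorithms.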
