Feb. 12, 2024, 5:42 a.m. | Jascha Sohl-Dickstein

cs.LG updates on arXiv.org arxiv.org

Some fractals -- for instance those associated with the Mandelbrot and quadratic Julia sets -- are computed by iterating a function, and identifying the boundary between hyperparameters for which the resulting series diverges or remains bounded. Neural network training similarly involves iterating an update function (e.g. repeated steps of gradient descent), can result in convergent or divergent behavior, and can be extremely sensitive to small changes in hyperparameters. Motivated by these similarities, we experimentally examine the boundary between neural network …

cs.lg cs.ne fractal fractals function gradient instance julia network network training neural network nlin.cd series training update

Senior Machine Learning Engineer

@ GPTZero | Toronto, Canada

Software Engineer III -Full Stack Developer - ModelOps, MLOps

@ JPMorgan Chase & Co. | NY, United States

Senior Lead Software Engineer - Full Stack Senior Developer - ModelOps, MLOps

@ JPMorgan Chase & Co. | NY, United States

Software Engineer III - Full Stack Developer - ModelOps, MLOps

@ JPMorgan Chase & Co. | NY, United States

Research Scientist (m/w/d) - Numerische Simulation Laser-Materie-Wechselwirkung

@ Fraunhofer-Gesellschaft | Freiburg, DE, 79104

Research Scientist, Speech Real-Time Dialog

@ Google | Mountain View, CA, USA