Web: https://www.reddit.com/r/MachineLearning/comments/shh1a0/d_orthogonal_random_features_or_naive_linear/

Feb. 1, 2022, 12:30 a.m. | /u/you-get-an-upvote

Machine Learning reddit.com

Linear models are some of the most useful tools in all of statistics. Unfortunately, as the number of features grows, confidence in the estimated parameters drops quickly. When the number of parameters exceeds the number of datapoints, the ordinary least-squares fit is no longer unique and the model loses all statistical validity.
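As a rough illustration of why more parameters than datapoints is a problem, the sketch below builds a random design matrix with n < d and checks the rank of X.T @ X. The sizes, the seed, and the names `gram` and `rank` are assumptions chosen for the example, not taken from the post.

```python
import numpy as np

# Hypothetical sizes: fewer datapoints (n) than features (d).
rng = np.random.default_rng(0)
n, d = 5, 8
X = rng.normal(size=(n, d))

# The d-by-d Gram matrix that the closed-form solution would need to invert.
gram = X.T @ X

# Its rank can never exceed n, so with n < d it is singular.
rank = np.linalg.matrix_rank(gram)
print(f"rank of X.T @ X: {rank}, number of features: {d}")
```

Because the rank is below d, X.T @ X has no inverse, and infinitely many weight vectors fit the training data equally well.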

It is instructive to take a closer look at the closed-form solution for linear regression:

w = np.linalg.inv(X.T @ X) @ X.T @ y

(Note that X is an n-by-d matrix and y is an n-by-1 …
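The closed-form solution above can be tried on synthetic data. This is a minimal sketch, assuming X is n-by-d and y is n-by-1 as the note says; the data, the seed, and the names `true_w`, `w_closed`, `w_solve`, and `w_lstsq` are made up for illustration. It also compares the explicit inverse with `np.linalg.solve` and `np.linalg.lstsq`, which are usually preferred numerically.

```python
import numpy as np

# Synthetic data (an assumption for this example): n datapoints, d features.
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))                      # n-by-d design matrix
true_w = np.array([[1.5], [-2.0], [0.5]])        # hypothetical true weights, d-by-1
y = X @ true_w + 0.1 * rng.normal(size=(n, 1))   # n-by-1 targets with small noise

# Closed-form solution exactly as written in the post.
w_closed = np.linalg.inv(X.T @ X) @ X.T @ y

# Same normal equations, solved without forming the inverse explicitly.
w_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Least-squares solver working on X directly.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(w_closed.ravel())
```

With n much larger than d, all three approaches should agree closely and land near `true_w`; they diverge in behaviour only when X.T @ X is ill-conditioned or singular.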
