Web: https://www.reddit.com/r/MachineLearning/comments/shh1a0/d_orthogonal_random_features_or_naive_linear/

Feb. 1, 2022, 12:30 a.m. | /u/you-get-an-upvote

Machine Learning reddit.com

Linear models are some of the most useful in all of statistics. Unfortunately, as the number of features grows large, confidence in your parameter estimates drops quickly. When the number of parameters is greater than the number of datapoints, the least-squares problem no longer has a unique solution, and the usual statistical guarantees no longer apply.

It is instructive to take a closer look at the closed-form solution for linear regression:

w = np.linalg.inv(X.T @ X) @ X.T @ y

(Note that X is an n-by-d matrix and y is an n-by-1 …
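As a rough sketch of the closed form above, here is a small NumPy example on synthetic data. The sizes, the seed, and names such as `w_true` and `X_wide` are made up for illustration and are not from the original post. It checks the explicit-inverse formula against `np.linalg.solve` and `np.linalg.lstsq`, then shows that `X.T @ X` cannot be full rank once there are more features than datapoints.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; synthetic data for illustration only

n, d = 100, 5                     # assumed sizes: n datapoints, d features
X = rng.normal(size=(n, d))       # n-by-d design matrix
w_true = rng.normal(size=(d, 1))  # hypothetical ground-truth weights
y = X @ w_true + 0.1 * rng.normal(size=(n, 1))  # n-by-1 noisy targets

# Closed form from the post (normal equations with an explicit inverse).
w_inv = np.linalg.inv(X.T @ X) @ X.T @ y

# Same solution without forming the inverse, usually better behaved numerically.
w_solve = np.linalg.solve(X.T @ X, X.T @ y)
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# More features than datapoints: X_wide.T @ X_wide is 20-by-20,
# but its rank is at most the number of rows of X_wide (10), so it is singular.
X_wide = rng.normal(size=(10, 20))
rank_wide = np.linalg.matrix_rank(X_wide.T @ X_wide)

print(np.allclose(w_inv, w_solve), np.allclose(w_inv, w_lstsq))
print(rank_wide)
```

In practice `np.linalg.lstsq` (or `solve`) is generally preferable to calling `inv` directly; the explicit inverse is kept here only to mirror the formula in the post.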

