Sept. 28, 2022

I am reading this [blog](https://towardsdatascience.com/learning-theory-empirical-risk-minimization-d3573f90ff77) and I am not able to understand the following inequality.



As I understand it, we first form the set of all hypotheses whose true error is greater than epsilon. From that set, we then build a separate set, M, whose members have empirical error 0. How is S a subset of M, and how do we get this inequality?
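
For reference, here is a sketch of how this step usually goes in the realizable-case ERM analysis. The notation ($\mathcal{H}_B$, $S|_x$, $h_S$, $L_{(\mathcal{D},f)}$, $L_S$) is an assumption based on the standard textbook treatment, and the blog may define things slightly differently:

$$
\begin{aligned}
\mathcal{H}_B &= \{\, h \in \mathcal{H} : L_{(\mathcal{D},f)}(h) > \epsilon \,\} \\
M &= \{\, S|_x : \exists h \in \mathcal{H}_B,\ L_S(h) = 0 \,\} \\
\{\, S|_x : L_{(\mathcal{D},f)}(h_S) > \epsilon \,\} &\subseteq M \\
\mathcal{D}^m\big(\{\, S|_x : L_{(\mathcal{D},f)}(h_S) > \epsilon \,\}\big) &\le \mathcal{D}^m(M)
\end{aligned}
$$

Under these assumed definitions, M is a set of training samples (the "misleading" ones), not a set of hypotheses, and the subset claim concerns the set of samples on which the ERM output $h_S$ has true error above $\epsilon$, not a single sample S. The inclusion would hold because, under realizability, $L_S(h_S) = 0$, so any sample with $h_S \in \mathcal{H}_B$ already has a bad hypothesis with zero empirical error. The inequality would then follow from monotonicity of probability: if $A \subseteq B$, then $\mathcal{D}^m(A) \le \mathcal{D}^m(B)$.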

