June 29, 2022, 2:42 p.m. | /u/stick_shift95

Data Science www.reddit.com

Let’s say that we have a categorical variable with 3 categories: unfurnished, semi-furnished, and furnished.
We use pandas.get_dummies() to one-hot encode this variable, and because of drop_first=True no separate binary column is created for ‘unfurnished’.
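
To make the setup concrete, here is a minimal sketch of the encoding described above. The toy data and the column name `furnishing` are made up for illustration. It also assumes the column is a Categorical with "unfurnished" listed first: on a plain string column pandas sorts the levels alphabetically, so drop_first=True would drop "furnished" instead.

```python
import pandas as pd

# Hypothetical toy data; the column name "furnishing" is an assumption.
levels = ["unfurnished", "semi-furnished", "furnished"]
df = pd.DataFrame({
    "furnishing": pd.Categorical(
        ["unfurnished", "semi-furnished", "furnished", "unfurnished"],
        categories=levels,  # "unfurnished" is listed first, so it is the level that gets dropped
    )
})

# One-hot encode and drop the first level, which becomes the baseline.
dummies = pd.get_dummies(df["furnishing"], drop_first=True, dtype=int)
print(list(dummies.columns))  # ['semi-furnished', 'furnished']

# Rows with 0 in every remaining column belong to the dropped baseline level.
is_unfurnished = dummies.sum(axis=1) == 0
print(is_unfurnished.tolist())  # [True, False, False, True]
```

The dropped level is not lost: it is the case where every remaining dummy column is 0, so the other columns are read relative to it.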

Let's imagine that "unfurnished" is actually a very important feature in the model. How would I know this, given that I removed unfurnished as per the example above? Yes, we can still tell which rows are unfurnished by looking at the data, but for example if …

datascience pandas
