April 28, 2024, 12:19 a.m. | /u/Complete_Course_9939

Data Science www.reddit.com

Hey everyone,

I’m relatively new to data science and currently working on a project that involves a dataset with over 60 columns. Many of these columns are categorical, with more than 100 unique values each.

My issue arises when I try to apply one-hot encoding to these categorical columns. It seems like I’m running into the curse of dimensionality problem, and I’m not quite sure how to proceed from here.

I’d really appreciate some advice or guidance on how to …

advice apply categorical data data science datascience dataset encoding hey hot issue project science unique values

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York