Jan. 28, 2022

Data Science

I have a data frame arriving next week and I need to start working on it. The first step will be to clean it. My question is: what do you usually look for when cleaning a data set? Duplicates and formatting problems, and what else?

I need guidance on how to start and what to look for.

Also, when you remove duplicate rows, how do you make sure they are true duplicates and not just separate records that happen to have identical values?
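To make the duplicate question concrete, here is a rough sketch of the distinction I mean. It is plain Python so it runs without any library. The sample rows, the column layout (order_id, customer, amount) and both helper names are made up for illustration, and it assumes the first column is a unique identifier, which may not be true of my real data.

```python
from collections import Counter

# Hypothetical sample data: each row is (order_id, customer, amount).
rows = [
    (101, "Ana", 25.0),
    (102, "Ben", 40.0),
    (102, "Ben", 40.0),  # same order_id and same values: likely a true duplicate
    (103, "Ana", 25.0),  # same customer and amount as 101, but a different order
]

def exact_duplicates(rows):
    """Return rows that appear more than once across ALL columns."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

def repeated_without_key(rows, key_index=0):
    """Return value combinations that repeat once the key column is ignored."""
    stripped = Counter(row[:key_index] + row[key_index + 1:] for row in rows)
    return {vals: n for vals, n in stripped.items() if n > 1}

full_dupes = exact_duplicates(rows)        # candidates for safe removal
value_repeats = repeated_without_key(rows)  # need a closer look before removing

print(full_dupes)
print(value_repeats)
```

In this sketch only order 102 is repeated across every column, while orders 101 and 103 merely share values. As far as I know, the pandas equivalents would be `df.duplicated(keep=False)` for the first check and `df.duplicated(subset=[...], keep=False)` for the second, but I have not tried them on the real data yet.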

