Data Cleaning: Automatically Removing Bad Data
April 7, 2022, 1:49 p.m. | Brian Mattis
Towards Data Science - Medium towardsdatascience.com
Purging wrong data-type entries from numeric and character columns
Cleaning data is almost always one of the first steps you need to take after importing your dataset. Pandas has lots of great functions for cleaning, such as isnull(), dropna(), drop_duplicates(), and many more. However, there are two major situations that aren’t covered:
- A would-be numeric column is littered with strings and booleans
- An aspiring character column has sporadic numeric values and booleans
This bad …
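The two situations listed above can be sketched roughly as follows. This is a minimal illustration, not the article's own method (the excerpt is cut off before any code appears). The helper names `purge_non_numeric` and `purge_non_string`, the sample column names, and the choice to replace bad entries with NaN rather than drop rows are all assumptions made for the example.

```python
import numbers

import numpy as np
import pandas as pd


def purge_non_numeric(series: pd.Series) -> pd.Series:
    """Hypothetical helper: blank out strings and booleans in a would-be numeric column."""
    # bool is a subclass of int in Python, so it has to be excluded explicitly.
    is_number = series.map(
        lambda v: isinstance(v, numbers.Number)
        and not isinstance(v, (bool, np.bool_))
    )
    # Keep real numbers, turn everything else into NaN, then cast to a numeric dtype.
    return pd.to_numeric(series.where(is_number), errors="coerce")


def purge_non_string(series: pd.Series) -> pd.Series:
    """Hypothetical helper: blank out numbers and booleans in a would-be character column."""
    is_text = series.map(lambda v: isinstance(v, str))
    # Keep strings, turn everything else into NaN.
    return series.where(is_text)


# Made-up sample data with wrong-type entries mixed into each column.
df = pd.DataFrame(
    {
        "price": [1.5, "oops", True, 3],
        "name": ["ann", 7, False, "bob"],
    }
)

cleaned = df.assign(
    price=purge_non_numeric(df["price"]),
    name=purge_non_string(df["name"]),
)
```

After this runs, the offending cells in `cleaned` are NaN, so the usual tools such as isnull() and dropna() can take over. Note that this sketch treats a numeric-looking string like "12" as bad data; whether that is the right call depends on the dataset.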