April 7, 2022, 1:49 p.m. | Brian Mattis

Towards Data Science - Medium towardsdatascience.com

Purging wrong data-type entries from numeric and character columns


Cleaning data is almost always one of the first steps you need to take after importing your dataset. Pandas has lots of great functions for cleaning, such as isnull(), dropna(), and drop_duplicates(), among many others. However, there are two major situations that aren’t covered:

  • A would-be numeric column is littered with strings and booleans
  • An aspiring character column has sporadic numeric values and booleans
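
The two situations above can be sketched as follows. This is a minimal illustration, not necessarily the author's approach (the original text is truncated here). The DataFrame, the column names price and city, and the helper is_real_number are all hypothetical. The idea is to test each entry's Python type and replace anything of the wrong type with a missing value, which dropna() can then handle.

```python
import pandas as pd

# Hypothetical example data: column names and values are illustrative only.
df = pd.DataFrame({
    "price": [10.5, "n/a", True, 7, "12"],            # would-be numeric column
    "city": ["Austin", 42, False, "Melbourne", 3.14],  # aspiring character column
})


def is_real_number(value):
    """Return True for ints and floats, but not booleans.

    bool is a subclass of int in Python, so it must be excluded explicitly.
    """
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Numeric column: keep genuine numbers, turn everything else into NaN.
# Note this strict check also rejects numeric-looking strings such as "12".
numeric_mask = df["price"].map(is_real_number)
df["price_clean"] = pd.to_numeric(df["price"].where(numeric_mask), errors="coerce")

# Character column: keep only str entries, turn everything else into NaN.
string_mask = df["city"].map(lambda value: isinstance(value, str))
df["city_clean"] = df["city"].where(string_mask)

print(df[["price_clean", "city_clean"]])
```

Whether a string like "12" should be rejected or parsed as a number depends on the dataset; the strict type check here is one possible choice.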

This bad …

