April 20, 2024, 8:10 a.m. | Keith Galli

Keith Galli www.youtube.com

I'm prepping a dataset for an upcoming tutorial and I figured walking through the process of cleaning it would work well for a livestream! We use various Python Pandas functions to accomplish our data cleaning goals.

We'll be working off of this repo:
https://github.com/KeithGalli/Olympics-Dataset

Some topics that we cover:
- How you can use web scraping to collect data like this (Python's BeautifulSoup library)
- Splitting strings into separate columns
- Using regular expressions (regexes) to extract specific details from columns …
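
As a rough illustration of the string-splitting and regex topics above, here is a minimal sketch in pandas. The column names and sample values are hypothetical and made up for this example; they are not taken from the repo, and the livestream may approach these steps differently.

```python
import pandas as pd

# Hypothetical sample rows; names and values are invented for illustration.
df = pd.DataFrame({
    "name": ["Jane Doe", "John Smith"],
    "measurements": ["170 cm / 60 kg", "182 cm / 75 kg"],
    "born": ["12 March 1990 in Paris (FRA)", "5 July 1985 in Austin (USA)"],
})

# Split one string column into two new columns on the first space.
df[["first_name", "last_name"]] = df["name"].str.split(" ", n=1, expand=True)

# Use named regex groups to pull height and weight out of a free-text column.
hw = df["measurements"].str.extract(
    r"(?P<height_cm>\d+)\s*cm\s*/\s*(?P<weight_kg>\d+)\s*kg"
)
df["height_cm"] = pd.to_numeric(hw["height_cm"])
df["weight_kg"] = pd.to_numeric(hw["weight_kg"])

# Extract a three-letter country code that appears in parentheses.
df["country"] = df["born"].str.extract(r"\(([A-Z]{3})\)", expand=False)

print(df[["first_name", "last_name", "height_cm", "weight_kg", "country"]])
```

`str.extract` returns strings, so the numeric columns are converted with `pd.to_numeric` before any arithmetic. Rows that do not match the pattern come back as missing values rather than raising an error.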

