July 31, 2023, 1:19 p.m. | Shivamshinde

Towards AI - Medium pub.towardsai.net

In this article, I will walk step by step, with code, through one of the most important parts of a data science project's lifecycle: exploratory data analysis.


We will use the Kaggle Spaceship Titanic dataset to demonstrate exploratory data analysis (EDA).

The first step in a machine learning project is to explore the data. Let’s start.

First, import some basic libraries.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt …
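To give a rough idea of what a first pass over the data might involve, here is a minimal sketch that reports the shape, counts missing values per column, and flags numeric outliers with the common 1.5 * IQR rule. It is not taken from the original article: the helper name first_look and the tiny toy DataFrame are hypothetical stand-ins, used only so the example runs without downloading the Kaggle files.

```python
import numpy as np
import pandas as pd


def first_look(df: pd.DataFrame) -> dict:
    """Collect a few basic EDA summaries (hypothetical helper, for illustration)."""
    numeric = df.select_dtypes(include="number")

    # IQR rule: a value is flagged when it lies more than 1.5 * IQR
    # below the first quartile or above the third quartile.
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    outside = (numeric < q1 - 1.5 * iqr) | (numeric > q3 + 1.5 * iqr)

    return {
        "shape": df.shape,                        # (rows, columns)
        "missing": df.isna().sum().to_dict(),     # missing values per column
        "outliers": outside.sum().to_dict(),      # flagged values per numeric column
    }


# Toy data standing in for the real dataset (values are made up).
toy = pd.DataFrame({
    "Age": [22.0, 35.0, np.nan, 41.0, 29.0, 300.0],
    "HomePlanet": ["Earth", "Mars", "Earth", None, "Europa", "Earth"],
})

summary = first_look(toy)
print(summary)
```

With the real data, the same helper could be applied to whatever DataFrame pd.read_csv returns for the downloaded training file. The 1.5 multiplier is a convention rather than a rule, so it is worth adjusting to the distribution at hand.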

