Jan. 17, 2022, 4:21 p.m. | Edwin Tan

Towards Data Science - Medium towardsdatascience.com

Stop using Pandas

Photo by Erik Mclean on Unsplash

Pandas has become the de facto library for data manipulation in Python and is widely used by data scientists and analysts. However, when a dataset is too large, Pandas may run into memory errors. Here are 8 alternatives to Pandas for dealing with large datasets. For each alternative library, we will examine how to load data from CSV and perform a simple groupby operation. Fortunately many …
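To make the comparison task concrete, here is a minimal sketch of loading a CSV and running a simple groupby in Pandas itself. The column names `category` and `amount` and the inline sample data are hypothetical, not taken from the article. The chunked variant is likewise an assumption added for illustration: it is one common way to keep memory use bounded within Pandas, not one of the alternatives the article covers.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a large CSV file on disk.
CSV_TEXT = """category,amount
a,1
b,2
a,3
c,4
b,5
"""


def groupby_sum_in_memory(source):
    """Load the whole CSV at once, then group by category and sum."""
    df = pd.read_csv(source)
    return df.groupby("category")["amount"].sum()


def groupby_sum_chunked(source, chunksize=2):
    """Read the CSV in fixed-size chunks and combine per-chunk partial sums."""
    partials = [
        chunk.groupby("category")["amount"].sum()
        for chunk in pd.read_csv(source, chunksize=chunksize)
    ]
    # A key can appear in several chunks, so the partial sums are summed again.
    return pd.concat(partials).groupby(level=0).sum()


full = groupby_sum_in_memory(io.StringIO(CSV_TEXT))
chunked = groupby_sum_chunked(io.StringIO(CSV_TEXT))
```

Both functions return the same per-category totals. The chunked version only ever holds `chunksize` rows plus the partial results in memory, which works for aggregations such as sums and counts that can be combined across chunks.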

big data data science datasets machine learning pandas processing python
