Demystifying the Parquet File Format

Aug. 17, 2022, 2:19 p.m. | Michael Berk

Towards Data Science - Medium towardsdatascience.com

The default file format for any data science workflow

Have you ever used pd.read_csv() in pandas? That call could have run ~50x faster if the data had been stored as Parquet instead of CSV.

In this post we will discuss Apache Parquet, an extremely efficient and well-supported file format. The post is geared towards data practitioners (ML, DE, DS), so we’ll focus on high-level concepts and use SQL to talk through them, but links …
