June 6, 2022, 2:45 p.m. | /u/Evolving_Richie

Data Science | www.reddit.com

Hi all,

Pretty much the title: if I have a good amount of data (say, 3 million rows, 10 columns), Parquet will on average produce a smaller file and as such be faster to read into R or Python or whatever. In what situations would I prefer CSV?

The ones I can think of are:

- I occasionally want to open the file in Excel (but opening a file this big risks crashing my computer)
- I work with someone who prefers CSV …
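For concreteness, here's a minimal sketch of the size/speed comparison I mean, assuming pandas with pyarrow installed for the Parquet side (filenames are just placeholders):

```python
import numpy as np
import pandas as pd

# Toy frame roughly matching the shape in the question:
# 3 million rows, 10 numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((3_000_000, 10)),
                  columns=[f"col{i}" for i in range(10)])

# Write both formats; to_parquet needs pyarrow or fastparquet.
df.to_csv("data.csv", index=False)
df.to_parquet("data.parquet")  # columnar, compressed by default

# Read back: read_parquet is typically much faster at this size,
# and the Parquet file on disk is considerably smaller.
df_csv = pd.read_csv("data.csv")
df_pq = pd.read_parquet("data.parquet")
```

The exact gap depends on dtypes and compression settings, but for numeric data at this scale Parquet usually wins clearly on both file size and read time.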

