March 22, 2024, 1:23 a.m. | /u/Whole-Watch-7980

Machine Learning www.reddit.com

What databases for quick query and storage on large datasets?

I have a 5-million-row CSV file that is about 1 GB of text. I also have 4 other CSV files of about the same size that I eventually need to combine. However, reading the CSV files into pandas and processing them is slow. What database options would you use for machine learning projects on a dataset like this?

I basically have 20 million rows …
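
For a concrete picture of one possible answer, here is a minimal sketch that streams several CSV files into a single SQLite table in batches, so that no file has to sit in memory at once. SQLite is only an assumption here (the post does not name any database), and the function name `load_csvs_into_sqlite`, the table name `rows`, and the batch size are all hypothetical. It also assumes every file has the same header row and stores every column as text.

```python
import csv
import sqlite3
from itertools import islice


def load_csvs_into_sqlite(csv_paths, db_path, table="rows", batch_size=50_000):
    """Stream CSV files that share one header into a single SQLite table.

    Hypothetical helper: all columns are stored as TEXT, and every row is
    assumed to have as many fields as the header.
    Returns the total number of rows in the table afterwards.
    """
    conn = sqlite3.connect(db_path)
    try:
        columns = None
        insert_sql = None
        for path in csv_paths:
            with open(path, newline="", encoding="utf-8") as f:
                reader = csv.reader(f)
                header = next(reader)
                if columns is None:
                    # First file: create the table from its header row.
                    columns = header
                    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
                    conn.execute(
                        f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})'
                    )
                    placeholders = ", ".join("?" for _ in columns)
                    insert_sql = f'INSERT INTO "{table}" VALUES ({placeholders})'
                elif header != columns:
                    raise ValueError(f"{path} has a different header: {header}")
                # Insert in fixed-size batches to keep memory use bounded.
                while True:
                    batch = list(islice(reader, batch_size))
                    if not batch:
                        break
                    conn.executemany(insert_sql, batch)
                    conn.commit()
        if columns is None:
            return 0  # no input files were given
        return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    finally:
        conn.close()
```

Once loaded, subsets can be pulled back into pandas with `pandas.read_sql_query` instead of re-reading the full CSV files, and an index on frequently filtered columns (`CREATE INDEX`) would likely speed up queries. Whether SQLite or some other store fits best depends on the workload.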
