March 10, 2024, 8:35 p.m. | Pedro H Goncalves

DEV Community dev.to

Recently, I've been designing a data lake to store different types of data from various sources, serving the demands of different areas and levels. To determine the best file type for storing this data, I compiled a list of criteria based on those demands. These criteria include:

Tool Compatibility


Tool compatibility refers to which tools can read and write a specific file type. No-code and low-code tools are crucial here, especially when tools like Excel/LibreOffice play a significant …
