March 23, 2024, 5:39 p.m. | danial.shabbir

DEV Community dev.to




Overview


Our primary goal is to convert the raw dataset into structured dimension and fact tables that support efficient analysis and modelling. This process involves cleaning the data and creating dimension tables for attributes such as date-time, passenger count, trip distance, payment type, and more.
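To make the idea concrete, here is a minimal pandas sketch of splitting a raw trip table into dimension tables and a fact table. The column names (`pickup_datetime`, `payment_type`, `fare_amount`, and so on), the sample rows, and the surrogate key names are assumptions made up for illustration; they are not taken from the actual dataset.

```python
import pandas as pd

# Hypothetical raw trip records (column names and values are illustrative only).
raw = pd.DataFrame({
    "pickup_datetime": ["2024-01-01 08:15:00", "2024-01-01 08:15:00",
                        "2024-01-02 17:40:00", None],
    "passenger_count": [1, 1, 2, 3],
    "trip_distance": [2.5, 2.5, 7.1, 1.0],
    "payment_type": ["card", "card", "cash", "card"],
    "fare_amount": [12.0, 12.0, 25.5, 6.0],
})

# Cleaning: drop exact duplicate rows and rows with no pickup time.
clean = (
    raw.drop_duplicates()
       .dropna(subset=["pickup_datetime"])
       .reset_index(drop=True)
)
clean["pickup_datetime"] = pd.to_datetime(clean["pickup_datetime"])

# Dimension table: one row per distinct payment type, with a surrogate key.
payment_dim = clean[["payment_type"]].drop_duplicates().reset_index(drop=True)
payment_dim["payment_type_id"] = payment_dim.index

# Dimension table: one row per distinct pickup time, plus derived attributes.
datetime_dim = clean[["pickup_datetime"]].drop_duplicates().reset_index(drop=True)
datetime_dim["pickup_hour"] = datetime_dim["pickup_datetime"].dt.hour
datetime_dim["pickup_weekday"] = datetime_dim["pickup_datetime"].dt.day_name()
datetime_dim["datetime_id"] = datetime_dim.index

# Fact table: one row per trip, holding the measures and keys into the dimensions.
fact = (
    clean.merge(payment_dim, on="payment_type")
         .merge(datetime_dim, on="pickup_datetime")
         [["datetime_id", "payment_type_id",
           "passenger_count", "trip_distance", "fare_amount"]]
)

print(fact)
```

In this sketch the fact table keeps only numeric measures and integer keys, while descriptive attributes such as the weekday name live in the dimension tables and are joined back in when an analysis needs them.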


We will touch on the fundamentals of data modelling, granularity, and basic data engineering terminology in simple, human-friendly terms.


Additionally, we’ll explore automation using Python libraries such as pandas and DuckDB, and highlight the advantages of the …
