March 23, 2024, 5:39 p.m. | danial.shabbir

DEV Community dev.to




Overview


Our primary goal is to convert the raw dataset into structured Dimension and Fact tables, allowing for efficient analysis and modelling. This process involves data cleaning and creating specific dimension tables covering attributes like date-time, passenger count, trip distance, payment types, and more.
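As a minimal sketch of that idea, the snippet below splits a tiny, made-up trip dataset into two dimension tables and one fact table using pandas. The column names, sample values, and surrogate-key scheme are all assumptions for illustration, not the article's actual dataset or code, and only two of the dimensions mentioned above are shown.

```python
import pandas as pd

# Hypothetical raw trip records; column names and values are illustrative assumptions.
raw = pd.DataFrame(
    {
        "pickup_datetime": [
            "2024-01-01 08:15:00",
            "2024-01-01 09:40:00",
            "2024-01-01 09:40:00",  # exact duplicate of the previous row
            "2024-01-02 18:05:00",
        ],
        "passenger_count": [1, 2, 2, 1],
        "trip_distance": [2.5, 7.1, 7.1, 0.9],
        "payment_type": ["card", "cash", "cash", "card"],
        "fare_amount": [12.0, 25.5, 25.5, 6.0],
    }
)

# Cleaning: drop exact duplicate rows and parse the timestamp column.
trips = raw.drop_duplicates().reset_index(drop=True)
trips["pickup_datetime"] = pd.to_datetime(trips["pickup_datetime"])

# Dimension: one row per distinct payment type, with a surrogate key.
dim_payment = (
    trips[["payment_type"]]
    .drop_duplicates()
    .sort_values("payment_type")
    .reset_index(drop=True)
)
dim_payment["payment_type_id"] = dim_payment.index + 1

# Dimension: one row per distinct pickup timestamp, with derived attributes.
dim_datetime = (
    trips[["pickup_datetime"]]
    .drop_duplicates()
    .sort_values("pickup_datetime")
    .reset_index(drop=True)
)
dim_datetime["datetime_id"] = dim_datetime.index + 1
dim_datetime["pickup_hour"] = dim_datetime["pickup_datetime"].dt.hour
dim_datetime["pickup_weekday"] = dim_datetime["pickup_datetime"].dt.day_name()

# Fact: one row per trip (the grain), holding measures plus keys into the dimensions.
fact_trips = (
    trips.merge(dim_payment, on="payment_type")
    .merge(dim_datetime[["pickup_datetime", "datetime_id"]], on="pickup_datetime")
    [["datetime_id", "payment_type_id", "passenger_count", "trip_distance", "fare_amount"]]
)

print(fact_trips)
```

The fact table keeps only numeric measures and foreign keys, so descriptive attributes such as the weekday live once in a dimension rather than being repeated on every trip row.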


We will touch on the fundamentals of data modelling, granularity, and basic data engineering terminology in simple, human-friendly terms.


Additionally, we’ll explore automation using Python libraries such as pandas and DuckDB, and highlight the advantages of the …

