May 21, 2023, 3:18 a.m. | Roger Chi

DEV Community dev.to

In late 2022, Step Functions released a feature called Distributed Map, which lets the service coordinate parallel processing over huge datasets (millions of objects). The announcement blog post has the details: https://aws.amazon.com/blogs/aws/step-functions-distributed-map-a-serverless-solution-for-large-scale-parallel-data-processing/


One of the input formats the Distributed Map state accepts is a .csv file in an S3 bucket. This makes it possible to use the optimized Athena integration for Step Functions to generate the input .csv file that can …
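
To make the idea concrete, here is a minimal sketch of what such a state machine definition might look like, built as a Python dictionary and serialized to Amazon States Language JSON. It is not the post's own definition: the state names, the `build_definition` helper, the bucket, the prefix, the query, and the `Pass` placeholder for per-row work are all hypothetical. It also assumes the Athena workgroup does not override the client-side output location, so the result lands at `<prefix>/<QueryExecutionId>.csv`.

```python
import json


def build_definition(bucket: str, prefix: str, query: str) -> dict:
    """Sketch: run an Athena query, then fan out over its CSV result.

    `prefix` is assumed to have no leading or trailing slash.
    """
    return {
        "StartAt": "RunAthenaQuery",
        "States": {
            # Optimized Athena integration; .sync waits for the query to finish.
            "RunAthenaQuery": {
                "Type": "Task",
                "Resource": "arn:aws:states:::athena:startQueryExecution.sync",
                "Parameters": {
                    "QueryString": query,
                    "WorkGroup": "primary",  # assumption: default workgroup
                    "ResultConfiguration": {
                        "OutputLocation": f"s3://{bucket}/{prefix}/"
                    },
                },
                "Next": "ProcessRows",
            },
            # Distributed Map that reads the query's CSV output from S3.
            "ProcessRows": {
                "Type": "Map",
                "ItemReader": {
                    "Resource": "arn:aws:states:::s3:getObject",
                    "ReaderConfig": {
                        "InputType": "CSV",
                        "CSVHeaderLocation": "FIRST_ROW",
                    },
                    "Parameters": {
                        "Bucket": bucket,
                        # Rebuild the object key from the query execution ID.
                        "Key.$": (
                            f"States.Format('{prefix}/{{}}.csv', "
                            "$.QueryExecution.QueryExecutionId)"
                        ),
                    },
                },
                "ItemProcessor": {
                    "ProcessorConfig": {
                        "Mode": "DISTRIBUTED",
                        "ExecutionType": "STANDARD",
                    },
                    "StartAt": "HandleRow",
                    # Placeholder: a real workflow would do per-row work here.
                    "States": {"HandleRow": {"Type": "Pass", "End": True}},
                },
                "MaxConcurrency": 100,
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    definition = build_definition(
        "my-bucket", "athena-results", "SELECT id FROM my_table"
    )
    print(json.dumps(definition, indent=2))
```

The printed JSON could be pasted into the Step Functions console or passed to whatever deployment tooling is already in use. The execution role would also need permissions for Athena, the S3 results location, and starting child executions, which this sketch does not cover.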

