Jan. 27, 2022, 3:30 p.m. | Jean Yves

Towards Data Science - Medium towardsdatascience.com

Achieve up to 65% performance gain using the latest S3 magic committer from Spark 3.2 and Hadoop 3.3!

Most Apache Spark users overlook the choice of an S3 committer (the protocol Spark uses to write output results to S3) because it is complex and poorly documented. Yet this choice has a major impact on performance whenever Spark writes data to S3. Since, for AWS users, a large portion of a Spark job's runtime is spent writing to S3, …
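As a rough illustration of what choosing a committer looks like in practice, the sketch below collects the settings that the Spark cloud-integration documentation lists for the S3A magic committer and renders them as `spark-submit` flags. It is not taken from the article itself. The helper names `magic_committer_conf` and `to_spark_submit_args` are hypothetical, and the sketch assumes the `spark-hadoop-cloud` module and a Hadoop 3.3 `hadoop-aws` build are on the classpath.

```python
from typing import Dict, List


def magic_committer_conf() -> Dict[str, str]:
    """Return Spark settings that select the S3A magic committer.

    Hypothetical helper: the keys and class names follow the Spark
    cloud-integration docs, but verify them against your Spark/Hadoop build.
    """
    return {
        # Ask the S3A connector to use the "magic" committer.
        "spark.hadoop.fs.s3a.committer.name": "magic",
        "spark.hadoop.fs.s3a.committer.magic.enabled": "true",
        # Route s3a:// output through the S3A committer factory.
        "spark.hadoop.mapreduce.outputcommitter.factory.scheme.s3a":
            "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
        # Bind Spark SQL's commit protocol to Hadoop path output committers
        # (these classes ship in the spark-hadoop-cloud module).
        "spark.sql.sources.commitProtocolClass":
            "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol",
        "spark.sql.parquet.output.committer.class":
            "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter",
    }


def to_spark_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render a settings dict as a flat list of `--conf key=value` arguments."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Print the flags so they can be pasted into a spark-submit command line.
    print(" ".join(to_spark_submit_args(magic_committer_conf())))
```

The same dictionary could instead be applied key by key through `SparkSession.builder.config(key, value)`; whether the gain quoted in the headline materializes depends on the workload.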

apache, apache spark, data engineering, hadoop, performance, s3, spark
