July 17, 2023, 8 a.m. | Rafal Gancarz

InfoQ - AI, ML & Data Engineering www.infoq.com

Yelp built a solution that uses its data streaming architecture to sanitize data from a corrupted Apache Cassandra cluster. The team explored many options for addressing the data corruption but ultimately had to move the data into a new cluster, removing the corrupted records in the process.
