March 24, 2022, 3:47 p.m. | qazmkop

DEV Community dev.to

As big data matures, the data lake era is arriving, and engineers with the relevant skills are in short supply. More and more data engineers and data lake projects are coming into public view. Many of these projects are open source, but not every open-source product is worth trying. Let's look at some open-source data lake projects that are great, and in some cases even better than paid products.


1. Hudi

Hudi is an open-source project providing tables, transactions, efficient upserts/deletes, advanced indexes, streaming ingestion services, data clustering/compaction optimizations, and …

