Write-Audit-Publish for Data Lakes in Pure Python (no JVM)
Towards Data Science - Medium towardsdatascience.com
An open source implementation of WAP using Apache Iceberg, Lambdas, and Project Nessie, all running entirely in Python
Look Ma: no JVM! Photo by Zac Ong on Unsplash
Introduction
In this blog post, we provide a no-nonsense reference implementation for Write-Audit-Publish (WAP) patterns on a data lake, using Apache Iceberg as an open table format and Project Nessie as a data catalog supporting git-like semantics.
We chose Nessie because its branching capabilities provide a good abstraction to implement a WAP design. …
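To make the link between branching and WAP concrete, here is a minimal sketch of the pattern, assuming a toy in-memory catalog. `ToyCatalog`, `audit`, `write_audit_publish`, the `etl-staging` branch name, and the "no null ids" rule are all hypothetical names chosen for illustration; this is not the Nessie or Iceberg API, and not the implementation from the post.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """In-memory stand-in for a branching catalog (hypothetical, not the Nessie API)."""
    branches: dict = field(default_factory=lambda: {"main": {}})

    def create_branch(self, name, source="main"):
        # A new branch starts as a copy of the source branch's table contents.
        self.branches[name] = dict(self.branches[source])

    def write(self, branch, table, rows):
        # Write: the new data is visible only on the given branch.
        self.branches[branch][table] = list(rows)

    def merge(self, branch, target="main"):
        # Publish: make the branch's tables visible on the target branch.
        self.branches[target].update(self.branches[branch])

    def drop_branch(self, name):
        del self.branches[name]


def audit(rows):
    # Audit: an assumed quality rule, here "every row has a non-null id".
    return all(row.get("id") is not None for row in rows)


def write_audit_publish(catalog, table, rows, branch="etl-staging"):
    catalog.create_branch(branch)            # isolate the change from main
    catalog.write(branch, table, rows)       # write
    passed = audit(catalog.branches[branch][table])  # audit
    if passed:
        catalog.merge(branch)                # publish only if the audit passes
    catalog.drop_branch(branch)              # clean up the staging branch
    return passed


catalog = ToyCatalog()
good = write_audit_publish(catalog, "trips", [{"id": 1}, {"id": 2}])
bad = write_audit_publish(catalog, "trips", [{"id": None}])
```

In this sketch the first run passes the audit and is merged into `main`, while the second fails and never touches `main`, so downstream readers only ever see audited data.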