
Ryan Blue

Photo of Ryan Blue

Ryan Blue is the co-creator and PMC chair of Apache Iceberg and co-founder of Tabular. He is a member of the Apache Software Foundation and a PMC member of Apache Parquet and Avro. He loves building things.

Author's posts

November 7, 2023

Iceberg and Hudi ACID Guarantees

In this post, I make the case that Apache Iceberg is reliable and Apache Hudi is not. And the best way to do that is to contrast the design of both projects in the context of the ACID guarantees: atomicity, consistency, isolation, and durability. My aim is to accurately present the facts to educate the curious and help people make informed choices.

CDC Performance and Write Amplification
Meme - The Mandalorian: ACID Guarantees—this is the way.

October 5, 2023

Apache Iceberg 1.4.0 is available!

Yesterday the Apache Iceberg community released version 1.4.0, packed with improvements. In this post, I’ll highlight a few of those improvements that will make life easier for data practitioners.

Photo by Ruedi Häberli on Unsplash

September 19, 2023

The Case for Independent Storage

Analytic databases are quietly going through an unprecedented transformation — one that will fundamentally change the industry for the better, by freeing data that’s being held hostage.

Change is coming
Meme: Two frames. In frame one, a player is holding up a card that says give up control of storage or draw 25. Frame two shows a player holding 25 cards with text above that reads all databases ever.

July 6, 2023

Iceberg in Modern Data Architecture

Recent announcements about new Apache Iceberg support from Snowflake, Google, and Databricks motivated this deeper look at what Iceberg is and why it is being built into the foundation of modern data architecture.

Close-up photo of ice overlaid with the Iceberg logo and blog post title.

June 23, 2023

Hello, World of CDC!

This post on using a change log table for CDC table mirroring with Apache Iceberg introduces change data capture (CDC), highlights some surprising benefits, and illustrates the trade-offs of different CDC designs.

Apache Iceberg icon

February 22, 2022

What's new in Iceberg 0.13

The Apache Iceberg community just released version 0.13, and there are a lot of great new additions! This post covers some of the highlights that make this a great release.

Iceberg 0.13
Apache Iceberg icon

November 10, 2021

Metadata Indexing in Iceberg

This post details how Iceberg’s metadata forms an index that Iceberg uses to scale to hundreds of petabytes in a single table and to quickly find matching data, even on a single node.

Metadata Indexing
An iceberg floating in the ocean illustrated with green points and lines in 3D.

September 21, 2021

Tabular and the Iceberg Community

To be successful, the Iceberg community needs to be collaborative, neutral, and independent. To ensure that it continues to be a healthy and successful community, we're creating an open source culture at Tabular that sets best practices for working in the community.

Iceberg Community
Two small patches of ice separated by water. A group of penguins on one side. One of them is jumping in the air to the other side.

August 29, 2021

Introducing Tabular

We think data engineers, analysts, and scientists are most effective when they can focus on building reliable data products and answering questions, not fighting infrastructure. We’re creating a new data platform that automates maintenance and optimization to protect people from tedious concerns, while enabling them to work with the same data simultaneously across Spark, Trino, Flink, and other engines. The core of Tabular is the open source standard for huge analytic tables, Apache Iceberg.

Introducing Tabular
A huddled group of penguins in a snowscape with a partly cloudy sky.