- Partitioning for Correctness (and Performance)
Partition design is a critical part of data modeling. Unfortunately, given the constraints of most Hive-based tables, data engineers (myself…
- Multiple Engines, Single Catalog – The Impact of Adopting an Open Table Format in a Data-Driven Organization
Many powerful and polished compute engines are available today, each with its target use case. A typical organization today has…
- Table Maintenance: The Key To Keeping Your Iceberg Tables Healthy and Performant
Tables at scale have always required a disciplined approach to maintenance. Skilled data engineers have learned best practices to optimize…
- Iceberg’s Guiding Light: The Iceberg Open Table Format Specification
If you’ve worked with Iceberg tables, you may have come across the table property format-version and wondered what the difference is between…