August 2022 – Iceberg Community News

August 31, 2022

Over the past month, new features have been added across a broad span of the Iceberg project. In this monthly roundup, we cover a few of these big additions and what they mean for Iceberg users.

Flink

Starting with Flink, the new FLIP-27 reader is now supported in SQL and can be enabled through an opt-in configuration.

Iceberg Core

Much progress has been made on implementing the new branching and tagging specification. New additions from the Iceberg community make it possible to perform snapshot operations using a branch reference instead of a snapshot ID.
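To illustrate the idea of addressing a snapshot by name rather than by ID, here is a minimal toy model in Python. It is not the Iceberg API: the `SnapshotRef` class, the `resolve_snapshot_id` helper, and the snapshot IDs are all hypothetical and exist only to show how a named reference can stand in for a raw snapshot ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotRef:
    """Hypothetical stand-in for a named reference to a snapshot."""
    snapshot_id: int
    ref_type: str  # "branch" or "tag"


def resolve_snapshot_id(refs, ref_name):
    """Look up the snapshot ID that a named reference points to."""
    try:
        return refs[ref_name].snapshot_id
    except KeyError:
        raise ValueError(f"Unknown reference: {ref_name}") from None


# Made-up references and snapshot IDs, for illustration only.
refs = {
    "main": SnapshotRef(snapshot_id=1003, ref_type="branch"),
    "audit": SnapshotRef(snapshot_id=1002, ref_type="branch"),
    "end-of-july": SnapshotRef(snapshot_id=1001, ref_type="tag"),
}

# An operation can now take a name such as "audit" instead of the ID itself.
audit_snapshot = resolve_snapshot_id(refs, "audit")
print(audit_snapshot)
```

The point of the indirection is that callers no longer need to track snapshot IDs themselves: a branch name keeps working as the branch moves forward to newer snapshots.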

Another recently added feature is support for table scan reporting. Metrics can now be collected during table scans to give insight into how a scan was planned and executed. The initial version collects only some basic metrics, but it provides a framework for the community to build more advanced metrics and easily include them in scan reports.

Spark

Some big news around the Iceberg Spark runtime is the addition of a new function catalog. This makes it possible to use functions like iceberg_version() without having to register them as UDFs. Functions like bucket and truncate have also been added to the function catalog and are available to use in Spark with any Iceberg catalog.
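For readers unfamiliar with what truncate computes, the following is a small pure-Python sketch of the truncate transform as described in the Iceberg table spec for integers and strings. It is not the Spark function itself; the names `truncate_int` and `truncate_str` are made up for this example, and the width-first argument order is an assumption chosen for readability.

```python
def truncate_int(width, value):
    """Round an integer down to the nearest multiple of width.

    Python's % already returns a non-negative remainder for a positive
    width, so negative values round toward negative infinity.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    return value - (value % width)


def truncate_str(length, value):
    """Keep at most the first `length` characters (code points)."""
    if length <= 0:
        raise ValueError("length must be positive")
    return value[:length]


print(truncate_int(10, 1))        # 0
print(truncate_int(10, -1))       # -10
print(truncate_int(10, 37))       # 30
print(truncate_str(3, "iceberg"))  # ice
```

Values that truncate to the same result, such as 30 through 39 with a width of 10, end up grouped together, which is what makes the transform useful for partitioning.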

Python

The community continues to see rapid progress on the new pyiceberg library, and the latest additions include a new CLI. The CLI lets you connect to an Iceberg catalog and perform various catalog operations like listing, creating, and renaming Iceberg tables. As we head full speed towards the first official release of the new pyiceberg library, many more features will be exposed through the CLI, making it easier than ever to manage tables in an Iceberg catalog!

New Engine Support

Earlier this month, Bodo, a next-generation SQL and Python data processing platform, announced an alpha version of a Bodo-Iceberg connector as well as a roadmap for fully supporting Iceberg’s growing set of features. The Bodo platform is capable of massively parallel I/O and compute and pairs very well with the design of Iceberg tables. You can learn more by checking out their announcement post, Bodo & Iceberg: the Simple and Fast Open Data Warehouse of the Future.

Join Us

If the work that’s happening in the Iceberg community has you pumped, head to the Community page and see all of the ways you can join in on the fun. The community welcomes contributors of thoughts, ideas, and code from all backgrounds!
