A Tabular newsletter revisiting last month in Iceberg
Love Apache Iceberg? Spread the word by giving it a star on the apache/iceberg repo!
Iceberg Summit – May 14-15, a virtual conference
Announcing the first Iceberg Summit, organized by Tabular and Dremio, and sanctioned by the Apache Software Foundation. The event will include dozens of technical talks covering real-world experiences of data practitioners and developers working with Apache Iceberg as their table format.
Speak at the Summit: the CFP is open for one more week, so submit your Iceberg talk proposal by April 12.
Register: it’s free, so sign up to attend now.
Project updates
New Committers
Congratulations to these new Apache Iceberg committers!
- Bryan Keller from Netflix
- Renjie Liu from RisingWave Labs
Iceberg Java
- New template and instructions for proposing spec changes or major feature improvements. This template will help keep Iceberg improvement proposals more consistent, making them easier to evaluate and track.
- Added support for Avro data encryption. Encrypted Avro manifests can now be read and written.
- Improved support for function pushdown of row-level commands in Spark.
- Added support for output spec selection in rewrite data files. You can now specify the output partition spec when rewriting data files.
- Release 1.5 is out, with view support and a host of other great features and fixes
- Enable metadata encryption with EncryptingFileIO.
- Track partition statistics with TableMetadata.
- Added View support for REST and JDBC catalogs.
- Added support for delete manifest rewrites in Spark
- Support for Flink version 1.18
- Support for reading INT96 columns in the row group filter
PyIceberg, iceberg-go, and iceberg-rust
- PyIceberg is working towards the 0.6.1 patch release.
- Iceberg-rust is going at lightspeed. The upcoming 0.3.0 release will include query planning.
Bergy Blogs
- Iceberg + dbt + Trino + Hive: modern, open-source data stack
- Mirroring data (CDC) in Apache Iceberg with Debezium and Kafka Connect.
- Four reasons to choose the Iceberg REST catalog
- The essentials of Apache Iceberg
- Iceberg data with AWS Glue Catalog from Flink Job
- Skiing with Snowflake – experiments with Snowflake, Spark, Iceberg, Glue, and Project Nessie
- Graph Queries on Apache Iceberg tables?
- Hidden partitioning in Apache Iceberg
- Iceberg and the rise of the lakehouse – “Apache Iceberg won’t just change data storage. It’s going to change how data works in every industry.”
Podcasts / Videos
- Iceberg Community Sync (3/6/24)
- Iceberg at Netflix and beyond – Ryan Blue on Software Engineering Daily
- Bergy Bits: Scaling PyIceberg goes Daft
- Bergy Bits: Graphs, no not the Excel kind
- What is Apache Iceberg? | IBM
- Apache Iceberg, Un nouveau Standard? (French)
- Open Table Formats Reshaping the Data Industry: A Deep Dive with Ryan Blue
- Integrate Iceberg REST Catalog Specification with Spark and Trino
- The Quiet Revolution in Data Architecture
- Reduce BigQuery costs with Iceberg optimized tables.
- Iceberg Community Sync (3/27/24)
- Best practices for implementing Apache Iceberg (playlist)
- Implementing ingestion for Apache Iceberg
Ecosystem Updates
- OpenHouse, an open source lakehouse control plane from LinkedIn
- Deep Dive into modern data formats
- dbt pizza shop demo – Modelling on Amazon Athena.
- Guide to migrating from Databricks Delta Lake to Apache Iceberg.
- Confluent simplifies integration between Kafka stream processing and Iceberg storage.
- Apache Iceberg makes headlines at Kafka Summit London.
- Emerging Data Engineering Trends You Should Check Out.
- How to create Iceberg tables in Snowflake.
Vendor Updates
- Cloudera announces an open data lakehouse with Apache Iceberg for public and private clouds.
- PuppyGraph launches its graph query engine – Execute graph queries on Iceberg tables.
- Optimizing updates to Apache Iceberg tables using DataFlint
- A comparison of Hudi and Iceberg from Starburst
Iceberg Resources
Get Started with Apache Iceberg.
Get cookin’ with recipes from the Apache Iceberg Cookbook.
Learn more about Apache Iceberg on the official Apache site.
Watch and subscribe to the Iceberg YouTube Channel.
Read up on some community blog posts.
Contribute to Iceberg.
`SELECT * FROM you JOIN iceberg_community`.
Subscribe to the Apache Iceberg mailing list.