A Tabular newsletter revisiting the last month in Apache Iceberg
Apache Iceberg? Spread the word by giving it a star on the apache/iceberg repo!
Project updates
Iceberg Java
- Added encryption support for Parquet and Avro data.
- Added Spark support for reading, dropping, and renaming views.
- Initial project setup for bringing the Kafka Connect Iceberg Sink Connector into the Apache Iceberg project.
- Added Spark support for controlling delete file granularity. Now delete files can be created at the partition or file level.
- Removed support for Spark 3.2.
- Parallelized file footer reads in add_files procedure.
- Updated the new docs in preparation for moving to /site and /docs directories in the main Iceberg repo. This will make it easier to contribute to and release the Iceberg docs.
PyIceberg, iceberg-go, and iceberg-rust
- Exciting progress in PyIceberg, including:
  - The news we’ve all been waiting for… Write support has been added to PyIceberg! This opens the door to tons of new use cases!
  - Added commit support for the SQL catalog. This enables using a local catalog, making it much easier to run locally or in a notebook.
  - Also added commit support for Hive and Glue.
  - Name mapping support will allow the reading of Parquet files without field IDs.
- An expression system was added to iceberg-rust, paving the way for more expressions to be added.
- The iceberg-rust project also gained basic file scan planning.
- The Rust API now has documentation available at https://rust.iceberg.apache.org.
Bergy Blogs
- Streaming Event Data to Iceberg with Kafka Connect. An introduction to the Iceberg Sink connector for Kafka Connect.
- Streaming CDC data from MySQL to Apache Iceberg with Hive Metastore using Apache Flink.
- How not to use Apache Iceberg! Great advice for getting the most out of Iceberg.
- Migrate existing Amazon S3 data lake tables to Snowflake Unmanaged Iceberg tables using AWS Glue.
- Why You Should Use Apache Iceberg with PySpark.
- 2024 Lakehouse Format Rundown. Good analysis of the current state of table formats.
- Learning Apache Iceberg — storing the data to MinIO S3. The third installment of a series on learning Iceberg.
Ecosystem Updates
- Breaking the Ice, Apache Iceberg Meetup in Tel Aviv, 11 March 2024.
- Netflix Creates Incremental Processing Solution Using Maestro and Apache Iceberg.
- Integrating Databricks with Snowflake-managed Iceberg Tables.
- Apache Iceberg vs Parquet – File Formats vs Table Formats.
- Use Amazon Athena with Spark SQL for your open-source transactional table formats.
- Used Iceberg tables in AWS Athena to generate daily snapshot data.
Vendor Updates
Iceberg Resources
Get Started with Apache Iceberg.
Get cookin’ with recipes from the Apache Iceberg Cookbook.
Learn more about Apache Iceberg on the official Apache site.
Watch and subscribe to the Iceberg YouTube Channel.
Read up on some community blog posts.
Contribute to Iceberg.
`SELECT * FROM you JOIN iceberg_community`.
Subscribe to the Apache Iceberg mailing list.