A Tabular newsletter revisiting the last month in Apache Iceberg
❤️ Apache Iceberg? Spread the word by giving it a ⭐ on the apache/iceberg repo!
Project updates
Iceberg Java
- Added encryption support for Parquet and Avro data.
- Added Spark support for reading, dropping, and renaming views.
- Initial project setup for bringing the Kafka Connect Iceberg Sink Connector into the Apache Iceberg project.
- Added Spark support for controlling delete file granularity. Now delete files can be created at the partition or file level.
- Removed support for Spark 3.2.
- Parallelized file footer reads in the `add_files` procedure.
- Updated the new docs in preparation for moving to /site and /docs directories in the main Iceberg repo. This will make it easier to contribute to and release the Iceberg docs.
PyIceberg, iceberg-go, and iceberg-rust
- Exciting progress in PyIceberg, including:
  - The news we’ve all been waiting for… Write support has been added to PyIceberg! This opens the door to tons of new use cases!
  - Added commit support for the SQL catalog. This enables using a local catalog, making it much easier to run locally or in a notebook.
  - Also added commit support for the Hive and Glue catalogs.
  - Added name mapping support, which allows reading Parquet files that were written without field IDs.
- An expression system was added to iceberg-rust, paving the way for more expressions to be added.
- The iceberg-rust project also gained basic file scan planning.
- The Rust API now has documentation available at https://rust.iceberg.apache.org.
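For readers curious what name mapping does, here is a minimal sketch of the idea in plain Python. It is not the PyIceberg API: the helper functions `build_lookup` and `assign_field_ids` and the sample columns are hypothetical, and only the JSON shape (`field-id`, `names`, nested `fields`) follows the name mapping format described in the Iceberg table spec. The point is simply that a file written without field IDs can still be read by matching its column names to the IDs the table expects.

```python
import json

# Hypothetical name mapping, in the JSON shape the Iceberg spec describes:
# each entry has a field ID, one or more accepted names, and optional children.
NAME_MAPPING_JSON = """
[
  {"field-id": 1, "names": ["id", "record_id"]},
  {"field-id": 2, "names": ["data"]},
  {"field-id": 3, "names": ["location"], "fields": [
    {"field-id": 4, "names": ["latitude", "lat"]},
    {"field-id": 5, "names": ["longitude", "long"]}
  ]}
]
"""


def build_lookup(mapped_fields, prefix=()):
    """Flatten a nested name mapping into {column path tuple: field ID}."""
    lookup = {}
    for field in mapped_fields:
        for name in field["names"]:
            path = prefix + (name,)
            lookup[path] = field["field-id"]
            # Recurse into nested fields (for example, struct members).
            lookup.update(build_lookup(field.get("fields", []), path))
    return lookup


def assign_field_ids(column_paths, lookup):
    """Map dotted column paths to field IDs; unmapped columns get None."""
    return {path: lookup.get(tuple(path.split("."))) for path in column_paths}


lookup = build_lookup(json.loads(NAME_MAPPING_JSON))

# Columns as they might appear in a file that carries no field IDs.
file_columns = ["record_id", "data", "location.lat", "extra"]
resolved = assign_field_ids(file_columns, lookup)

for column, field_id in resolved.items():
    print(f"{column} -> {field_id}")
```

In this sketch a column with no entry in the mapping, such as `extra`, resolves to `None`, so a reader can skip it rather than guess at an ID.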
Bergy Blogs
- Streaming Event Data to Iceberg with Kafka Connect. An introduction to the Iceberg Sink connector for Kafka Connect.
- Streaming CDC data from MySql to Apache Iceberg with Hive Metastore using Apache Flink.
- How not to use Apache Iceberg! Great advice for getting the most out of Iceberg.
- Migrate existing Amazon S3 data lake tables to Snowflake Unmanaged Iceberg tables using AWS Glue.
- Why You Should Use Apache Iceberg with PySpark.
- 2024 Lakehouse Format Rundown. Good analysis of the current state of table formats.
- Learning Apache Iceberg — storing the data to Minio S3. The third installment of a series on learning Iceberg.
Ecosystem Updates
- Breaking the Ice, Apache Iceberg Meetup in Tel Aviv, 11 March 2024.
- Netflix Creates Incremental Processing Solution Using Maestro and Apache Iceberg.
- Integrating Databricks with Snowflake-managed Iceberg Tables.
- Apache Iceberg vs Parquet – File Formats vs Table Formats.
- Use Amazon Athena with Spark SQL for your open-source transactional table formats.
- Used Iceberg tables in AWS Athena to generate daily snapshot data.
Vendor Updates
Iceberg Resources
🏁 Get Started with Apache Iceberg.
🧑‍🍳 Get cookin’ with recipes from the Apache Iceberg Cookbook.
👩‍🏫 Learn more about Apache Iceberg on the official Apache site.
📺 Watch and subscribe to the Iceberg YouTube Channel.
📰 Read up on some community blog posts.
🫴🏾 Contribute to Iceberg.
👥 `SELECT * FROM you JOIN iceberg_community`.
📬 Subscribe to the Apache Iceberg mailing list.