A Tabular newsletter revisiting last month in Iceberg
❤️Apache Iceberg? Spread the word by giving it a ⭐ on the apache/iceberg repo!
Iceberg Summit – May 14-15, a virtual conference
The 1st Iceberg Summit is NEXT WEEK! The event will include over 30 technical sessions covering real-world experiences of data practitioners and developers working with Apache Iceberg as their table format.
Click here to see the full agenda:
Notable speakers include:
Russell Spitzer – Iceberg Committer, Engineering Manager, Apple
Ben Dilday – Senior Software Engineer, Bloomberg
Hao Lin – Senior Software Development Engineer, ByteDance
Sreyashi Das – Senior Data Engineer, Netflix
Alvaro Santos Andres – Data Solution Architect, Syngenta Group
Walaa Edlin Moustafa – Senior Staff Software Engineer, LinkedIn
Register and select your sessions: it's free and online, so sign up to attend now.
Project updates
Iceberg Java
- Improvements to the REST catalog’s HTTP client:
- Improved the default implementation of new FileIO methods to prevent extra calls to the object store.
PyIceberg, iceberg-go, and iceberg-rust
- PyIceberg:
- Support for writing to partitioned tables that use identity partitioning.
- Added the partitions and refs metadata tables. Along with the metadata tables already added and those coming soon, this will open PyIceberg to more maintenance and integration use cases.
- PyIceberg Pyodide integration: this lets people run PyIceberg in the browser via WASM, with no install required, to learn about Iceberg and table formats (pyodide/pyodide#4644, pyodide/pyodide#4648). We’re still waiting on the PyArrow-Pyodide integration.
- Rust: Implemented projection to perform partition-based pruning.
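For readers new to the idea behind that last item, here is a minimal Python sketch of partition-based pruning via projection. It is not the iceberg-rust implementation: the names `DataFile`, `project_inclusive`, and `prune` are hypothetical, and it assumes an integer column partitioned with a truncate transform of width 10. A predicate on the source column is projected into a looser predicate on partition values, and files whose partition value cannot match are skipped.

```python
import operator
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file entry in a manifest."""
    path: str
    partition: int  # truncated value of the source column, e.g. 37 -> 30


def truncate(value: int, width: int) -> int:
    """Round an integer down to the nearest multiple of width."""
    return value - (value % width)


def project_inclusive(op: str, literal: int, width: int) -> Tuple[str, int]:
    """Project a row-level predicate onto partition values.

    The result is inclusive: any partition that could contain a matching
    row satisfies the projected predicate.
    """
    if op == "==":
        return ("==", truncate(literal, width))
    if op == ">=":
        return (">=", truncate(literal, width))
    if op == ">":  # for integers, x > n is the same as x >= n + 1
        return (">=", truncate(literal + 1, width))
    if op == "<=":
        return ("<=", truncate(literal, width))
    if op == "<":  # for integers, x < n is the same as x <= n - 1
        return ("<=", truncate(literal - 1, width))
    raise ValueError(f"unsupported operator: {op}")


_OPS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}


def prune(files: List[DataFile], op: str, literal: int, width: int = 10) -> List[DataFile]:
    """Keep only the files whose partition value might hold matching rows."""
    projected_op, projected_value = project_inclusive(op, literal, width)
    return [f for f in files if _OPS[projected_op](f.partition, projected_value)]


files = [DataFile(f"data-{p}.parquet", p) for p in (0, 10, 20, 30)]
kept = prune(files, ">", 19)  # only the partition 20 and 30 files survive
```

The surviving files may still contain rows that do not match, so the original predicate still has to be applied when the data is read; projection only decides which files can be skipped safely.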
Bergy Blogs
- Write-audit-publish for data lakes in pure Python (No JVM) – An open source implementation of WAP using Apache Iceberg, Lambdas, and Project Nessie, all running entirely in Python.
- How to ETL from web APIs into an open data lakehouse with Python, Iceberg, and Snowflake.
- A deep dive into Apache Iceberg and resources to learn more.
Podcasts / Videos
- Iceberg Community Sync (April 17)
- Deep dive into Apache Iceberg catalogs.
- How to accelerate Apache Iceberg queries.
- How Parquet files relate to table formats like Iceberg.
- Cutting storage and compute costs with Iceberg and Tabular.
- Creating Apache Iceberg tables using Starburst Galaxy.
Ecosystem Updates
- Iceberg Summit Unveils Speaker Lineup for Free, Virtual Event.
- Athena announces federated query pass-through.
- ASF announces Apache Hive 4.0 with Apache Iceberg support.
- Accelerating AI with an open, modern data lakehouse.
- Query Apache Kafka with SQL.
- Teradata embraces open table formats.
Vendor Updates
- Iceberg Tables: A new data source for Oracle Autonomous Database.
- Fully-managed Open Lakehouse Platform on Starburst Galaxy.
- What’s the difference between Iceberg and Delta Lake?
- Tableau and Databricks expand strategic partnership.
Iceberg Resources
🏁 Get Started with Apache Iceberg.
🧑‍🍳 Get cookin’ with recipes from the Apache Iceberg Cookbook.
👩‍🏫 Learn more about Apache Iceberg on the official Apache site.
📺 Watch and subscribe to the Iceberg YouTube Channel.
📰 Read up on some community blog posts.
🫴🏾 Contribute to Iceberg.
👥 `SELECT * FROM you JOIN iceberg_community`.
📬 Subscribe to the Apache Iceberg mailing list.