This recipe shows how to inspect Apache Iceberg table metadata with SQL queries.
Iceberg table metadata is available through special tables called, unsurprisingly, metadata tables. The most commonly used metadata tables show the levels in Iceberg’s metadata tree. These include:
snapshots– known versions of the table, with summary metadata
manifests– manifest files in the current snapshot; these track data and delete files
data_files– data files in the current snapshot
delete_files– delete files in the current snapshot; these store deleted keys or file offsets
You can use these metadata tables to inspect all metadata in an Iceberg table, including partition values, lower bounds, upper bounds, and counts used for scan planning.
To query a metadata table in Apache Spark, add the metadata table name as another part in the table identifier. For example, to show the list of known table versions for the
nyc_taxi_yellow table, run:
SELECT snapshot_id, committed_at, operation, summary['spark.app.id'
Note that you need to include the full namespace because the table identifier has multiple parts.
Another useful metadata table is the
history table. It shows a log of when the current table state changed and which snapshot (by ID) became the current version. It also shows whether each historical version is an ancestor of the table’s current snapshot (in the
is_current_ancestor column), or whether it was rolled back.
Metadata tables are just like normal tables. For example, you can run an aggregation on the
data_files table to see average file sizes by partition. Or you can join the
snapshots tables to see when the current state changed, along with summary information like what Spark app created the snapshot:
FROM examples.nyc_taxi_yellow.history h
JOIN examples.nyc_taxi_yellow.snapshots s
ON h.snapshot_id = s.snapshot_id
ORDER BY made_current_at
Engines other than Spark may use different naming conventions for metadata tables. For instance, Trino uses
$ to add the metadata table to a table name and it must be quoted: