-
Querying Table Metadata
BASICS This recipe shows how to inspect Apache Iceberg table metadata with SQL queries. Iceberg metadata Iceberg table metadata is…
-
Querying an Iceberg Table
BASICS This recipe demonstrates simple queries with Iceberg tables. Running queries in Apache Spark Spark supports two interfaces to query…
-
Connecting to Athena PySpark
GETTING STARTED Amazon Athena is a managed compute service that allows you to use SQL or PySpark to query data…
-
Connecting Amazon EMR Spark to an Apache Iceberg catalog
GETTING STARTED Amazon EMR is an easy way to deploy distributed data processing frameworks like Apache Spark, Apache Flink, Apache…
-
Configuring Trino
GETTING STARTED Trino is a popular open-source distributed SQL query engine that federates queries against data stored in the Hive Metastore,…
-
Configuring Apache Spark
GETTING STARTED Apache Spark provides comprehensive support for Apache Iceberg via both extended SQL syntax and stored procedures to manage…
-
Connecting to a REST Catalog
GETTING STARTED The Apache Iceberg REST catalog protocol is a standard API for interacting with any Iceberg catalog. The REST…
-
Catalogs and the REST catalog
GETTING STARTED Catalogs in Apache Iceberg The core responsibility of Iceberg is to manage a collection of files as a…
-
Why Apache Iceberg — for data warehouse users
Major data warehouse platforms such as Google BigQuery, Snowflake, AWS, and Databricks have all announced support for Apache Iceberg tables. Commercial warehouse engines seldom…
-
Why Apache Iceberg — for data lake users
INTRODUCTION If you have been working in a data lake, you’re probably very familiar with its drawbacks. You’re in luck:…
-
Data engineering with Apache Iceberg
DATA ENGINEERING Data engineers starting at Netflix attend (or used to, at least) a few hours of orientation to become…
-
Using Hidden Partitioning
DATA ENGINEERING This recipe shows how to use Apache Iceberg’s hidden partitioning to improve query performance while avoiding data quality…
-
Setting table write order
DATA ENGINEERING This recipe shows you how to set a table’s write order to instruct all writers — including background…
-
Using MERGE
DATA ENGINEERING One of the most useful tools that Iceberg enables is the SQL MERGE command. This recipe…
-
Creating Branches and Tags
DATA ENGINEERING This recipe shows how to create and manage tags and branches in an Apache Iceberg table. What are…