-
Creating a Table from CSV
BASICS This recipe demonstrates how to create Apache Iceberg tables from CSV files. This focuses on ensuring the schema for…
-
Creating a Table from Parquet
BASICS This recipe demonstrates how to create Apache Iceberg tables from Parquet. This focuses on ensuring the schema for the…
-
Time Travel Queries
BASICS This recipe demonstrates ways to query historical snapshots of Apache Iceberg tables. Time travel to query historical snapshots in…
-
Querying Table Metadata
BASICS This recipe shows how to inspect Apache Iceberg table metadata with SQL queries. Iceberg metadata: Iceberg table metadata is…
-
Querying an Iceberg Table
BASICS This recipe demonstrates simple queries with Iceberg tables. Running queries in Apache Spark: Spark supports two interfaces to query…
-
Connecting to Athena PySpark
GETTING STARTED Amazon Athena is a managed compute service that allows you to use SQL or PySpark to query data…
-
Connecting Amazon EMR Spark to an Apache Iceberg catalog
GETTING STARTED Amazon EMR is an easy way to deploy distributed data processing frameworks like Apache Spark, Apache Flink, Apache…
-
Configuring Trino
GETTING STARTED Trino is a popular open-source distributed SQL query engine that federates queries against data stored in the Hive Metastore,…
-
Configuring Apache Spark
GETTING STARTED Apache Spark provides comprehensive support for Apache Iceberg via both extended SQL syntax and stored procedures to manage…
-
Connecting to a REST Catalog
GETTING STARTED The Apache Iceberg REST catalog protocol is a standard API for interacting with any Iceberg catalog. The REST…
-
Catalogs and the REST catalog
GETTING STARTED Catalogs in Apache Iceberg: The core responsibility of Iceberg is to manage a collection of files as a…
-
Why Apache Iceberg — for data warehouse users
Major data warehouse platforms such as Google BigQuery, Snowflake, AWS, and Databricks have all announced support for Apache Iceberg tables. Commercial warehouse engines seldom…
-
Why Apache Iceberg — for data lake users
INTRODUCTION If you have been working in a data lake, you’re probably very familiar with its drawbacks. You’re in luck:…
-
Data engineering with Apache Iceberg
DATA ENGINEERING Data engineers starting at Netflix attend (or used to, at least) a few hours of orientation to become…
-
Using Hidden Partitioning
DATA ENGINEERING This recipe shows how to use Apache Iceberg’s hidden partitioning to improve query performance while avoiding data quality…