Apache Iceberg – Page 3

Querying Table Metadata

December 28, 2023

Categories: Apache Iceberg, How to

BASICS This recipe shows how to inspect Apache Iceberg table metadata with SQL queries. Iceberg metadata Iceberg table metadata is…
Querying an Iceberg Table

December 28, 2023

Categories: Apache Iceberg, How to

BASICS This recipe demonstrates simple queries with Iceberg tables. Running queries in Apache Spark Spark supports two interfaces to query…
Connecting to Athena PySpark

December 28, 2023

Categories: Apache Iceberg, How to

GETTING STARTED Amazon Athena is a managed compute service that allows you to use SQL or PySpark to query data…
Connecting Amazon EMR Spark to an Apache Iceberg catalog

December 28, 2023

Categories: Apache Iceberg, How to

GETTING STARTED Amazon EMR is an easy way to deploy distributed data processing frameworks like Apache Spark, Apache Flink, Apache…
Configuring Trino

December 28, 2023

Categories: Apache Iceberg, How to

GETTING STARTED Trino is a popular open-source distributed SQL query engine that federates queries against data stored in the Hive Metastore,…
Configuring Apache Spark

December 28, 2023

Categories: Apache Iceberg, Education

GETTING STARTED Apache Spark provides comprehensive support for Apache Iceberg via both extended SQL syntax and stored procedures to manage…
Connecting to a REST Catalog

December 28, 2023

Categories: Apache Iceberg, How to

GETTING STARTED The Apache Iceberg REST catalog protocol is a standard API for interacting with any Iceberg catalog. The REST…
Catalogs and the REST catalog

December 28, 2023

Categories: Apache Iceberg, Education

GETTING STARTED Catalogs in Apache Iceberg The core responsibility of Iceberg is to manage a collection of files as a…
Why Apache Iceberg — for data warehouse users

December 28, 2023

Categories: Apache Iceberg, Opinion

Major data warehouse platforms such as Google BigQuery, Snowflake, AWS, and Databricks have all announced support for Apache Iceberg tables. Commercial warehouse engines seldom…
Why Apache Iceberg — for data lake users

December 28, 2023

Categories: Apache Iceberg, Opinion

INTRODUCTION If you have been working in a data lake, you’re probably very familiar with its drawbacks. You’re in luck:…
Data engineering with Apache Iceberg

December 27, 2023

Categories: Apache Iceberg, How to

DATA ENGINEERING Data engineers starting at Netflix attend (or used to, at least) a few hours of orientation to become…
Using Hidden Partitioning

December 27, 2023

Categories: Apache Iceberg, How to

DATA ENGINEERING This recipe shows how to use Apache Iceberg’s hidden partitioning to improve query performance while avoiding data quality…
Setting table write order

December 27, 2023

Categories: Apache Iceberg, How to

DATA ENGINEERING This recipe shows you how to set a table’s write order to instruct all writers — including background…
Using MERGE

December 27, 2023

Categories: Apache Iceberg, How to

DATA ENGINEERING One of the most useful tools that Iceberg enables is the SQL MERGE command. This recipe…
Creating Branches and Tags

December 27, 2023

Categories: Apache Iceberg, How to

DATA ENGINEERING This recipe shows how to create and manage tags and branches in an Apache Iceberg table. What are…