Configuring Trino


Trino is a popular open-source distributed SQL query engine that federates queries across many data sources, including tables tracked in the Hive Metastore or AWS Glue, Cassandra, MySQL, and many more.

Trino uses catalogs to connect to various data sources. A catalog is a named data source that is organized into schemas that contain tables. This recipe shows how to configure catalogs that support Iceberg tables in Trino.

The easiest way to get started using Trino is with the help of Docker. This ensures Trino runs isolated from the host system.
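As a minimal sketch of the Docker route, the Python snippet below builds the command that starts a Trino container. It assumes the `trinodb/trino` image from Docker Hub and Trino's default HTTP port of 8080. The container name `trino` and the helper `trino_docker_command` are hypothetical, and the snippet only prints the command rather than running it.

```python
import shlex


def trino_docker_command(name="trino", host_port=8080, image="trinodb/trino"):
    """Build the argv for starting a Trino container in the background."""
    return [
        "docker", "run",
        "--name", name,             # container name (hypothetical)
        "-d",                       # run detached, in the background
        "-p", f"{host_port}:8080",  # map a host port to Trino's port 8080
        image,                      # assumed image: trinodb/trino
    ]


if __name__ == "__main__":
    # prints: docker run --name trino -d -p 8080:8080 trinodb/trino
    print(shlex.join(trino_docker_command()))
```

Passing a different `host_port` is useful when 8080 is already taken on the host.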

Connector configurations

Each Trino catalog is defined by a configuration file under the Trino root directory, named etc/catalog/<catalog_name>.properties. The value used for <catalog_name> is the identifier used to reference the catalog in SQL queries.
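To make the naming scheme concrete, the following Python sketch writes a catalog file and builds the fully qualified table name a SQL query would use. The helpers `write_catalog` and `qualified_table`, the catalog name `my_catalog`, and the `examples.orders` schema and table are all hypothetical and chosen only for illustration.

```python
from pathlib import Path
import tempfile


def write_catalog(trino_root, catalog_name, properties):
    """Write etc/catalog/<catalog_name>.properties under the Trino root."""
    catalog_dir = Path(trino_root) / "etc" / "catalog"
    catalog_dir.mkdir(parents=True, exist_ok=True)
    path = catalog_dir / f"{catalog_name}.properties"
    # One key=value pair per line, as in the catalog files in this recipe.
    lines = [f"{key}={value}" for key, value in properties.items()]
    path.write_text("\n".join(lines) + "\n")
    return path


def qualified_table(catalog_name, schema, table):
    """The file name (minus .properties) is the first part of the table name."""
    return f"{catalog_name}.{schema}.{table}"


if __name__ == "__main__":
    # A temporary directory stands in for the Trino root directory.
    with tempfile.TemporaryDirectory() as root:
        path = write_catalog(root, "my_catalog", {"connector.name": "iceberg"})
        print(path.relative_to(root).as_posix())
    print(f"SELECT * FROM {qualified_table('my_catalog', 'examples', 'orders')}")
```

Renaming the file changes the catalog name, so any query that references the old name would need to be updated.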

REST catalog

Trino’s REST catalog configuration allows Trino to communicate directly with an Iceberg catalog service using the Iceberg REST Catalog protocol. This protocol is the standard and recommended way to interact with Iceberg metadata and tables.

# etc/catalog/my_rest_catalog.properties

connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=https://iceberg-with-rest:8181/

Note that Trino configuration is strict. Not all Iceberg catalog configuration options are allowed in Trino, and Trino uses its own set of property names. Refer to the Trino documentation for more detail on what options are available.

Hive Metastore catalog

This configuration allows Trino to connect to an existing Hive Metastore and use it to store pointers to Iceberg table metadata. A number of configuration options related to security and other Hive behaviors can also be defined here.

# etc/catalog/my_hms_catalog.properties

connector.name=iceberg
iceberg.catalog.type=hive_metastore
hive.metastore.uri=thrift://example.net:9083

AWS Glue catalog

The Glue catalog allows Trino to use Iceberg tables managed by AWS Glue.

# etc/catalog/my_glue_catalog.properties

connector.name=iceberg
iceberg.catalog.type=glue

Migration configurations

Trino also has features to help with migration from legacy catalogs like Hive to modern implementations like the Iceberg REST catalog. To enable this, set hive.iceberg-catalog-name on the legacy Hive catalog to the name of an Iceberg catalog.

# etc/catalog/hive.properties

connector.name=hive
hive.metastore=thrift
hive.iceberg-catalog-name=<iceberg_catalog_name>

hive.metastore.uri=thrift://example.net:9083

Trino will first attempt to load the table from the Iceberg catalog and fall back to loading the Hive table if one is not found. This lets you migrate tables one at a time and upgrade your infrastructure without changing how the tables are referenced in SQL queries.
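The lookup order can be modeled with a short Python sketch. This is only an illustration of the Iceberg-first, Hive-second behavior described above, not Trino's implementation. The function `resolve_table`, the table name `examples.orders`, and the dictionaries standing in for the two catalogs are hypothetical.

```python
def resolve_table(name, iceberg_tables, hive_tables):
    """Return (source, table), trying the Iceberg catalog before Hive."""
    if name in iceberg_tables:
        return "iceberg", iceberg_tables[name]
    if name in hive_tables:
        # Fallback: the table has not been migrated yet.
        return "hive", hive_tables[name]
    raise KeyError(f"table not found: {name}")


if __name__ == "__main__":
    # Hypothetical catalogs, modeled as plain dictionaries.
    hive_tables = {"examples.orders": "hive table"}
    iceberg_tables = {}

    # Before migration the same name resolves to the Hive table.
    print(resolve_table("examples.orders", iceberg_tables, hive_tables))

    # After migration the Iceberg table takes precedence.
    iceberg_tables["examples.orders"] = "iceberg table"
    print(resolve_table("examples.orders", iceberg_tables, hive_tables))
```

The name passed to the lookup is the same in both calls, which mirrors why queries do not need to change during a migration.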