Comparing Tabular to Apache Iceberg

Apache Iceberg is a modern table format that provides ACID transactions and SQL table behavior on a data lake. Tabular was built by the original creators of Iceberg to provide the fastest path to an enterprise-grade storage platform built on Iceberg tables. It offers the following enhancements to open source Iceberg.

Performance optimization

Get substantial performance improvements. While Iceberg provides solid performance with its default configuration, the Tabular Optimizer speeds up your queries and lowers your costs by dynamically tuning each table. The Optimizer codifies a decade of Apache Iceberg know-how into a rule set that continually optimizes compaction and compression settings based on each table’s data profile and query patterns.

Centralized RBAC

Harmonize data security across all compute engines and frameworks. Tabular includes centralized role-based access control (RBAC) enforced at the database, table, or column level. It enforces access control for any query method, from a custom Python script to Spark jobs to query engines such as Trino and Amazon Athena.
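To illustrate how grants at the database, table, or column level can compose, here is a minimal sketch. It is not Tabular's implementation or API: the Grant class, the is_allowed function, and the role, database, and table names are all hypothetical, and the sketch assumes that a grant at a broader scope covers everything beneath it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grant:
    """A hypothetical read grant for a role at database, table, or column scope."""
    role: str
    database: str
    table: Optional[str] = None   # None means every table in the database
    column: Optional[str] = None  # None means every column in the table


def is_allowed(grants, role, database, table, column):
    """Return True if any grant held by `role` covers the requested column."""
    for g in grants:
        if g.role != role or g.database != database:
            continue
        if g.table is not None and g.table != table:
            continue
        if g.column is not None and g.column != column:
            continue
        return True
    return False


# Hypothetical grants, one per scope.
grants = [
    Grant("analyst", "sales", "orders", "amount"),  # column-level
    Grant("engineer", "sales", "orders"),           # table-level
    Grant("admin", "sales"),                        # database-level
]
```

Because the check depends only on the role and the object requested, the same decision applies whichever engine issues the query, which is the point of centralizing it.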

Integrated data pipelines

Eliminate the tedious engineering of building pipelines to load data. Tabular provides UI- or API-configurable ingestion from files via Tabular File Loader, or change data capture (CDC) for mirroring relational databases. Uniquely, Tabular optimizes your data as it is ingested.

SaaS or managed private cloud deployment

The Tabular catalog, data ingestion, and RBAC services run as a dedicated instance in our cloud, connected via private peering to data in your cloud account. Attach query engines in minutes using configurable connectors for Amazon Athena, Google BigQuery, Spark / EMR, Trino and more.
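For an engine such as Spark, attaching to an Iceberg REST catalog comes down to a handful of catalog properties. The sketch below builds them as a dictionary. The property keys follow open source Iceberg's Spark and REST catalog configuration; the catalog name, URI, credential, and warehouse values are placeholders chosen for illustration, not real Tabular values, and the rest_catalog_conf helper is hypothetical.

```python
def rest_catalog_conf(name, uri, credential, warehouse):
    """Build Spark properties that register an Iceberg REST catalog."""
    prefix = f"spark.sql.catalog.{name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.credential": credential,
        f"{prefix}.warehouse": warehouse,
    }


# All values below are placeholders; substitute your own.
conf = rest_catalog_conf(
    name="lake",
    uri="https://catalog.example.com/ws",
    credential="<client-id>:<client-secret>",
    warehouse="<warehouse-name>",
)

# Render the properties as spark-submit style arguments.
for key, value in sorted(conf.items()):
    print(f"--conf {key}={value}")
```

The same key/value pairs can be set on a SparkSession builder instead. Either way, the Iceberg Spark runtime must be on the classpath for the catalog class to load.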

Serverless operation

Make day-to-day management painless. Tabular provides self-service deployment, auto-clustering and auto-scaling. It maintains your data through automated garbage collection. It provides monitoring data in the OpenMetrics format for use with third-party systems such as Amazon CloudWatch and Datadog.
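As a rough idea of what consuming OpenMetrics output looks like, the sketch below parses a small text payload into samples. The metric name and labels are invented for illustration and are not Tabular's actual metrics. The parser is deliberately simplified: it ignores timestamps and does not handle escaped quotes or braces inside label values.

```python
import re

# name, optional {labels}, then a numeric value
SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_openmetrics(text):
    """Return (name, labels, value) tuples for each sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip TYPE/HELP/EOF lines and blanks
        match = SAMPLE.match(line)
        if match is None:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Hypothetical payload: metric and table names are made up.
payload = """\
# TYPE table_files gauge
table_files{table="sales.orders"} 128
table_files{table="sales.customers"} 12
# EOF
"""

samples = parse_openmetrics(payload)
```

In practice a monitoring agent would scrape and parse this for you; the sketch only shows the shape of the data.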

What Tabular adds to your Iceberg tables

Category	Function	What Tabular provides
Optimization
	Compaction	Dynamic, table-specific optimization of file compaction jobs for substantial performance gains.
	Metadata management	Automatic, dynamic compaction of metadata files for best performance. Automatic deletion of metadata files that are no longer required.
Table maintenance
	Snapshot expiration	Automatic deletion of expired snapshots.
	Orphan file cleanup	Automatic, safe deletion of orphan files.
Ingestion
	From storage buckets (files)	File Loader service automatically detects and ingests new files in a source storage bucket. The table is optimized as new data is ingested.
	From databases (CDC)	CDC service mirrors database tables in Iceberg by ingesting change events to a log table and merging them to a target table. The table is optimized as new data is ingested.
Security
	RBAC	RBAC service provides central control over permissions per database, table or column. Permissions are applied across all writers and readers.
	AWS IAM	Tabular accepts AWS IAM identities for authentication.
	SSO	Okta, Google OIDC, Custom OIDC
	Cross-domain identity management	Okta SCIM2, Custom SCIM2
	Fine-grained access control	Column-level labels are used to restrict access or mask columns.
Operations
	SaaS	Tabular runs as a cloud-hosted managed service. Data is read and processed in Tabular’s account, but is written only to the customer’s account.
	Managed private cloud on AWS	Tabular deploys to an AWS account controlled by the customer and works with the customer’s data without that data leaving the customer’s control.
	Scaling	Automatic (serverless)
	Iceberg updates	Automatic
	Monitoring	Exposure of Iceberg metrics via the OpenMetrics standard for use in third-party tools.
APIs
	Configuration	Programmatically manage Tabular resources and configuration. OpenAPI compliant.
	REST Catalog	Tabular provides a catalog that follows the Iceberg REST catalog OpenAPI specification.
	Terraform	API calls and scripts available for provisioning via Terraform.
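
The CDC row in the table above describes ingesting change events into a log table and merging them into a target table. The merge step can be sketched with plain in-memory structures as follows. This is an illustration of the general pattern, not Tabular's implementation: the event fields (seq, op, key, row) and the merge_changes function are hypothetical.

```python
def merge_changes(target, change_log):
    """Apply change events to a copy of `target`, in sequence order.

    `target` maps primary key -> row dict. Each event is a dict with
    hypothetical fields: `seq` (ordering), `op` ("insert", "update",
    or "delete"), `key`, and `row` (absent for deletes).
    """
    merged = dict(target)
    for event in sorted(change_log, key=lambda e: e["seq"]):
        if event["op"] == "delete":
            merged.pop(event["key"], None)
        else:
            # Inserts and updates are both upserts on the primary key.
            merged[event["key"]] = event["row"]
    return merged


target = {1: {"id": 1, "status": "new"}, 2: {"id": 2, "status": "new"}}

# Events may arrive out of order; `seq` restores the source order.
change_log = [
    {"seq": 3, "op": "delete", "key": 2},
    {"seq": 1, "op": "update", "key": 1, "row": {"id": 1, "status": "paid"}},
    {"seq": 2, "op": "insert", "key": 3, "row": {"id": 3, "status": "new"}},
]

mirrored = merge_changes(target, change_log)
```

Keeping the raw events in a log and applying them in sequence order means the target can be rebuilt or audited from the log, at the cost of storing both.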