Tabular: Built on an Apache Iceberg foundation

We developed a serverless storage platform on top of the leading open table format, complete with automation, ingestion, and security.

Apache Iceberg is an open table format that provides data warehouse guarantees, such as ACID transactions and SQL behavior, on top of cloud object storage (i.e., a data lake). Iceberg was created to make smarter storage platforms possible.

Tabular is a centralized storage platform that you can use with any compute engine. Tabular uses Iceberg tables as its foundation and was built by Apache Iceberg’s original creators. The platform is a fast and affordable path to centralized data warehouse storage that can be paired with multiple specialized compute frameworks.

Tabular offers the following capabilities: 

Responsive table optimization

Tabular speeds up queries and lowers costs by continuously optimizing each table as new data arrives. While Iceberg provides solid performance through its default configuration, we have used our extensive experience with Iceberg to automate the optimization of each table based on its data profile and query patterns.

Centralized security

Tabular centralizes data security across multiple compute engines and frameworks. It includes centralized role-based access control (RBAC) enforced at the table or column level. Access controls are applied consistently across a variety of query methods, from Python scripts to Spark jobs to query engines such as Trino or Amazon Athena.
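
To make the idea of table-level and column-level access control concrete, here is a minimal Python sketch. It is not Tabular's implementation or API: the `Grant` class, the `readable_row` function, and the role and table names are all hypothetical, chosen only to show one central set of grants being applied to every read, whichever engine issues it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """Hypothetical grant: a role may read a table, except for masked columns."""
    role: str
    table: str
    masked_columns: frozenset = frozenset()


def readable_row(row: dict, role: str, table: str, grants: list) -> dict:
    """Return the row as the role may see it, or raise if the role has no grant."""
    for grant in grants:
        if grant.role == role and grant.table == table:
            # Masked columns are nullified rather than dropped,
            # so every reader sees the same schema.
            return {
                col: (None if col in grant.masked_columns else val)
                for col, val in row.items()
            }
    raise PermissionError(f"role {role!r} has no access to {table!r}")


# Illustrative data only; these roles and tables are assumptions.
grants = [
    Grant("analyst", "sales.orders", frozenset({"email"})),
    Grant("admin", "sales.orders"),
]
row = {"order_id": 7, "email": "a@example.com", "total": 19.5}

analyst_view = readable_row(row, "analyst", "sales.orders", grants)
admin_view = readable_row(row, "admin", "sales.orders", grants)
```

Because the check lives in one place, a Python script and a Spark job that read the same table would receive the same masked view for the same role.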

Integrated ingestion of files and CDC events

Tabular eliminates the tedious, time-consuming scheduling, monitoring, and maintenance of ingestion jobs. It provides UI- or API-configurable ingestion from files, and from change data capture (CDC) events to mirror relational databases. It supports stream ingestion via the Iceberg Kafka Connect sink or the Iceberg Flink sink. Critically, Tabular’s responsive optimization reorganizes your data as it is ingested, so you get optimal performance even when querying newly arriving data.
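
As a rough illustration of what mirroring a database through CDC involves, the Python sketch below replays an ordered list of change events onto a target keyed by primary key. The event shape (`op`, `key`, `row`) and the function name `apply_change_events` are assumptions made for this example; they do not describe the format or logic that Tabular's CDC service actually uses.

```python
def apply_change_events(target: dict, events: list) -> dict:
    """Merge ordered change events into a copy of target, keyed by primary key."""
    merged = dict(target)
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("insert", "update"):
            # Upsert: the latest row image for a key wins.
            merged[key] = event["row"]
        elif op == "delete":
            merged.pop(key, None)
        else:
            raise ValueError(f"unknown op: {op!r}")
    return merged


# Illustrative data only.
target = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
events = [
    {"op": "update", "key": 2, "row": {"name": "Robert"}},
    {"op": "insert", "key": 3, "row": {"name": "Cy"}},
    {"op": "delete", "key": 1},
]
mirrored = apply_change_events(target, events)
```

The order of events matters: applying the same events in a different order could leave the mirror out of sync with the source database.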

Simple SaaS or managed private deployment on AWS

Tabular runs as a managed service in our cloud account, or in a dedicated account in your AWS cloud, managed by Tabular. In either case, your data resides in your account, under your control.

Setup is straightforward. In just a few minutes you can have an enterprise-grade catalog, access controls, table maintenance, automatic optimization, and connections to multiple query engines.

Hassle-free operations

There is no day-to-day management of Tabular’s serverless infrastructure. Mundane tasks you no longer need to handle include:

  • Per-table performance tuning, file compaction and clustering
  • Table maintenance and garbage collection
  • Job management, sizing, or scaling
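
For a sense of the work that file compaction involves, here is a simplified Python sketch of planning a compaction pass by packing small files into batches up to a target size. It is an illustration under stated assumptions, not Tabular's algorithm: the function name `plan_compaction`, the 512 MB target, and the file sizes are all hypothetical.

```python
def plan_compaction(file_sizes_mb: list, target_mb: int = 512) -> list:
    """Group small files into batches whose combined size stays within target_mb."""
    batches = []
    current, current_size = [], 0
    for size in sorted(file_sizes_mb):
        if size >= target_mb:
            continue  # already at or above the target size; leave it alone
        if current and current_size + size > target_mb:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    # Rewriting a single file on its own gains nothing, so drop those batches.
    return [batch for batch in batches if len(batch) > 1]


# Illustrative file sizes in MB.
plan = plan_compaction([8, 16, 600, 200, 300, 64], target_mb=512)
```

Each batch in `plan` would be rewritten as one larger file. Choosing the target size and deciding when to run the pass is the kind of per-table tuning the list above refers to.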

What Tabular adds to your Iceberg tables

| Category | Functionality | Tabular provides |
| --- | --- | --- |
| Optimization | Table analysis and tuning | Tabular frequently analyzes each table’s data and query patterns to recommend settings to save money and make queries faster. |
| Optimization | Responsive compaction and clustering | Automatic compaction and clustering improves performance and cost. Data is optimized continuously so it is optimized before it is read. |
| Optimization | Metadata optimization | Automatic compaction and clustering of table metadata for fast scan planning. |
| Table maintenance | Snapshot expiration | Automatic deletion of expired snapshots. |
| Table maintenance | Orphan file cleanup | Automatic deletion of orphan files. |
| Table maintenance | Garbage collection | Automatic cleanup of expired data and metadata files. |
| Data lifecycle management | Column masking | Nullify data in a column, or hide the column from specific roles. |
| Data lifecycle management | Record TTL | Automatically delete records once they reach a specified age. |
| Data lifecycle management | Column TTL | Automatically remove or obfuscate column values once rows reach a specified age. |
| Data lifecycle management | Table restore (un-drop) | Restore dropped tables, without losing data. |
| Ingestion | From storage buckets (files) | The Tabular File Loader service automatically detects and ingests new files in a source storage location. The table is optimized as new data is ingested. |
| Ingestion | From databases (CDC) | The Tabular CDC service mirrors database tables in Iceberg by ingesting change events from a log table and merging them to a target table. The table is optimized as new data is ingested. |
| Security | RBAC | Tabular RBAC provides central control over permissions per database, table, or column. Permissions are applied across all writers and readers. |
| Security | AWS IAM | Tabular accepts AWS IAM identities for authentication. |
| Security | SSO | Okta, Google OIDC, Custom OIDC |
| Security | Cross-domain identity and role management | Okta SCIM2, Custom SCIM2 |
| Security | Fine-grained access control | Column-level labels are used to restrict access or mask columns. |
| Operations | SaaS | Tabular runs as a cloud-hosted managed service. Data is read and processed in Tabular’s account but always stored in your account. |
| Operations | Managed private deployment (AWS) | Tabular deploys to a dedicated AWS account owned and controlled by the customer, and managed by Tabular. |
| Operations | Serverless operation | Simple, declarative configuration. Requires zero infrastructure, orchestration, engineering, or maintenance. |
| Operations | Monitoring | Tabular metrics are available via OpenMetrics for building dashboards and alerts in 3rd-party tools such as Datadog or Grafana. |
| APIs | REST Catalog | Tabular uses Iceberg’s standard REST catalog protocol to connect compute services. |
| APIs | Tabular configuration | Programmatically manage Tabular resources and configuration. |
| APIs | Terraform | API calls and scripts available for provisioning via Terraform. |
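
Because the catalog speaks Iceberg's standard REST protocol, a Python client such as PyIceberg can connect with ordinary REST catalog properties. The sketch below shows the general shape. The endpoint URL, credential string, warehouse name, and table name are placeholders, and the helper functions `rest_catalog_properties` and `connect` are hypothetical wrappers written for this example; only `load_catalog`, `load_table`, and the property keys `type`, `uri`, `credential`, and `warehouse` come from PyIceberg itself.

```python
def rest_catalog_properties(uri: str, credential: str, warehouse: str) -> dict:
    """Build the properties PyIceberg expects for an Iceberg REST catalog."""
    return {
        "type": "rest",
        "uri": uri,
        "credential": credential,
        "warehouse": warehouse,
    }


def connect(name: str, properties: dict):
    """Open the catalog; requires the pyiceberg package to be installed."""
    # Imported lazily so the properties helper works without PyIceberg.
    from pyiceberg.catalog import load_catalog

    return load_catalog(name, **properties)


# Placeholder values; substitute your own endpoint, credential, and warehouse.
props = rest_catalog_properties(
    uri="https://catalog.example.com/ws",
    credential="<client-id>:<client-secret>",
    warehouse="my_warehouse",
)

# With real values and network access, the connection would look like this:
# catalog = connect("tabular", props)
# table = catalog.load_table("examples.events")
```

The same properties carry over to other engines that support the REST catalog, which is what allows one catalog to serve multiple compute frameworks.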