We developed a serverless storage platform on top of the leading open table format, complete with automation, ingestion, and security.
Apache Iceberg is an open table format that provides data warehouse guarantees, such as ACID transactions and SQL behavior, on top of cloud object storage (i.e., a data lake). Iceberg was created to make smarter storage platforms possible.
Tabular is a centralized storage platform that you can use with any compute engine. Built by the original creators of Apache Iceberg, Tabular uses Iceberg tables as its foundation. The platform is a fast and affordable path to centralized data warehouse storage that can be paired with multiple specialized compute frameworks.
Tabular offers the following capabilities:
Responsive table optimization
Tabular speeds up queries and lowers costs by continuously optimizing each table as new data arrives. While Iceberg provides solid performance with its default configuration, we have drawn on our extensive experience with Iceberg to automate optimization for each table based on its data profile and query patterns.
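
To make the idea of file compaction concrete, here is a minimal sketch of how small data files might be grouped into rewrite batches. It uses a generic first-fit-decreasing bin-packing heuristic, not Tabular's actual algorithm; the `DataFile` type, the `plan_compaction` function, and the sizes are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file tracked by a table."""
    path: str
    size_bytes: int


def plan_compaction(files, target_bytes):
    """Group files smaller than target_bytes into batches of at most target_bytes.

    Uses first-fit decreasing: take the largest small file first and place it
    in the first batch that still has room for it.
    """
    small = sorted(
        (f for f in files if f.size_bytes < target_bytes),
        key=lambda f: f.size_bytes,
        reverse=True,
    )
    batches = []
    for f in small:
        for batch in batches:
            if sum(x.size_bytes for x in batch) + f.size_bytes <= target_bytes:
                batch.append(f)
                break
        else:
            batches.append([f])
    # A batch with a single file gains nothing from being rewritten.
    return [b for b in batches if len(b) > 1]


files = [
    DataFile("a", 10), DataFile("b", 20), DataFile("c", 100),
    DataFile("d", 30), DataFile("e", 90),
]
plan = plan_compaction(files, target_bytes=100)
print([[f.path for f in batch] for batch in plan])
```

A real optimizer would also weigh query patterns and clustering, which this sketch leaves out.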
Centralized security
Tabular centralizes data security across multiple compute engines and frameworks. It includes centralized role-based access control (RBAC) enforced at the table or column level. Access controls are applied consistently across a variety of query methods, from Python scripts to Spark jobs to query engines such as Trino or Amazon Athena.
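
As a rough illustration of what enforcing access at the table or column level means, the sketch below applies one policy lookup to a row before it is returned, whichever engine issued the read. This is an assumed toy model, not Tabular's API: `TablePolicy`, `POLICIES`, `read_row`, and the role and table names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TablePolicy:
    """Hypothetical per-role policy for a single table."""
    hidden: set = field(default_factory=set)  # columns the role cannot see at all
    masked: set = field(default_factory=set)  # columns returned as NULL (None)


# (role, table) -> policy. A missing entry means no access to the table.
POLICIES = {
    ("analyst", "sales.orders"): TablePolicy(hidden={"card_number"}, masked={"email"}),
    ("admin", "sales.orders"): TablePolicy(),
}


def read_row(role, table, row):
    """Return the row as the given role is allowed to see it."""
    policy = POLICIES.get((role, table))
    if policy is None:
        raise PermissionError(f"role {role!r} has no access to {table!r}")
    visible = {}
    for column, value in row.items():
        if column in policy.hidden:
            continue  # drop the column entirely
        visible[column] = None if column in policy.masked else value
    return visible


row = {"order_id": 7, "email": "a@example.com", "card_number": "4111"}
print(read_row("analyst", "sales.orders", row))
```

Because the check lives in one place, a Python script and a Spark job reading the same table under the same role would see the same columns.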
Integrated ingestion of files and CDC events
Tabular eliminates the tedious, time-consuming work of scheduling, monitoring, and maintaining ingestion jobs. Tabular provides UI- or API-configurable ingestion from files, and from change data capture (CDC) events to mirror relational databases. It supports stream ingestion via the Iceberg Kafka Connect sink or the Iceberg Flink sink. Critically, Tabular’s responsive optimization reorganizes your data as it is ingested, so you get optimal performance even when querying newly arriving data.
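
The core idea behind mirroring a database with CDC can be sketched in a few lines: ordered change events are merged into a target keyed by primary key. The event shape (`op` and `row` fields) and the `apply_change_events` function are assumptions made for this example, not the format Tabular uses.

```python
def apply_change_events(target, events, key="id"):
    """Apply ordered change events to `target`, a dict keyed by primary key."""
    for event in events:
        op, row = event["op"], event["row"]
        if op == "delete":
            target.pop(row[key], None)  # deleting a missing row is a no-op
        elif op in ("insert", "update"):
            target[row[key]] = row      # upsert: the latest event wins
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return target


mirror = {1: {"id": 1, "name": "Ada"}}
changes = [
    {"op": "update", "row": {"id": 1, "name": "Ada L."}},
    {"op": "insert", "row": {"id": 2, "name": "Grace"}},
    {"op": "delete", "row": {"id": 1}},
]
apply_change_events(mirror, changes)
print(mirror)
```

Events must be applied in commit order; replaying them out of order could resurrect a deleted row.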
Simple SaaS or managed private deployment on AWS
Tabular runs as a managed service in our cloud account, or in a dedicated account in your AWS cloud, managed by Tabular. In either case, your data resides in your account, under your control.
Setup is straightforward. In just a few minutes you can have an enterprise-grade catalog, access controls, table maintenance, automatic optimization, and connections to multiple query engines.
Hassle-free operations
There is no day-to-day management of Tabular’s serverless infrastructure. Some of the mundane tasks you will no longer have to concern yourself with include:
- Per-table performance tuning, file compaction, and clustering
- Table maintenance and garbage collection
- Job management, sizing, or scaling
What Tabular adds to your Iceberg tables
Category | Functionality | Tabular provides |
---|---|---|
Optimization | Table analysis and tuning | Tabular frequently analyzes each table’s data and query patterns to recommend settings that save money and speed up queries. |
Optimization | Responsive compaction and clustering | Automatic compaction and clustering improve performance and lower cost. Data is optimized continuously, so it is ready before it is read. |
Optimization | Metadata optimization | Automatic compaction and clustering of table metadata for fast scan planning. |
Table maintenance | Snapshot expiration | Automatic deletion of expired snapshots. |
Table maintenance | Orphan file cleanup | Automatic deletion of orphan files. |
Table maintenance | Garbage collection | Automatic cleanup of expired data and metadata files. |
Data lifecycle management | Column masking | Nullify data in a column, or hide the column from specific roles. |
Data lifecycle management | Record TTL | Automatically delete records once they reach a specified age. |
Data lifecycle management | Column TTL | Automatically remove or obfuscate column values once rows reach a specified age. |
Data lifecycle management | Table restore (un-drop) | Restore dropped tables without losing data. |
Ingestion | From storage buckets (files) | The Tabular File Loader service automatically detects and ingests new files in a source storage location. The table is optimized as new data is ingested. |
Ingestion | From databases (CDC) | The Tabular CDC service mirrors database tables in Iceberg by ingesting change events from a log table and merging them into a target table. The table is optimized as new data is ingested. |
Security | RBAC | Tabular RBAC provides central control over permissions per database, table, or column. Permissions are applied across all writers and readers. |
Security | AWS IAM | Tabular accepts AWS IAM identities for authentication. |
Security | SSO | Okta, Google OIDC, custom OIDC |
Security | Cross-domain identity and role management | Okta SCIM2, custom SCIM2 |
Security | Fine-grained access control | Column-level labels are used to restrict access or mask columns. |
Operations | SaaS | Tabular runs as a cloud-hosted managed service. Data is read and processed in Tabular’s account but always stored in your account. |
Operations | Managed private deployment (AWS) | Tabular deploys to a dedicated AWS account owned and controlled by the customer and managed by Tabular. |
Operations | Serverless operation | Simple, declarative configuration. Requires zero infrastructure, orchestration, engineering, or maintenance. |
Operations | Monitoring | Tabular metrics are available via OpenMetrics for building dashboards and alerts in third-party tools such as Datadog or Grafana. |
APIs | REST Catalog | Tabular uses Iceberg’s standard REST catalog protocol to connect compute services. |
APIs | Tabular configuration | Programmatically manage Tabular resources and configuration. |
APIs | Terraform | API calls and scripts are available for provisioning via Terraform. |
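
Because the catalog speaks Iceberg’s standard REST protocol, connecting an engine is mostly a matter of configuration. As a hedged sketch, the helper below assembles the Spark properties that Iceberg uses for a REST catalog. The property keys follow Iceberg’s Spark catalog configuration; the function name, catalog name, endpoint URL, warehouse, and credential are placeholders for illustration, not real Tabular values.

```python
def rest_catalog_spark_conf(name, uri, warehouse, credential):
    """Build Spark properties for an Iceberg REST catalog called `name`."""
    prefix = f"spark.sql.catalog.{name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
        f"{prefix}.credential": credential,
    }


conf = rest_catalog_spark_conf(
    name="lake",
    uri="https://catalog.example.com",  # placeholder endpoint
    warehouse="my_warehouse",           # placeholder warehouse name
    credential="<client-credential>",   # placeholder; load from a secret store
)
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The printed flags can be passed to `spark-submit` or `spark-sql`, alongside the Iceberg Spark runtime package for your Spark version.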