Tabular: a no-nonsense fact sheet

What is Tabular?

Tabular is a managed cloud storage platform built on Apache Iceberg, an open source table format for big data.

Tabular’s founders Ryan Blue and Daniel Weeks were the original creators of Iceberg.

The Iceberg format provides SQL behavior and ACID guarantees on top of a cloud object store like Amazon S3 or Google Cloud Storage. It lets multiple query engines work on the same tables concurrently.

The Tabular platform provides an Iceberg-compatible REST catalog and services for data ingestion, CDC, continuous Iceberg table optimization, and security.

Tabular has no native query engine. Rather, it supports multiple compute methods, including Amazon Athena, Amazon Redshift, Amazon EMR, Snowflake and Starburst Galaxy, and open source options like Trino, Spark and Python. 
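As an illustration of the Python option, here is a minimal sketch of connecting to an Iceberg REST catalog with the open source PyIceberg library. The URI, credential, warehouse name and the catalog name "tabular" are placeholders assumed for this example, not values from this fact sheet. Use the ones issued for your own deployment.

```python
from typing import Dict


def rest_catalog_properties(uri: str, credential: str, warehouse: str) -> Dict[str, str]:
    """Build the property map PyIceberg expects for a REST catalog."""
    return {
        "type": "rest",            # select PyIceberg's REST catalog implementation
        "uri": uri,                # REST catalog endpoint (placeholder below)
        "credential": credential,  # OAuth2 client credential, "id:secret"
        "warehouse": warehouse,    # warehouse to open (placeholder below)
    }


def connect(name: str, properties: Dict[str, str]):
    """Open the catalog. Requires `pip install pyiceberg` and network access."""
    # Imported lazily so the sketch can be read and loaded without PyIceberg installed.
    from pyiceberg.catalog import load_catalog

    return load_catalog(name, **properties)


# Placeholder values (assumptions); substitute those of your own deployment.
props = rest_catalog_properties(
    uri="https://catalog.example.com/ws",
    credential="<client-id>:<client-secret>",
    warehouse="my_warehouse",
)
```

With real values in place, `connect("tabular", props)` returns a catalog object, and `catalog.load_table("db.events")` would open a table (the table name here is hypothetical). The same catalog settings can then be reused by other engines, since they all read the same Iceberg tables.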

Where does Tabular fit in my architecture?

Tabular is serverless. It can be accessed as SaaS from Tabular’s AWS account or as a Tabular-managed service deployed into the customer’s AWS account.

Tabular manages Iceberg tables that reside in your chosen Amazon S3 bucket(s). It loads data, optimizes storage per table for cost and performance, and enforces role-based access control (RBAC) on the underlying files.

It provides the following services:

As shown below, Tabular does not sit in the data flow for queries. It optimizes your file system as background tasks, and accepts or rejects requests to access Iceberg tables based on a user’s authenticated identity, role and your RBAC policies. 
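The accept-or-reject decision described above can be pictured with a small sketch. This is a simplified illustration under our own assumptions, not Tabular's implementation: the `Policy` shape, the role names, the table names and the privilege strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Policy:
    """A hypothetical RBAC rule: a role may use these privileges on a table."""
    role: str
    table: str
    privileges: FrozenSet[str]


def is_allowed(user_roles: Set[str], table: str, privilege: str,
               policies: Iterable[Policy]) -> bool:
    """Accept only if some policy grants the privilege to one of the user's roles.

    Anything not explicitly granted is rejected (default deny).
    """
    return any(
        p.role in user_roles and p.table == table and privilege in p.privileges
        for p in policies
    )


# Example policies (hypothetical roles and tables).
policies = [
    Policy("analyst", "sales.orders", frozenset({"SELECT"})),
    Policy("engineer", "sales.orders", frozenset({"SELECT", "UPDATE"})),
]

analyst_can_read = is_allowed({"analyst"}, "sales.orders", "SELECT", policies)
analyst_can_write = is_allowed({"analyst"}, "sales.orders", "UPDATE", policies)
```

Because the check depends only on the authenticated identity's roles and the policies, the same answer applies no matter which compute method issued the request.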

What value do I get from using Tabular as my storage engine?

Tabular reduces the cost, timeframe and risk of implementing Iceberg, whether you’re migrating from an existing data lake or starting fresh. It also makes data consumers happy by speeding up queries.

  • Faster queries, reduced costs – per-table performance optimization. Our customers see faster query responses and storage cost reductions of up to 80% vs. managing Iceberg themselves.
  • Easy on data engineering – automated table optimization and maintenance (compaction, sort order, vacuuming, etc.).
  • Serverless operation – deploys easily, auto-scales and updates automatically.
  • Turnkey data ingestion – create file ingestion and database table mirroring (CDC) pipelines, and ingest streaming events using the open source Iceberg Kafka Connect sink, which Tabular wrote.
  • Centralized access control – set column, table or warehouse policies that apply across compute methods.
  • Data security – your data stays in your cloud account, whether you use our SaaS or managed private deployment option. We are SOC2 Type 2, PCI-DSS and HIPAA certified.

What problems can Tabular solve?

Some of the problems we help our customers overcome include:

  • Difficulty building an Iceberg-based multi-compute data lakehouse
  • Insufficient data engineering or devops resources to build and manage Iceberg
  • Slow query performance
  • Excessive storage and query costs
  • Unique situations that require advanced Iceberg expertise
  • Difficulty harmonizing security policies across multiple compute methods

Who uses Tabular?

We have customers across a number of industries. They include:

  • A top 10 gaming company
  • A top 20 financial technology company
  • A top 10 real estate marketplace

Where can I go deeper?

  1. Video walkthrough
  2. Case studies page
  3. Video demos / Storylane