Tabular File Loader

File Loader ingests data files into Apache Iceberg tables in near real time. Tabular monitors an S3 location for new files and writes the new data into tables, optimizing each table and handling schema evolution along the way.

Features include:

  • Schema inference and evolution. New columns are automatically added to the target table, field types are automatically inferred, and dropped columns are retained. 
  • Table-specific optimizations are applied during ingestion, based on analysis of the table (described on a separate page). These optimizations can reduce data size, and with it query time and cost, by up to 80%.
  • Serverless operation – pipelines are based on simple, declarative configuration. They require zero infrastructure, orchestration, engineering, or maintenance.
  • Parquet, CSV, TSV, JSON, and XML file format support, including complex data structures such as nested fields and arrays.
  • Exactly-once semantics eliminate the need to hand-craft checkpoints and deduplication jobs.
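
To make the schema inference and evolution behavior above more concrete, here is a minimal Python sketch of the idea: columns seen for the first time are appended with an inferred type, and columns missing from an incoming record stay in the schema. This is an illustration only, not Tabular's implementation. The function names (infer_type, evolve_schema, conform) and the type labels are hypothetical, and Iceberg's real type system is richer than what is shown.

```python
from typing import Any, Dict


def infer_type(value: Any) -> str:
    """Map a Python value to a simplified, hypothetical column type label."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, dict):
        return "struct"
    if isinstance(value, list):
        return "list"
    return "string"


def evolve_schema(schema: Dict[str, str], record: Dict[str, Any]) -> Dict[str, str]:
    """Return a copy of schema with columns first seen in record appended.

    Existing columns are never removed, so a column that disappears from
    the source files is retained in the table schema.
    """
    evolved = dict(schema)
    for name, value in record.items():
        if name not in evolved and value is not None:
            evolved[name] = infer_type(value)
    return evolved


def conform(schema: Dict[str, str], record: Dict[str, Any]) -> Dict[str, Any]:
    """Project record onto schema; columns absent from record become None."""
    return {name: record.get(name) for name in schema}


# Existing table schema (assumed for the example).
schema = {"id": "long", "name": "string"}

# Incoming record: "name" was dropped upstream, "score" and "tags" are new.
incoming = {"id": 7, "score": 0.5, "tags": ["a"]}

schema = evolve_schema(schema, incoming)
row = conform(schema, incoming)
```

After this runs, schema contains the two original columns plus score and tags, and row carries a null for the dropped name column.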

Other things you should know about File Loader:

  • UI- or API-based configuration (e.g., using Terraform).
  • Tabular RBAC restricts ingestion to users with proper permissions.
  • Observable pipelines – Tabular supports the Open Metrics API to expose ingestion activity to third-party observability tools. You can monitor and alert on pipeline latency and errors.
  • Transparent, predictable pricing – pricing is pay-as-you-go and based solely on the volume of source data being ingested. Usage can be monitored within Tabular.
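
As an illustration of monitoring and alerting on pipeline latency and errors, the following Python sketch parses a text exposition in the OpenMetrics style and flags series that cross a threshold. Everything specific here is an assumption: the metric names (ingest_latency_seconds, ingest_errors_total), the pipeline label, the sample values, and the 60-second threshold are invented for the example and are not taken from Tabular's documentation. The parser is also deliberately simplified and assumes sample lines carry no trailing timestamp.

```python
from typing import Dict, List

# Hypothetical scrape result; metric names, labels, and values are invented.
SCRAPE = """\
# TYPE ingest_latency_seconds gauge
ingest_latency_seconds{pipeline="events"} 12.5
ingest_latency_seconds{pipeline="orders"} 95.0
# TYPE ingest_errors counter
ingest_errors_total{pipeline="events"} 0
ingest_errors_total{pipeline="orders"} 3
# EOF
"""


def parse_samples(text: str) -> Dict[str, float]:
    """Map each sample (metric name plus labels) to its value.

    Metadata lines starting with '#' (TYPE, HELP, EOF) are skipped.
    """
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Simplification: no trailing timestamp, so the value is the last token.
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


def alerts(samples: Dict[str, float], max_latency_s: float = 60.0) -> List[str]:
    """Return one message per series that breaches a threshold."""
    fired = []
    for series, value in samples.items():
        if series.startswith("ingest_latency_seconds") and value > max_latency_s:
            fired.append(f"high latency: {series} = {value}")
        elif series.startswith("ingest_errors_total") and value > 0:
            fired.append(f"errors: {series} = {value}")
    return fired


fired = alerts(parse_samples(SCRAPE))
for message in fired:
    print(message)
```

In practice an observability tool would do the scraping and alert evaluation; the sketch only shows the shape of the data such a tool consumes.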

Watch File Loader demo