
Using Spark in EMR with Apache Iceberg
This post walks through setting up an AWS EMR cluster with Spark and Iceberg tables.
Spark, EMR, & Iceberg
This post details how Iceberg’s metadata forms an index that Iceberg uses to scale to hundreds of petabytes in a single table and to quickly find matching data, even on a single node.
Metadata Indexing
To be successful, the Iceberg community needs to be collaborative, neutral, and independent. To ensure that it continues to be a healthy and successful community, we're creating an open source culture at Tabular that sets best practices for working in the community.
Iceberg Community
Introducing Tabular. We think data engineers, analysts, and scientists are most effective when they can focus on building reliable data products and answering questions, not fighting infrastructure. We’re creating a new data platform that automates maintenance and optimization to protect people from tedious concerns, while enabling them to work with the same data simultaneously across Spark, Trino, Flink, and other engines. The core of Tabular is the open source standard for huge analytic tables, Apache Iceberg.
Introducing Tabular