Apache Polaris
Apache Polaris is an open-source REST catalog for Apache Iceberg. Snowflake built it, open-sourced it in 2024, and then donated it to the Apache Software Foundation. It implements the Iceberg REST Catalog specification and is increasingly a default answer to “what catalog do I run for Iceberg?” alongside AWS Glue and Nessie.
Key Features:
- Iceberg REST Catalog. Implements the open Iceberg REST spec, so any compliant engine (Spark, Trino, Flink, Snowflake, Dremio, StarRocks) can connect with no custom integration.
- Multi-Tenancy. Catalogs, namespaces, and tables with role-based access control; one Polaris instance can serve many isolated workspaces.
- Credential Vending. Polaris hands out short-lived, scoped cloud credentials to engines so they can read and write S3 / GCS / ADLS directly, with no long-lived keys in client configs.
- Identity Provider Integration. Authenticates against existing corporate IdPs using OAuth, OIDC, or SAML.
- External Catalog Federation. Can sit in front of AWS Glue, Hive Metastore, or another Iceberg REST catalog.
- Vendor-Neutral. Governed by the Apache Software Foundation, so there is no single-vendor lock-in despite the project’s Snowflake origin.
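As an illustration of the "no custom integration" point, the sketch below builds the Spark properties an engine would typically need to reach a Polaris-hosted Iceberg REST catalog and to ask for vended credentials. The catalog name, URI, warehouse, and credential placeholder are assumptions made for the example, and the exact set of properties a deployment needs may differ.

```python
def polaris_spark_conf(catalog_name, uri, warehouse, credential):
    """Return Spark properties for an Iceberg REST catalog named `catalog_name`.

    All argument values are supplied by the caller; nothing here contacts a server.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Register the catalog with Iceberg's Spark integration.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Use the generic REST catalog client rather than a vendor-specific one.
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
        # OAuth client credentials in "client_id:client_secret" form.
        f"{prefix}.credential": credential,
        f"{prefix}.scope": "PRINCIPAL_ROLE:ALL",
        # Ask the catalog to return short-lived storage credentials with table loads.
        f"{prefix}.header.X-Iceberg-Access-Delegation": "vended-credentials",
    }


# Hypothetical local deployment; replace every value with your own.
conf = polaris_spark_conf(
    catalog_name="polaris",
    uri="http://localhost:8181/api/catalog",
    warehouse="analytics",
    credential="<client_id>:<client_secret>",
)

for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The printed lines can be passed to `spark-sql` or `spark-submit`. Note that no cloud storage keys appear anywhere in the configuration; the engine relies on the catalog to supply them per table.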
Architecture:
Polaris exposes the Iceberg REST API over HTTP, backed by a relational metastore (Postgres, or Snowflake’s managed offering) that holds catalog state. Engines authenticate, list namespaces, and request table metadata pointers; Polaris then vends temporary credentials scoped to the requested table’s storage location. Data reads and writes happen directly between the engine and object storage, without passing through Polaris.
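The request flow described above can be sketched as two HTTP calls: one to obtain a token, one to load a table with credential delegation requested. The code only constructs the requests and sends nothing. The base URL, client ID and secret, catalog prefix, namespace, and table name are hypothetical, and using the catalog name as the path prefix is an assumption about how the server is configured.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:8181/api/catalog"  # assumed Polaris endpoint


def token_request(base, client_id, client_secret, scope="PRINCIPAL_ROLE:ALL"):
    """Describe the OAuth client-credentials call that yields a bearer token."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return ("POST", f"{base}/v1/oauth/tokens", headers, body)


def load_table_request(base, prefix, namespace, table, token):
    """Describe the Iceberg REST load-table call with vended credentials requested."""
    # Multi-level namespaces are joined with the 0x1F unit separator, then URL-encoded.
    ns = quote("\x1f".join(namespace), safe="")
    url = (
        f"{base}/v1/{quote(prefix, safe='')}"
        f"/namespaces/{ns}/tables/{quote(table, safe='')}"
    )
    headers = {
        "Authorization": f"Bearer {token}",
        # Ask the catalog to include temporary storage credentials in the response.
        "X-Iceberg-Access-Delegation": "vended-credentials",
    }
    return ("GET", url, headers, None)


# Hypothetical identifiers, for illustration only.
token_call = token_request(BASE, "abc", "xyz")
method, url, headers, body = load_table_request(
    BASE, "analytics", ["sales", "emea"], "orders", token="<token>"
)
print(token_call[0], token_call[1])
print(method, url)
```

In the Iceberg REST specification, the load-table response carries a `metadata-location` pointer plus a `config` map, which is where scoped storage credentials would be returned. The engine then reads the metadata and data files from object storage itself.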
Use Cases:
- Single open catalog across heterogeneous engines (Snowflake + Spark + Trino) on the same Iceberg tables.
- Replacing Hive Metastore or AWS Glue with a vendor-neutral, RBAC-aware catalog.
- Multi-tenant data platforms that need workspace isolation and auditable access.
- Organizations standardizing on Iceberg for their open lakehouse foundation.