Unity Catalog
Overview
Unity Catalog is Databricks' unified governance solution for the data assets across your organization. It provides centralized access control, data discovery, data lineage tracking, and audit logging in a single place, so permissions and metadata are defined once rather than separately in each workspace.
The 3-Level Namespace
Unity Catalog uses a hierarchical namespace structure to organize all data assets:
- Catalog — Top-level container, typically represents a business domain or environment (e.g., finance_prod, marketing_dev)
- Schema — Logical grouping within a catalog, organizes related tables and objects (e.g., transactions, dimensions)
- Table/Object — The actual data asset (tables, views, functions, volumes, models)
Fully qualified name: catalog.schema.table (e.g., finance_prod.transactions.orders)
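As a small illustration of the three-level name, the following Python sketch builds and splits names of the form catalog.schema.table. The helper names (quote_ident, qualified_name, split_qualified_name) are hypothetical and not part of any Databricks library. The sketch assumes backtick quoting for identifiers that are not plain words, and the splitter only handles unquoted names.

```python
import re
from typing import Tuple

# Plain identifiers (letters, digits, underscores) need no quoting.
_SIMPLE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_ident(name: str) -> str:
    """Backtick-quote an identifier unless it is a plain word."""
    if _SIMPLE.match(name):
        return name
    # Assumption: an embedded backtick is escaped by doubling it.
    return "`" + name.replace("`", "``") + "`"


def qualified_name(catalog: str, schema: str, obj: str) -> str:
    """Join the three namespace levels into catalog.schema.object."""
    return ".".join(quote_ident(part) for part in (catalog, schema, obj))


def split_qualified_name(name: str) -> Tuple[str, str, str]:
    """Split an unquoted three-level name into its parts."""
    parts = name.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected catalog.schema.object, got {name!r}")
    return parts[0], parts[1], parts[2]


print(qualified_name("finance_prod", "transactions", "orders"))
```

A name that is missing a level (for example transactions.orders) is rejected by split_qualified_name. This is a deliberate choice: the sketch always works with full names and does not rely on a default catalog or schema.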
Key Benefits
- Centralized Access Control — Manage permissions hierarchically across all workspaces from a single place, eliminating workspace-level silos
- Data Lineage — Automatic tracking of table-to-table and column-level dependencies for impact analysis and compliance
- Data Discovery — Built-in search, tagging, and asset descriptions enable self-service data exploration across your organization
- Audit Logging — Comprehensive logs of all data access, modifications, and permission changes for regulatory compliance and security investigations
- Cross-Workspace Governance — Single source of truth for metadata and permissions across all workspaces attached to the same metastore (a metastore serves one region, so workspaces in other regions use their own metastore)
- Secure Collaboration — Fine-grained permissions enable secure data sharing with teams, partners, and service principals without copying data
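To make hierarchical access control concrete, here is a toy Python model, not the Unity Catalog implementation. It assumes that a privilege granted on a catalog or schema also applies to the objects beneath it, and that reading a table needs USE CATALOG, USE SCHEMA, and SELECT. The Grants type and the function names are hypothetical, and object names are assumed to contain no dots.

```python
from typing import Dict, Iterator, Set, Tuple

# Maps (principal, securable name) to the privileges granted there.
Grants = Dict[Tuple[str, str], Set[str]]


def ancestors(securable: str) -> Iterator[str]:
    """Yield the securable and each parent: a.b.c, then a.b, then a."""
    parts = securable.split(".")
    for i in range(len(parts), 0, -1):
        yield ".".join(parts[:i])


def has_privilege(grants: Grants, principal: str, privilege: str, securable: str) -> bool:
    """True if the privilege is granted on the securable or any parent."""
    return any(
        privilege in grants.get((principal, level), set())
        for level in ancestors(securable)
    )


def can_select(grants: Grants, principal: str, table: str) -> bool:
    """Reading a table needs usage on its containers plus SELECT."""
    catalog, schema, _ = table.split(".")
    return (
        has_privilege(grants, principal, "USE CATALOG", catalog)
        and has_privilege(grants, principal, "USE SCHEMA", f"{catalog}.{schema}")
        and has_privilege(grants, principal, "SELECT", table)
    )


grants: Grants = {
    ("analysts", "finance_prod"): {"USE CATALOG", "USE SCHEMA", "SELECT"},
    ("interns", "finance_prod.transactions.orders"): {"SELECT"},
}
print(can_select(grants, "analysts", "finance_prod.transactions.orders"))
print(can_select(grants, "interns", "finance_prod.transactions.orders"))
```

In this model the analysts group can read every table in finance_prod through a single catalog-level grant. The interns group holds SELECT on one table but cannot read it, because it lacks usage privileges on the enclosing catalog and schema.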
Core Components
- Metastore — Cloud-backed metadata store (one per region) that manages all Unity Catalog objects and permissions
- Workspace Assignment — Each workspace connects to a single metastore; all workspaces using the same metastore share the same catalogs and permissions
- External Locations — References to cloud storage paths with associated credentials, used for creating external tables
- Storage Credentials — Cloud IAM credentials (AWS IAM roles, Azure service principals, GCP service accounts) that Unity Catalog uses to access external data
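The relationship between storage credentials, external locations, and external tables can be sketched as the SQL that ties them together. The Python helpers below only build statement strings; the helper names, the bucket URL, and the credential name are hypothetical. The sketch assumes the storage credential already exists and that the target path holds Delta data.

```python
def _sql_string(value: str) -> str:
    """Wrap a value in single quotes, refusing characters that need escaping."""
    if "'" in value or "\\" in value:
        raise ValueError(f"unsupported character in {value!r}")
    return f"'{value}'"


def create_external_location_sql(name: str, url: str, credential: str) -> str:
    """Build a statement that binds a cloud path to a storage credential."""
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS `{name}` "
        f"URL {_sql_string(url)} "
        f"WITH (STORAGE CREDENTIAL `{credential}`)"
    )


def create_external_table_sql(table: str, path: str) -> str:
    """Build a statement that registers existing Delta data as an external table."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table} "
        f"USING DELTA LOCATION {_sql_string(path)}"
    )


# Hypothetical names, for illustration only.
print(create_external_location_sql(
    "finance_landing", "s3://example-bucket/finance", "finance_role_cred"))
print(create_external_table_sql(
    "finance_prod.transactions.orders_ext", "s3://example-bucket/finance/orders"))
```

The table path sits under the external location's URL, which is how Unity Catalog decides which credential to use when the table is read.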
Getting Started
To begin using Unity Catalog:
- Enable Unity Catalog at the account level (requires an account admin)
- Create a metastore in your cloud region
- Assign the metastore to your workspace(s)
- Create catalogs and schemas to organize data
- Create tables or register existing tables with the catalog
- Configure permissions by granting privileges to users, groups, and service principals
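The last three steps (catalogs and schemas, tables, permissions) can be expressed in SQL once a metastore is assigned to the workspace. The earlier steps are done in the account console and are not covered here. The Python sketch below only generates the statements. The function name bootstrap_statements, the column list, and the group name are hypothetical, and names are assumed to be plain identifiers.

```python
from typing import List


def bootstrap_statements(catalog: str, schema: str, table: str,
                         columns: str, group: str) -> List[str]:
    """Return SQL that creates one catalog/schema/table and grants read access."""
    fq_schema = f"{catalog}.{schema}"
    fq_table = f"{fq_schema}.{table}"
    return [
        f"CREATE CATALOG IF NOT EXISTS {catalog}",
        f"CREATE SCHEMA IF NOT EXISTS {fq_schema}",
        f"CREATE TABLE IF NOT EXISTS {fq_table} ({columns})",
        # Readers need usage on the containers as well as SELECT on the table.
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{group}`",
        f"GRANT USE SCHEMA ON SCHEMA {fq_schema} TO `{group}`",
        f"GRANT SELECT ON TABLE {fq_table} TO `{group}`",
    ]


statements = bootstrap_statements(
    "finance_prod", "transactions", "orders",
    "order_id BIGINT, amount DECIMAL(18, 2)", "analysts",
)
for stmt in statements:
    # In a Databricks notebook this could be spark.sql(stmt); here we only print.
    print(stmt + ";")
```

Granting to a group rather than to individual users keeps the statements stable as team membership changes.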