Databricks Mosaic AI
Mosaic AI is Databricks' generative AI platform — the integrated set of services for building, evaluating, serving, and governing GenAI applications directly on the Lakehouse. Rather than stitching together a vector database, an LLM gateway, an evaluation harness, and a separate orchestration framework, Mosaic AI ships these as cohesive products that share Unity Catalog identity, lineage, and access control.
The core building blocks are:
- Vector Search — managed similarity search built on Delta tables, with automatic sync from source data and UC-governed retrieval.
- Model Serving — low-latency REST endpoints for foundation models, fine-tuned models, and custom Python models, with autoscaling and request logging into Inference Tables.
- Foundation Model APIs — pay-per-token access to Llama 3, DBRX, Mixtral, Claude, and embedding models without provisioning capacity.
- Agent Framework — an opinionated SDK and evaluation harness for building, tracing, and shipping LLM agents that run as production endpoints.
- AI Playground — a workspace-native chat UI for prompt iteration, model comparison, and quick prototyping that exports straight to notebook code.
- AI Functions — SQL functions like ai_query, ai_classify, and ai_summarize that call serving endpoints from inside warehouse queries.
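Whether reached through AI Functions, the Playground, or client code, a serving endpoint is ultimately queried over REST at `POST /serving-endpoints/{name}/invocations` with a chat-style JSON body. The sketch below builds (but does not send) such a request; the workspace host and endpoint name are placeholders, not values from this document.

```python
import json

# Hypothetical workspace host and endpoint name -- substitute your own.
HOST = "https://my-workspace.cloud.databricks.com"
ENDPOINT = "databricks-meta-llama-3-70b-instruct"


def build_invocation(prompt: str, max_tokens: int = 256) -> tuple[str, dict]:
    """Build the URL and JSON body for a chat-style serving-endpoint query.

    The actual call would be an HTTP POST with a Bearer token for auth;
    only the request construction is shown here.
    """
    url = f"{HOST}/serving-endpoints/{ENDPOINT}/invocations"
    body = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return url, body


url, body = build_invocation("Summarize our Q3 sales notes.")
print(url)
print(json.dumps(body))
```

The same `messages` payload shape is what ai_query forwards when a warehouse query invokes a chat endpoint, which is why SQL and Python clients can share one endpoint.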
Pages in this section
- Mosaic AI Vector Search — managed vector index over Delta tables: index types, embedding options, hybrid search, and the RAG retrieval pattern.
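The retrieval step of the RAG pattern mentioned above can be sketched as a small function. This is a minimal sketch, assuming the index object exposes a `similarity_search(...)` method in the shape of the Databricks Vector Search Python client, that the indexed table has a `chunk_text` column, and that results come back under `result.data_array`; the endpoint and index names in the usage comment are illustrative.

```python
def retrieve_context(index, question: str, k: int = 3) -> list[str]:
    """Query a vector index and return the text of the top-k matching chunks.

    `index` is assumed to expose similarity_search() in the shape of the
    Databricks Vector Search client; column name `chunk_text` is illustrative.
    """
    result = index.similarity_search(
        query_text=question,          # the service embeds the query text
        columns=["chunk_text"],       # columns to return alongside the score
        num_results=k,
    )
    # Matches arrive as rows in result["result"]["data_array"]; the first
    # element of each row is the requested chunk_text column.
    return [row[0] for row in result["result"]["data_array"]]


# Illustrative usage against a real index (names are placeholders):
# from databricks.vector_search.client import VectorSearchClient
# index = VectorSearchClient().get_index(
#     endpoint_name="vs_endpoint", index_name="main.rag.docs_index")
# chunks = retrieve_context(index, "How do I rotate a service credential?")
```

The returned chunks would then be concatenated into the prompt sent to a chat model, which is the generation half of the RAG pattern.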
Coming next
Natural extensions to this section, planned as follow-on pages:
- Agent Framework — building tool-calling agents with mlflow.deployments, ChatAgent, and the Agent Evaluation harness; deploying as a Mosaic AI Agent endpoint.
- Model Serving — provisioned-throughput vs pay-per-token, custom Python model deployment, Inference Tables for request logging, and traffic splitting for A/B tests.