Databricks Mosaic AI
Mosaic AI is Databricks' generative AI platform — the integrated set of services for building, evaluating, serving, and governing GenAI applications directly on the Lakehouse. Rather than stitching together a vector database, an LLM gateway, an evaluation harness, and a separate orchestration framework, Mosaic AI ships these as cohesive products that share Unity Catalog identity, lineage, and access control.
The core building blocks are:
- Vector Search — managed similarity search built on Delta tables, with automatic sync from source data and UC-governed retrieval.
- Model Serving — low-latency REST endpoints for foundation models, fine-tuned models, and custom Python models, with autoscaling and request logging into Inference Tables.
- Foundation Model APIs — pay-per-token access to Llama 3, DBRX, Mixtral, Claude, and embedding models without provisioning capacity.
- Agent Framework — an opinionated SDK and evaluation harness for building, tracing, and shipping LLM agents that run as production endpoints.
- AI Playground — a workspace-native chat UI for prompt iteration, model comparison, and quick prototyping that exports straight to notebook code.
- AI Functions — SQL functions like ai_query, ai_classify, and ai_summarize that call serving endpoints from inside warehouse queries.
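Whether reached through AI Functions, the Playground, or client code, a serving endpoint is ultimately queried over REST at `POST /serving-endpoints/{name}/invocations` with a chat-style JSON body. The sketch below builds (but does not send) such a request; the workspace host and endpoint name are placeholders, not values from this document.

```python
import json

# Hypothetical workspace host and endpoint name -- substitute your own.
HOST = "https://my-workspace.cloud.databricks.com"
ENDPOINT = "databricks-meta-llama-3-70b-instruct"


def build_invocation(prompt: str, max_tokens: int = 256) -> tuple[str, dict]:
    """Build the URL and JSON body for a chat-style serving-endpoint query.

    The actual call would be an HTTP POST with a Bearer token for auth;
    only the request construction is shown here.
    """
    url = f"{HOST}/serving-endpoints/{ENDPOINT}/invocations"
    body = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return url, body


url, body = build_invocation("Summarize our Q3 sales notes.")
print(url)
print(json.dumps(body))
```

The same `messages` payload shape is what ai_query forwards when a warehouse query invokes a chat endpoint, which is why SQL and Python clients can share one endpoint.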
Pages in this section
- Mosaic AI Vector Search — managed vector index over Delta tables: index types, embedding options, hybrid search, and the RAG retrieval pattern.
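The retrieval step of the RAG pattern mentioned above can be sketched as a small function. This is a minimal sketch, assuming the index object exposes a `similarity_search(...)` method in the shape of the Databricks Vector Search Python client, that the indexed table has a `chunk_text` column, and that results come back under `result.data_array`; the endpoint and index names in the usage comment are illustrative.

```python
def retrieve_context(index, question: str, k: int = 3) -> list[str]:
    """Query a vector index and return the text of the top-k matching chunks.

    `index` is assumed to expose similarity_search() in the shape of the
    Databricks Vector Search client; column name `chunk_text` is illustrative.
    """
    result = index.similarity_search(
        query_text=question,          # the service embeds the query text
        columns=["chunk_text"],       # columns to return alongside the score
        num_results=k,
    )
    # Matches arrive as rows in result["result"]["data_array"]; the first
    # element of each row is the requested chunk_text column.
    return [row[0] for row in result["result"]["data_array"]]


# Illustrative usage against a real index (names are placeholders):
# from databricks.vector_search.client import VectorSearchClient
# index = VectorSearchClient().get_index(
#     endpoint_name="vs_endpoint", index_name="main.rag.docs_index")
# chunks = retrieve_context(index, "How do I rotate a service credential?")
```

The returned chunks would then be concatenated into the prompt sent to a chat model, which is the generation half of the RAG pattern.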
Coming next
Natural extensions to this section, planned as follow-on pages:
- Agent Framework — building tool-calling agents with mlflow.deployments, ChatAgent, and the Agent Evaluation harness; deploying as a Mosaic AI Agent endpoint.
- Model Serving — provisioned-throughput vs pay-per-token, custom Python model deployment, Inference Tables for request logging, and traffic splitting for A/B tests.