Snowflake Cortex

Cortex is Snowflake's in-account generative-AI surface: a catalog of foundation-model functions (COMPLETE, EMBED_TEXT, SUMMARIZE, etc.) plus three higher-level managed services that compose them into production assistants.

Cortex Analyst

Text-to-SQL grounded in a YAML semantic model with a verified-query store. Lets business users ask natural-language questions and get accurate SQL answers.

Cortex Search

Managed hybrid (vector + lexical) search service over a Snowflake table. The retrieval layer behind RAG pipelines — no separate vector DB required.

Cortex Agents

Orchestration runtime that composes Cortex Search, Cortex Analyst, and custom HTTP tools into multi-step assistants with citations and tool-use loops.

Pretraining & Fine-Tuning

End-to-end pipeline for adapting foundation models inside Snowflake — data prep, training jobs, model registry, and serving without leaving the account.

Cortex Functions

Catalog of SQL-callable LLM primitives: COMPLETE, EMBED_TEXT, SUMMARIZE, CLASSIFY_TEXT, EXTRACT_ANSWER, TRANSLATE, and more.

Data Preparation

Extraction, cleaning, transformation, and feature engineering inside Snowflake — the foundation for any Cortex training or inference workload.

Model Training

Snowflake ML for tabular models, Cortex fine-tuning for LLMs. Prompt templates, training-data shaping, and the Cortex fine-tuning workflow.

Model Deployment

Serving and access controls: how a fine-tuned model becomes a callable inference service inside Snowflake.

Inference & Monitoring

Calling served models for prediction, plus logging, latency tracking, data-drift detection, and feedback loops for production observability.


Snowflake AI_COMPLETE / Cortex Commands

Snowflake Cortex AI Development Pipeline

Snowflake Cortex provides a platform for building and deploying generative AI models directly within Snowflake. This document outlines a typical development pipeline: data preparation, model training, deployment, inference, and monitoring, all running inside the Snowflake account without moving data elsewhere.

1. Data Preparation

The foundation of any AI model is high-quality data. This phase involves data extraction, cleaning, transformation, and feature engineering within Snowflake.

Sample Code (SQL - Snowflake)

-- Example: extract and clean text data from a Snowflake table
CREATE OR REPLACE TEMPORARY TABLE cleaned_data AS
SELECT
    LOWER(text_column) AS cleaned_text,           -- normalize to lowercase
    CASE
        WHEN LENGTH(text_column) > 1000 THEN NULL -- null out overly long texts
        ELSE text_column
    END AS trimmed_text
FROM
    raw_data_table
WHERE
    text_column IS NOT NULL                       -- drop NULL rows
    AND LENGTH(text_column) > 0;                  -- drop empty strings
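For quick local experimentation before committing these rules to SQL, the same cleaning logic can be sketched in plain Python (the rules mirror the query above; nothing here is Snowflake-specific):

```python
def clean_text(text, max_len=1000):
    """Apply the same rules as the SQL above: drop NULL/empty rows,
    lowercase the text, and null out overly long entries."""
    if text is None or len(text) == 0:
        return None  # corresponds to the WHERE clause filtering
    return {
        "cleaned_text": text.lower(),
        "trimmed_text": None if len(text) > max_len else text,
    }

rows = ["Hello WORLD", "", None, "x" * 2000]
cleaned = [r for r in (clean_text(t) for t in rows) if r is not None]
```

Rows that the SQL's WHERE clause would exclude come back as None and are filtered out; the over-length row survives but with a NULL trimmed_text, exactly as the CASE expression behaves.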

2. Model Training (using Snowflake ML or Cortex Functions)

Model training can be performed with Snowflake ML (for general machine-learning tasks) or, more commonly for generative AI, by fine-tuning a Cortex-hosted LLM.

Sample Code (Python UDF - within Snowflake)

This example uses a simple prompt template. Real-world fine-tuning will require significantly more complex code and datasets.

-- Create a prompt template as a Python UDF
-- (Python UDFs require a RUNTIME_VERSION and a HANDLER naming the entry function)
CREATE OR REPLACE FUNCTION prompt_template(input_text VARCHAR)
RETURNS VARCHAR
LANGUAGE PYTHON
RUNTIME_VERSION = '3.10'
HANDLER = 'build_prompt'
AS
$$
def build_prompt(input_text):
    return f"""You are a helpful assistant. Respond to the user's question:

User: {input_text}
Assistant:"""
$$;

-- Use the prompt template
SELECT prompt_template('What is the capital of France?');

-- To fine-tune a model, the prompt template is applied while shaping the training
-- data. Snowflake exposes fine-tuning through the SNOWFLAKE.CORTEX.FINETUNE
-- function; argument details and the list of supported base models are in the
-- Snowflake documentation.

-- Illustrative example (the training query must yield prompt and completion
-- columns; 'expected_output' is a placeholder for your target column):
SELECT SNOWFLAKE.CORTEX.FINETUNE(
    'CREATE',
    'my_fine_tuned_model',
    'mistral-7b',
    'SELECT prompt_template(input_text) AS prompt,
            expected_output AS completion
     FROM training_data_table'
);
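The prompt template is typically applied while shaping training data into prompt/completion pairs. A local Python sketch of that shaping step (the answer column and the JSONL record shape are assumptions for illustration; check the fine-tuning documentation for the exact format Snowflake expects):

```python
import json

def build_prompt(input_text):
    # Mirrors the prompt_template UDF defined above
    return (
        "You are a helpful assistant. Respond to the user's question:\n\n"
        f"User: {input_text}\nAssistant:"
    )

def to_training_rows(examples):
    """Shape (question, answer) pairs into the prompt/completion
    records that fine-tuning jobs typically expect."""
    return [
        {"prompt": build_prompt(question), "completion": answer}
        for question, answer in examples
    ]

examples = [("What is the capital of France?", "Paris")]
rows = to_training_rows(examples)
jsonl = "\n".join(json.dumps(r) for r in rows)  # one JSON record per line
```

Keeping the prompt construction in one function shared by training-data prep and inference avoids the common failure mode of training on one prompt shape and serving another.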

3. Model Deployment

Once the model is trained, it must be made available for inference requests.

Sample Code (SQL - Snowflake)

-- Cortex fine-tuned models are served by Snowflake itself: once the fine-tuning
-- job completes, the resulting model is callable by name, with no separate
-- endpoint-creation or hardware-allocation step. Access is governed through
-- standard role-based privileges on the model object.

-- Example (assumes the fine-tuned model 'my_fine_tuned_model' exists):
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'my_fine_tuned_model',
    'Translate to French: Hello, world!'
);

4. Model Inference (Prediction)

This stage involves sending prompts to the served model and receiving predictions.

Sample Code (SQL - Snowflake)

-- COMPLETE also accepts a chat-style array of messages plus an options object;
-- in this form it returns a JSON string containing the model's response and
-- usage metadata.

SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'my_fine_tuned_model',
    [{'role': 'user', 'content': 'Translate to French: Hello, world!'}],
    {'temperature': 0.2}
);

-- The result is a JSON string; parse it (e.g. with PARSE_JSON) to extract the
-- generated text.
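Client code often needs to pull the generated text out of that JSON string. A minimal Python sketch, assuming a chat-style response with a `choices` array (the sample payload and field names here are assumptions for illustration; verify the actual response shape against the current Snowflake documentation):

```python
import json

# Hypothetical response string, loosely modeled on chat-style completion APIs
raw = '{"choices": [{"messages": "Bonjour, le monde!"}], "usage": {"total_tokens": 21}}'

def extract_text(raw_response):
    """Parse the JSON string returned by the model and pull out the
    generated text from the first choice."""
    data = json.loads(raw_response)
    return data["choices"][0]["messages"]

text = extract_text(raw)
```

In SQL the equivalent extraction would use PARSE_JSON and path access on the result column.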

5. Model Monitoring & Evaluation

Continuous monitoring is crucial for ensuring model performance and identifying potential issues.

Sample Code (SQL - Snowflake - Monitoring - Illustrative)

-- This is a simplified example - actual monitoring involves more sophisticated techniques
-- to track performance, cost, and data drift.

-- Create a table to log inference requests and responses
CREATE OR REPLACE TABLE cortex_inference_logs (
    request_time TIMESTAMP,
    model_name VARCHAR,
    prompt VARCHAR,
    response VARCHAR
);

-- Log a prompt and the model's response (the model is called once; the prompt
-- text itself, not a second model call, goes in the prompt column)
INSERT INTO cortex_inference_logs (request_time, model_name, prompt, response)
SELECT
    CURRENT_TIMESTAMP(),
    'my_fine_tuned_model',
    'What is the capital of Germany?',
    SNOWFLAKE.CORTEX.COMPLETE(
        'my_fine_tuned_model',
        'What is the capital of Germany?'
    );

-- Analyze the logs to identify trends and potential issues.
-- Example: average response length (a proxy for verbosity and cost; measuring
-- latency requires recording start and end timestamps per request)
SELECT AVG(LENGTH(response)) AS avg_response_length FROM cortex_inference_logs;
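One concrete drift check that can run over the logged responses: compare the response-length distribution of a recent window against a baseline window. A minimal Python sketch (the two-standard-deviation threshold is an arbitrary illustration, not a Snowflake feature; production monitoring would also track token counts, latency, and input distributions):

```python
from statistics import mean, pstdev

def length_drift(baseline_lengths, current_lengths, threshold=2.0):
    """Flag drift when the mean response length of the current window
    deviates from the baseline mean by more than `threshold` baseline
    standard deviations."""
    mu, sigma = mean(baseline_lengths), pstdev(baseline_lengths)
    if sigma == 0:
        return False  # degenerate baseline: no spread to measure against
    z = abs(mean(current_lengths) - mu) / sigma
    return z > threshold

# Baseline week vs. two candidate windows (character counts per response)
baseline = [100, 110, 95, 105, 90]
stable   = [102, 98, 107]
drifted  = [300, 320, 310]
```

The length columns would come from the cortex_inference_logs table above, e.g. via LENGTH(response) grouped by time window.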

Key Considerations

This pipeline gives a foundational view of developing AI models within Snowflake Cortex. Function names, supported base models, and fine-tuning options evolve quickly, so refer to the official Snowflake documentation for current details and advanced features.