Ollama Integration

Add governance to locally hosted Ollama models. Control Zero intercepts Ollama API calls through the Governance Gateway and enforces your dashboard policies on every request.

Overview

Ollama runs open-source LLMs locally. When routed through the Control Zero Governance Gateway, every chat completion and model interaction is governed by your policies and logged for audit.

Ollama uses an OpenAI-compatible wire format, so the integration reuses the same interceptors and policy enforcement as the OpenAI integration.

Setup

1. Configure the Gateway

Enable the Ollama provider in your Governance Gateway configuration:

# Environment variables for the gateway
CZ_GATEWAY_OLLAMA_ENABLED=true
CZ_GATEWAY_OLLAMA_API_URL=http://localhost:11434/v1

2. Route Requests Through the Gateway

Instead of calling Ollama directly, point your client at the gateway's Ollama endpoint:

import openai

# Route through Control Zero gateway instead of direct Ollama
client = openai.OpenAI(
    base_url="http://localhost:8000/ollama/v1",
    api_key="cz_live_your_key_here",
)

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Summarize this quarter's sales data."}],
)
print(response.choices[0].message.content)

3. Define Policies

Create policies in the dashboard to govern Ollama usage:

{
  "name": "ollama-production-rules",
  "rules": [
    {
      "effect": "allow",
      "action": "llm:chat.completions",
      "resource": "ollama/*",
      "conditions": { "model": ["llama3", "mistral"] }
    },
    {
      "effect": "deny",
      "action": "llm:chat.completions",
      "resource": "ollama/*",
      "conditions": { "model": ["*-uncensored"] }
    }
  ]
}
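
To make the allow/deny semantics concrete, here is a minimal sketch of how a gateway might evaluate a request's model name against rules like these. The `evaluate` helper, the deny-overrides ordering, and the default-deny fallback are illustrative assumptions, not Control Zero's actual policy engine.

```python
from fnmatch import fnmatch

# Illustrative rules mirroring the policy above.
RULES = [
    {"effect": "allow", "conditions": {"model": ["llama3", "mistral"]}},
    {"effect": "deny", "conditions": {"model": ["*-uncensored"]}},
]

def evaluate(model: str, rules=RULES) -> str:
    """Return 'allow' or 'deny' for a model name.

    Assumptions: a deny rule overrides any matching allow rule,
    and a request that matches no rule is denied by default.
    """
    decision = "deny"
    for rule in rules:
        patterns = rule["conditions"]["model"]
        if any(fnmatch(model, p) for p in patterns):
            if rule["effect"] == "deny":
                return "deny"  # deny overrides any earlier allow
            decision = "allow"
    return decision

print(evaluate("llama3"))             # allow
print(evaluate("llama2-uncensored"))  # deny
```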

What Gets Governed

API Endpoint                        Policy Action          Description
POST /ollama/v1/chat/completions    llm:chat.completions   Chat completions (streaming and non-streaming)
POST /ollama/v1/completions         llm:completions        Text completions
GET /ollama/v1/models               llm:models.list        Model listing

Streaming Support

The gateway fully supports streaming responses from Ollama. Server-sent events are forwarded transparently while policy enforcement and audit logging occur on the initial request.

Audit Logging

All Ollama requests routed through the gateway are logged in the audit trail with the provider identified as ollama. View them in the dashboard under Audit Log with the provider filter.

Tool Calling

Tool-calling via the OpenAI-compatible endpoint is supported for llama3.1+, mistral-nemo, and other tool-capable Ollama models. The gateway passes tool_calls through transparently, and every tool call is evaluated against your policies just like any other provider.
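
A sketch of a tool-calling request through the gateway, using the standard OpenAI tools schema. The `get_quarter_sales` tool and its parameters are hypothetical examples, not part of Control Zero or Ollama.

```python
# Hypothetical tool definition in the standard OpenAI tools schema.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_quarter_sales",
            "description": "Fetch sales totals for a fiscal quarter.",
            "parameters": {
                "type": "object",
                "properties": {
                    "quarter": {"type": "string", "description": "e.g. 2024-Q3"},
                },
                "required": ["quarter"],
            },
        },
    }
]

def request_with_tools():
    import openai  # deferred import so the schema above stays importable anywhere

    client = openai.OpenAI(
        base_url="http://localhost:8000/ollama/v1",
        api_key="cz_live_your_key_here",
    )
    response = client.chat.completions.create(
        model="llama3.1",  # a tool-capable Ollama model
        messages=[{"role": "user", "content": "What were Q3 sales?"}],
        tools=TOOLS,
    )
    # Each tool call the model emits is evaluated against your policies
    # by the gateway before it reaches your application.
    return response.choices[0].message.tool_calls
```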

Limitations

  • The gateway proxies to a single Ollama instance per configuration. For multiple Ollama hosts, run multiple gateway instances.