Models

Models are the AI brains behind your agents. docker-agent supports multiple providers and flexible configuration.

Inline vs. Named Models

There are two ways to assign a model to an agent:

Inline (Quick)

Use the provider/model shorthand directly in the agent definition:

agents:
  root:
    model: openai/gpt-5
    instruction: You are a helpful assistant.

Named (Full Control)

Define models in a models section and reference them by name:

models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-5
    max_tokens: 64000
    temperature: 0.7

agents:
  root:
    model: claude
    instruction: You are a helpful assistant.

Named models let you configure temperature, token limits, thinking budgets, and other parameters. They're also reusable across multiple agents.
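As a rough illustration of reuse, the sketch below points two agents at the same named model. The agent name `reviewer` and its instruction are hypothetical; only the `models`/`agents` layout and the keys shown come from the example above.

```yaml
models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-5
    max_tokens: 64000
    temperature: 0.7

agents:
  root:
    model: claude  # references the named model defined above
    instruction: You are a helpful assistant.
  reviewer:  # hypothetical second agent, for illustration only
    model: claude  # same named model, so it shares the same settings
    instruction: You review answers for accuracy.
```

With this layout, a change to `temperature` or `max_tokens` under `claude` should apply to both agents without editing either agent definition.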

First Available Models

A named model can also select the first usable model from a priority list. This is useful for shared configs that should prefer paid cloud models when their API keys are present, but still work with a local fallback:

models:
  smart:
    first_available:
      - anthropic/claude-sonnet-4-5
      - openai/gpt-5
      - dmr/ai/qwen3

agents:
  root:
    model: smart
    instruction: You are a helpful assistant.

At load time, docker-agent selects the first candidate whose credentials are configured. You only need credentials for one candidate. See Model Configuration for details.
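To make the selection rule concrete, here is the same list annotated with the outcome one would expect in a few scenarios. The comments are an interpretation of the rule above combined with the environment variables listed under Supported Providers, not documented output.

```yaml
models:
  smart:
    first_available:
      # Expected pick when ANTHROPIC_API_KEY is set.
      - anthropic/claude-sonnet-4-5
      # Expected pick when only OPENAI_API_KEY is set.
      - openai/gpt-5
      # Expected fallback when neither key is set, assuming Docker Model
      # Runner is available locally (it needs no API key).
      - dmr/ai/qwen3
```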

Supported Providers

| Provider | Key | Example Models | API Key Env Var |
| --- | --- | --- | --- |
| OpenAI | openai | gpt-5, gpt-5-mini, gpt-4o | OPENAI_API_KEY |
| Anthropic | anthropic | claude-sonnet-4-5, claude-opus-4-7 | ANTHROPIC_API_KEY |
| Google | google | gemini-3.5-flash, gemini-3-pro | GOOGLE_API_KEY / GEMINI_API_KEY |
| AWS Bedrock | amazon-bedrock | Claude, Nova, Llama models | AWS credentials |
| Docker Model Runner | dmr | ai/qwen3, ai/llama3.2 | None (local) |
| Mistral | mistral | Mistral models | MISTRAL_API_KEY |
| xAI | xai | Grok models | XAI_API_KEY |
| Nebius | nebius | Open-source and specialised models | NEBIUS_API_KEY |
| MiniMax | minimax | MiniMax models | MINIMAX_API_KEY |
| Baseten | baseten | DeepSeek, Kimi, GLM, Llama models | BASETEN_API_KEY |
| OVHcloud | ovhcloud | Qwen, Llama, Mistral, DeepSeek (EU-hosted) | OVH_AI_ENDPOINTS_ACCESS_TOKEN |
| Groq | groq | Llama, Qwen, GPT-OSS (fast inference) | GROQ_API_KEY |
| Fireworks AI | fireworks | Kimi, Llama, Qwen, DeepSeek, GLM (open models) | FIREWORKS_API_KEY |
| DeepSeek | deepseek | DeepSeek-V3 chat and R1 reasoner | DEEPSEEK_API_KEY |
| Cerebras | cerebras | GPT-OSS, GLM (fast inference) | CEREBRAS_API_KEY |
| Together AI | together | Llama, Qwen, DeepSeek, Kimi (open models) | TOGETHER_API_KEY |
| Hugging Face | huggingface | Llama, Qwen, DeepSeek, GLM (open models) | HF_TOKEN |
| Cloudflare Workers AI | cloudflare-workers-ai | Llama, Mistral, Qwen, Gemma (edge-hosted open models) | CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID |
| Moonshot AI | moonshot | Kimi K2 chat, reasoning, and coding models | MOONSHOT_API_KEY |
| Vercel AI Gateway | vercel | Multi-provider gateway | AI_GATEWAY_API_KEY |
| Cloudflare AI Gateway | cloudflare-ai-gateway | Multi-provider gateway | CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID + CLOUDFLARE_GATEWAY_ID |
| Requesty | requesty | Multi-provider gateway | REQUESTY_API_KEY |
| OpenRouter | openrouter | Multi-provider gateway | OPENROUTER_API_KEY |
| Azure OpenAI | azure | gpt-4o, gpt-5 on Azure | AZURE_API_KEY + base_url |
| Ollama | ollama | Any local Ollama model | None (local; optional base_url) |
| GitHub Copilot | github-copilot | Copilot-hosted OpenAI/Anthropic | GITHUB_TOKEN (PAT with copilot) |

See the Model Providers section for detailed configuration guides.

Model Properties

| Property | Type | Description |
| --- | --- | --- |
| provider | string | Provider identifier (required) |
| model | string | Model name (required) |
| temperature | float | Randomness: 0.0 (deterministic) to 1.0 (creative) |
| max_tokens | int | Maximum response length |
| top_p | float | Nucleus sampling: 0.0 to 1.0 |
| frequency_penalty | float | Reduce repetition: 0.0 to 2.0 |
| presence_penalty | float | Encourage topic diversity: 0.0 to 2.0 |
| base_url | string | Custom API endpoint |
| thinking_budget | string/int | Reasoning effort configuration |
| task_budget | int/object | Total token budget for an agentic task (Anthropic; honored by Opus 4.7 today) |
| provider_opts | object | Provider-specific options |
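Several of these properties can be combined in a single named model. The following is a minimal sketch: the model names `precise` and `local` and all numeric values are illustrative, and the Ollama `base_url` is an assumption based on Ollama's default local port, so adjust it for your setup.

```yaml
models:
  precise:  # hypothetical name
    provider: openai
    model: gpt-4o
    temperature: 0.2        # closer to deterministic output
    max_tokens: 4096        # cap on response length
    top_p: 0.9              # nucleus sampling threshold
    frequency_penalty: 0.5  # discourage repeated phrasing
    presence_penalty: 0.3   # nudge toward new topics

  local:  # hypothetical name
    provider: ollama
    model: llama3.2
    base_url: http://localhost:11434/v1  # assumed endpoint; check your Ollama setup
```

Not every provider or model necessarily honors every sampling parameter, so treat the values as a starting point and not as recommended settings.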

Reasoning / Thinking Budget

Control how much the model "thinks" before responding:

| Provider | Format | Values | Default |
| --- | --- | --- | --- |
| OpenAI | string | minimal, low, medium, high, xhigh | medium (always-reasoning models only) |
| Anthropic | int or string | 1024–32768 tokens, or adaptive, adaptive/<effort>, effort level | off |
| Gemini 2.5 | int | 0 (off), -1 (dynamic), or token count | -1 (dynamic) |
| Gemini 3 | string | minimal, low, medium, high | varies |
| All | string/int | none or 0 to disable | |

models:
  deep-thinker:
    provider: anthropic
    model: claude-sonnet-4-5
    thinking_budget: 16384

  fast-responder:
    provider: openai
    model: gpt-5
    thinking_budget: none # disable thinking
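The table also covers formats that the example above does not show. As a hedged sketch, the entries below use the Anthropic `adaptive` value and the two Gemini formats. The model names `adaptive-thinker`, `gemini-dynamic`, and `gemini-low` are hypothetical, `gemini-2.5-flash` is assumed to be an accepted model identifier, and whether a given model supports a given format should be checked against the provider guides.

```yaml
models:
  adaptive-thinker:  # hypothetical name
    provider: anthropic
    model: claude-opus-4-7
    thinking_budget: adaptive  # string form instead of a token count

  gemini-dynamic:  # hypothetical name
    provider: google
    model: gemini-2.5-flash  # assumed identifier for a Gemini 2.5 model
    thinking_budget: -1  # dynamic budget (Gemini 2.5 integer format)

  gemini-low:  # hypothetical name
    provider: google
    model: gemini-3-pro
    thinking_budget: low  # effort level (Gemini 3 string format)
```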

Note

Multi-provider teams

Different agents can use different providers in the same config. See Multi-Agent for patterns.
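A minimal sketch of such a config, using only the inline shorthand shown earlier, might look like this. The agent name `summarizer` and its instruction are hypothetical, and how the agents hand work to each other is not shown here; that is covered in the Multi-Agent documentation.

```yaml
agents:
  root:
    model: anthropic/claude-sonnet-4-5  # needs ANTHROPIC_API_KEY
    instruction: You are a helpful assistant.
  summarizer:  # hypothetical agent name
    model: openai/gpt-5-mini  # needs OPENAI_API_KEY
    instruction: You write concise summaries.
```

Each provider used in the file needs its own credentials, as listed in the Supported Providers table.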

Alloy Models

"Alloy models" let you use more than one model in the same conversation. List the models as a comma-separated value in the model field, and docker-agent alternates between them to leverage the strengths of each:

agents:
  root:
    model: anthropic/claude-sonnet-4-5,openai/gpt-5
    instruction: You are a helpful assistant.

Read more about the alloy model concept at xbow.com/blog/alloy-agents.