Share feedback
Answers are generated based on the documentation.

Anthropic

Use Claude Sonnet 4, Claude Sonnet 4.5, and other Anthropic models with docker-agent.

Setup

# Set your API key
export ANTHROPIC_API_KEY="sk-ant-..."

Workload Identity Federation (no API key)

Authenticate with short-lived tokens minted from your own OIDC identity provider instead of a long-lived API key. See Anthropic's Workload Identity Federation guide to provision a Federation Rule, then configure docker-agent with a typed auth: block:

providers:
  anthropic-wif:
    provider: anthropic
    auth:
      type: workload_identity_federation
      workload_identity_federation:
        federation_rule_id: fdrl_REPLACE_ME
        organization_id: 00000000-0000-0000-0000-000000000000
        # Optional: only required for target_type=SERVICE_ACCOUNT rules.
        service_account_id: svac_REPLACE_ME
        identity_token:
          # Pick exactly one of: file, env, command, url
          file: /var/run/secrets/anthropic.com/token

models:
  claude:
    provider: anthropic-wif
    model: claude-sonnet-4-5

identity_token accepts four mutually exclusive sources:

SourceWhen to use
fileKubernetes projected service-account tokens, SPIFFE/SPIRE helpers, Vault sidecars — anything that rotates a file on disk
envThe token is already exported in an environment variable
commandShell out to a CLI on every refresh (gcloud auth print-identity-token, az account get-access-token, ...)
urlFetch from an HTTP(S) endpoint (cloud metadata servers, GitHub Actions OIDC token URL, ...)

For url, both the URL and any header values support ${env.VAR} expansion against the runtime environment (the legacy ${VAR} form is also accepted), which lets you wire the GitHub Actions OIDC token endpoint without putting secrets in YAML:

identity_token:
  url: ${env.ACTIONS_ID_TOKEN_REQUEST_URL}&audience=https://api.anthropic.com
  headers:
    Authorization: bearer ${env.ACTIONS_ID_TOKEN_REQUEST_TOKEN}
  response_field: value

auth: is mutually exclusive with --gateway. Token-refresh failures are surfaced through the normal error path with a clear anthropic workload identity federation: failed to refresh identity token from <kind> source (federation_rule=fdrl_…): ... message in the TUI.

A complete walkthrough of all four sources lives in examples/anthropic_wif.yaml.

Configuration

Inline

agents:
  root:
    model: anthropic/claude-sonnet-4-5

Named Model

models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-5
    max_tokens: 64000

Available Models

Model IDDescription
claude-opus-4-7Highest-capability Opus model; supports task budget
claude-sonnet-4-5Most capable Sonnet; supports extended thinking
claude-sonnet-4-0Previous Sonnet generation, still supported
claude-haiku-4-5Fast and inexpensive, good for tight loops

Thinking Budget

Anthropic accepts either an integer token budget or a string effort value. Thinking is off unless you set thinking_budget; when set, interleaved thinking is auto-enabled.

Token budget (1024–32768; works on all extended-thinking Claude models):

models:
  claude-deep:
    provider: anthropic
    model: claude-sonnet-4-5
    thinking_budget: 16384 # must be < max_tokens

Adaptive / effort-based (Claude Opus 4.6+, Sonnet 4.6 — every string value is sent as adaptive thinking via output_config.effort):

models:
  opus-adaptive:
    provider: anthropic
    model: claude-opus-4-6
    thinking_budget: adaptive # model decides effort (defaults to high)

  opus-effort:
    provider: anthropic
    model: claude-opus-4-6
    thinking_budget: high # low | medium | high | xhigh | max (same as adaptive/<effort>)

On models that reject token-based thinking (Opus 4.6, 4.7, 4.8, Sonnet 4.6), an integer budget is automatically coerced to adaptive with a logged warning. See the Thinking / Reasoning guide for the full cross-provider reference.

Interleaved Thinking

Auto-enabled whenever a thinking budget is configured on a Claude model. Allows tool calls during model reasoning for more integrated problem-solving:

models:
  claude:
    provider: anthropic
    model: claude-sonnet-4-5
    provider_opts:
      interleaved_thinking: false # disable if needed

Task Budget

task_budget caps the total number of tokens the model may spend across a multi-step agentic task — combined thinking, tool calls, and final output. It is forwarded as output_config.task_budget and is ideal for letting long-running agents self-regulate effort without tightening max_tokens on every call.

docker-agent automatically attaches the required task-budgets-2026-03-13 beta header whenever this field is set. You can configure task_budget on any Claude model — docker-agent never gates it by model name. At the time of writing, only Claude Opus 4.7 actually honors the field; other Claude models (Sonnet 4.5, Opus 4.5 / 4.6, etc.) are expected to reject requests that include it. Check the Anthropic release notes linked above for the current list of supported models.

models:
  opus:
    provider: anthropic
    model: claude-opus-4-7
    task_budget: 128000 # integer shorthand → { type: tokens, total: 128000 }
    thinking_budget: adaptive

Object form (forward-compatible with future budget types):

  opus:
    provider: anthropic
    model: claude-opus-4-7
    task_budget:
      type: tokens
      total: 128000

See the full schema on the Model Configuration page.

Server-Side Fallbacks

When the primary model refuses a request (e.g. Claude Fable 5's safety classifiers ending the turn with stop reason refusal), Anthropic can retry the request with backup models in a single round trip. Set fallbacks in provider_opts to a list of model IDs, in priority order:

models:
  fable:
    provider: anthropic
    model: claude-fable-5
    provider_opts:
      fallbacks:
        - claude-opus-4-8
        - claude-sonnet-4-6

docker-agent automatically attaches the required server-side-fallback-2026-06-01 beta header and forwards the option as fallbacks: [{"model": "..."}]. The response's model field reports which model actually served the request.

Fallback models receive the exact same request as the primary model (thinking configuration, task budget, beta features, ...), so list only models that accept the same request shape. Not available on Bedrock, Vertex AI, or the Message Batches API.

Thinking Display

Controls whether thinking blocks are returned in responses when thinking is enabled. Newer Claude models (Opus 4.7+, Fable 5) hide thinking content by default (omitted); docker-agent counters this by requesting summarized thinking whenever an adaptive/effort-based budget is used without an explicit thinking_display, so reasoning stays visible in the UI. Set thinking_display in provider_opts to override:

models:
  claude-opus-4-7:
    provider: anthropic
    model: claude-opus-4-7
    thinking_budget: adaptive
    provider_opts:
      thinking_display: omitted # "summarized", "display", or "omitted"

Valid values:

  • summarized: thinking blocks are returned with summarized thinking text (docker-agent's default for adaptive/effort-based budgets).
  • display: thinking blocks are returned for display.
  • omitted: thinking blocks are returned with an empty thinking field; the signature is still returned for multi-turn continuity. Useful to reduce time-to-first-text-token when streaming.

Note: thinking_display applies to both thinking_budget with token counts and adaptive/effort-based budgets. For token-count budgets no default is applied (the API already defaults to summarized). Full thinking tokens are billed regardless of the thinking_display value.

Note

Anthropic thinking budget values below 1024 or greater than or equal to max_tokens are ignored (a warning is logged).