Share feedback
Answers are generated based on the documentation.

AWS Bedrock

Access Claude, Nova, Llama, and more through AWS infrastructure with enterprise-grade security and compliance.

Prerequisites

  • AWS account with Bedrock enabled in your region
  • Model access granted in the Bedrock Console (some models require approval)
  • AWS credentials configured (see authentication below)

Configuration

models:
  bedrock-claude:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    max_tokens: 64000
    provider_opts:
      region: us-east-1

Authentication

Option 1: Bedrock API Key (Simplest)

export AWS_BEARER_TOKEN_BEDROCK="your-key"
models:
  bedrock:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    token_key: AWS_BEARER_TOKEN_BEDROCK # env var name
    provider_opts:
      region: us-east-1

Option 2: AWS Credentials (Default)

Uses the standard AWS SDK credential chain: env vars → shared credentials → config → IAM roles.

models:
  bedrock:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    provider_opts:
      profile: my-aws-profile
      region: us-east-1

With IAM Role Assumption

models:
  bedrock:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    provider_opts:
      role_arn: "arn:aws:iam::123456789012:role/BedrockAccessRole"
      external_id: "my-external-id"

Provider Options

OptionTypeDefaultDescription
regionstringus-east-1AWS region
profilestringAWS profile name
role_arnstringIAM role ARN for assume role
role_session_namestringdocker-agent-bedrock-sessionSession name for assumed role
external_idstringExternal ID for role assumption
endpoint_urlstringCustom endpoint (VPC/testing)
interleaved_thinkingboolautoAllow reasoning between tool calls (Claude); auto-enabled when a thinking budget is set on a Claude model; adds the required beta header automatically
disable_prompt_cachingboolfalseDisable automatic prompt caching

Inference Profiles

Use inference profile prefixes for optimal routing:

PrefixRoutes To
global.All commercial AWS regions (recommended)
us.US regions only
eu.EU regions only (GDPR compliance)
apac.Asia Pacific regions only
Tip

Inference profiles

Use global. prefix on model IDs for automatic cross-region routing. Use eu. prefix for GDPR compliance.

Thinking Budget (Claude on Bedrock)

Bedrock Claude models support extended thinking — an internal reasoning phase before the model produces its response. Set thinking_budget to a token count (1024–32768) or an effort level string that maps automatically:

Effort levelToken budget
minimal1,024
low2,048
medium8,192
high16,384
xhigh/max32,768
models:
  bedrock-claude-thinking:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    thinking_budget: 8192   # tokens, or use an effort string like "medium"
    max_tokens: 16384       # must be > thinking_budget
    provider_opts:
      region: us-east-1

thinking_budget must be ≥ 1024 and less than max_tokens. Values outside this range are logged as a warning and ignored.

Adaptive thinking (Opus 4.6+)

Newer Claude Opus models (4.6, 4.7, 4.8) reject token-based thinking — Bedrock returns a ValidationException asking for thinking.type=adaptive. For these models, use adaptive thinking:

models:
  bedrock-opus-adaptive:
    provider: amazon-bedrock
    model: global.anthropic.claude-opus-4-8
    thinking_budget: adaptive/high   # adaptive | adaptive/low | adaptive/medium | adaptive/high | adaptive/xhigh | adaptive/max
    provider_opts:
      region: us-east-1

docker-agent recognizes these models (including Bedrock-style IDs) and transparently coerces a configured token budget or effort level to adaptive thinking, logging a warning — so thinking_budget: 32768 on Opus 4.8 won't fail, but adaptive or adaptive/<effort> is the recommended configuration. On older models that still use token-based thinking (e.g. Sonnet 4.5), adaptive is forwarded as-is and rejected by Bedrock — use a token count or effort level there instead.

Note

Temperature and top_p

Bedrock Claude suppresses temperature and top_p while extended thinking is active — Anthropic requires temperature=1.0 internally.

Interleaved Thinking (Claude on Bedrock)

Interleaved thinking lets the model reason between tool calls, not just at the start. This is useful for complex agentic tasks. Enable it alongside a thinking budget:

models:
  bedrock-claude-interleaved:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    thinking_budget: high
    provider_opts:
      region: us-east-1
      # interleaved_thinking is auto-enabled when thinking_budget is set

docker-agent auto-enables interleaved_thinking whenever a thinking budget is configured on a Bedrock-hosted Claude model and automatically adds the interleaved-thinking-2025-05-14 beta header. If you set interleaved_thinking: false while a thinking budget is active, a warning is logged because the budget may be ignored by Bedrock without the beta header.

See the Thinking / Reasoning guide for a full cross-provider overview.

Prompt Caching

Automatically enabled for supported models to reduce latency and costs. System prompts, tool definitions, and recent messages are cached with a 5-minute TTL.

# Disable if needed
provider_opts:
  disable_prompt_caching: true