AWS Bedrock

Table of contents

Access Claude, Nova, Llama, and more through AWS infrastructure with enterprise-grade security and compliance.

Prerequisites

AWS account with Bedrock enabled in your region
Model access granted in the Bedrock Console (some models require approval)
AWS credentials configured (see authentication below)

Configuration

models:
  bedrock-claude:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    max_tokens: 64000
    provider_opts:
      region: us-east-1

Authentication

Option 1: Bedrock API Key (Simplest)

export AWS_BEARER_TOKEN_BEDROCK="your-key"

models:
  bedrock:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    token_key: AWS_BEARER_TOKEN_BEDROCK # env var name
    provider_opts:
      region: us-east-1

Option 2: AWS Credentials (Default)

Uses the standard AWS SDK credential chain: env vars → shared credentials → config → IAM roles.

models:
  bedrock:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    provider_opts:
      profile: my-aws-profile
      region: us-east-1

With IAM Role Assumption

models:
  bedrock:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    provider_opts:
      role_arn: "arn:aws:iam::123456789012:role/BedrockAccessRole"
      external_id: "my-external-id"

Provider Options

Option	Type	Default	Description
`region`	string	us-east-1	AWS region
`profile`	string	—	AWS profile name
`role_arn`	string	—	IAM role ARN for assume role
`role_session_name`	string	docker-agent-bedrock-session	Session name for assumed role
`external_id`	string	—	External ID for role assumption
`endpoint_url`	string	—	Custom endpoint (VPC/testing)
`interleaved_thinking`	bool	auto	Allow reasoning between tool calls (Claude); auto-enabled when a thinking budget is set on a Claude model; adds the required beta header automatically
`disable_prompt_caching`	bool	false	Disable automatic prompt caching

Inference Profiles

Use inference profile prefixes for optimal routing:

Prefix	Routes To
`global.`	All commercial AWS regions (recommended)
`us.`	US regions only
`eu.`	EU regions only (GDPR compliance)
`apac.`	Asia Pacific regions only

Tip
Inference profiles
Use global. prefix on model IDs for automatic cross-region routing. Use eu. prefix for GDPR compliance.

Thinking Budget (Claude on Bedrock)

Bedrock Claude models support extended thinking — an internal reasoning phase before the model produces its response. Set thinking_budget to a token count (1024–32768) or an effort level string that maps automatically:

Effort level	Token budget
`minimal`	1,024
`low`	2,048
`medium`	8,192
`high`	16,384
`xhigh`/`max`	32,768

models:
  bedrock-claude-thinking:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    thinking_budget: 8192   # tokens, or use an effort string like "medium"
    max_tokens: 16384       # must be > thinking_budget
    provider_opts:
      region: us-east-1

thinking_budget must be ≥ 1024 and less than max_tokens. Values outside this range are logged as a warning and ignored.

Adaptive thinking (Opus 4.6+)

Newer Claude Opus models (4.6, 4.7, 4.8) reject token-based thinking — Bedrock returns a ValidationException asking for thinking.type=adaptive. For these models, use adaptive thinking:

models:
  bedrock-opus-adaptive:
    provider: amazon-bedrock
    model: global.anthropic.claude-opus-4-8
    thinking_budget: adaptive/high   # adaptive | adaptive/low | adaptive/medium | adaptive/high | adaptive/xhigh | adaptive/max
    provider_opts:
      region: us-east-1

docker-agent recognizes these models (including Bedrock-style IDs) and transparently coerces a configured token budget or effort level to adaptive thinking, logging a warning — so thinking_budget: 32768 on Opus 4.8 won't fail, but adaptive or adaptive/<effort> is the recommended configuration. On older models that still use token-based thinking (e.g. Sonnet 4.5), adaptive is forwarded as-is and rejected by Bedrock — use a token count or effort level there instead.

Note
Temperature and top_p
Bedrock Claude suppresses temperature and top_p while extended thinking is active — Anthropic requires temperature=1.0 internally.

Interleaved Thinking (Claude on Bedrock)

Interleaved thinking lets the model reason between tool calls, not just at the start. This is useful for complex agentic tasks. Enable it alongside a thinking budget:

models:
  bedrock-claude-interleaved:
    provider: amazon-bedrock
    model: global.anthropic.claude-sonnet-4-5-20250929-v1:0
    thinking_budget: high
    provider_opts:
      region: us-east-1
      # interleaved_thinking is auto-enabled when thinking_budget is set

docker-agent auto-enables interleaved_thinking whenever a thinking budget is configured on a Bedrock-hosted Claude model and automatically adds the interleaved-thinking-2025-05-14 beta header. If you set interleaved_thinking: false while a thinking budget is active, a warning is logged because the budget may be ignored by Bedrock without the beta header.

See the Thinking / Reasoning guide for a full cross-provider overview.

Prompt Caching

Automatically enabled for supported models to reduce latency and costs. System prompts, tool definitions, and recent messages are cached with a 5-minute TTL.

# Disable if needed
provider_opts:
  disable_prompt_caching: true

What can I help you with?

AWS Bedrock