OpenAI
Use GPT-4o, GPT-5, GPT-5-mini, and other OpenAI models with docker-agent.
Setup
# Set your API key
export OPENAI_API_KEY="sk-..."
Configuration
Inline
agents:
  root:
    model: openai/gpt-5
Named Model
models:
  gpt:
    provider: openai
    model: gpt-5
    temperature: 0.7
    max_tokens: 4000
Available Models
| Model | Best For |
|---|---|
| gpt-5 | Most capable, complex reasoning |
| gpt-5-mini | Fast, cost-effective, good reasoning |
| gpt-4o | Multimodal, balanced performance |
| gpt-4o-mini | Cheapest, fast for simple tasks |
Find more model names at modelnames.ai or in the official OpenAI docs.
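The inline form shown under Configuration (openai/gpt-5) reads as a provider name and a model name joined by a slash. A minimal sketch of splitting such a reference, assuming that reading is right; parse_model_ref and ModelRef are hypothetical names for illustration and are not part of docker-agent:

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    """Hypothetical holder for the two halves of an inline model reference."""
    provider: str
    model: str


def parse_model_ref(ref: str) -> ModelRef:
    """Split an inline 'provider/model' string on the first slash.

    Assumption: only the inline form is handled here. A bare name that
    points at an entry under `models:` is out of scope and raises.
    """
    provider, sep, model = ref.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {ref!r}")
    return ModelRef(provider, model)


ref = parse_model_ref("openai/gpt-5")
print(ref.provider, ref.model)
```

Splitting on the first slash only keeps any later slashes inside the model name, which is a design choice of this sketch rather than documented behavior.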
Thinking Budget
OpenAI reasoning models (o-series, gpt-5, gpt-5-mini) support extended thinking through the reasoning_effort API parameter. Set thinking_budget to control the effort level:
models:
  gpt-thinker:
    provider: openai
    model: gpt-5-mini
    thinking_budget: high # minimal | low | medium | high | xhigh
Effort levels:
| Level | Description |
|---|---|
| none | Don't request extra reasoning (alias for 0); the API's own default still applies. |
| minimal | Fastest; lightest reasoning pass. |
| low | Quick reasoning for straightforward tasks. |
| medium | Balanced default. |
| high | More thorough; recommended for complex tasks. |
| xhigh | Near-maximum effort; slower but most accurate. |
These are the only values OpenAI accepts. Token counts, max, adaptive, and adaptive/<effort> are rejected with a configuration error at request time. Older models (o1, o3-mini) accept only low, medium, and high.
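The rules above can be checked before a request is sent. The following is a rough sketch of such a pre-flight check, not docker-agent's own validation: check_thinking_budget is a hypothetical helper, matching older models by name prefix is an assumption, and so is treating none as rejected for those older models (the text lists only low, medium, and high for them).

```python
# Effort levels taken from the table above.
EFFORT_LEVELS = ("none", "minimal", "low", "medium", "high", "xhigh")

# Older models named in the text. Matching by prefix is an assumption.
LEGACY_PREFIXES = ("o1", "o3-mini")
LEGACY_LEVELS = ("low", "medium", "high")


def check_thinking_budget(model: str, value: object) -> str:
    """Return the effort level if it is acceptable for this model, else raise."""
    # Token counts (integers) are not valid for OpenAI models.
    if not isinstance(value, str):
        raise ValueError(f"thinking_budget must be an effort level, got {value!r}")
    # Rejects max, adaptive, adaptive/<effort>, and anything else unlisted.
    if value not in EFFORT_LEVELS:
        raise ValueError(f"unsupported effort level {value!r}")
    # Older models accept a narrower set.
    if model.startswith(LEGACY_PREFIXES) and value not in LEGACY_LEVELS:
        raise ValueError(f"{model} accepts only {', '.join(LEGACY_LEVELS)}")
    return value


print(check_thinking_budget("gpt-5-mini", "high"))
```

Running a check like this at config-load time would surface a bad value earlier than the request-time error described above.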
Warning: Hidden reasoning tokens
OpenAI reasoning models always produce hidden reasoning tokens that count against max_tokens, even with thinking_budget: none. docker-agent automatically raises the output-token floor for its internal low-effort calls so that reasoning cannot starve visible text output.
See the Thinking / Reasoning guide for a cross-provider overview.
Tip: Custom endpoints
Use base_url for proxies and OpenAI-compatible services. See Custom Providers for full setup.
Custom Endpoint
Use base_url to connect to OpenAI-compatible APIs:
models:
  custom:
    provider: openai
    model: gpt-5-mini
    base_url: https://your-proxy.example.com/v1
WebSocket Transport
For OpenAI Responses API models (gpt-4.1+, o-series, gpt-5), you can use WebSocket streaming instead of the default SSE (Server-Sent Events):
models:
  fast-gpt:
    provider: openai
    model: gpt-4.1
    provider_opts:
      transport: websocket # Use WebSocket instead of SSE
Benefits
- ~40% faster for workflows with 20+ tool calls
- Persistent connection reduces per-turn overhead
- Server-side caching of connection state
- Automatic fallback to SSE if WebSocket fails
Requirements
- Only works with Responses API models: gpt-4.1+, o1, o3, o4, gpt-5
- NOT compatible with the --models-gateway flag (automatically falls back to SSE when a gateway is configured)
- Requires the OPENAI_API_KEY environment variable
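Taken together, the requirements above amount to a small decision about which transport is actually used. The sketch below models that decision under stated assumptions and is not docker-agent's implementation: effective_transport is a hypothetical function, model support is approximated by name prefix (so "gpt-4.1+" covers only names starting with gpt-4.1), and an unsupported model is assumed to fall back to SSE rather than fail.

```python
import os
from typing import Mapping, Optional

# Prefixes for the Responses API models listed above (assumed matching rule).
RESPONSES_API_PREFIXES = ("gpt-4.1", "o1", "o3", "o4", "gpt-5")


def effective_transport(
    model: str,
    requested: str = "sse",
    gateway: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return 'websocket' or 'sse' for the given settings."""
    env = os.environ if env is None else env
    # SSE is the default unless WebSocket is requested explicitly.
    if requested != "websocket":
        return "sse"
    # A configured models gateway forces the SSE fallback.
    if gateway:
        return "sse"
    # Assumption: unsupported models fall back to SSE instead of erroring.
    if not model.startswith(RESPONSES_API_PREFIXES):
        return "sse"
    # WebSocket transport needs the API key in the environment.
    if not env.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")
    return "websocket"


print(effective_transport("gpt-4.1", "websocket", env={"OPENAI_API_KEY": "sk-..."}))
```

Passing env explicitly keeps the function easy to exercise without touching the real environment; by default it reads os.environ.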
Example
See examples/websocket_transport.yaml for a complete example.