IDE and tool integrations
Docker Model Runner can serve as a local backend for popular AI coding assistants and development tools. This guide shows how to configure common tools to use models running in DMR.
Prerequisites
Before configuring any tool:
- Enable Docker Model Runner in Docker Desktop or Docker Engine.
- Enable TCP host access:
  - Docker Desktop: Enable host-side TCP support in Settings > AI, or run:

    ```console
    $ docker desktop enable model-runner --tcp 12434
    ```

  - Docker Engine: TCP is enabled by default on port 12434.
- Pull a model:

  ```console
  $ docker model pull ai/qwen2.5-coder
  ```
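If you prefer to script this check rather than use curl, the following minimal sketch lists the models the endpoint reports. It uses only the Python standard library and assumes the default TCP port 12434:

```python
# Sanity check: list models exposed by DMR's OpenAI-compatible endpoint.
# Assumes TCP host access is enabled on the default port 12434.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:12434/engines/v1/models") as resp:
    models = json.load(resp)

# Each "id" is the model name to use in the tool configurations below.
for entry in models.get("data", []):
    print(entry["id"])
```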
Cline (VS Code)
Cline is an AI coding assistant for VS Code.
Configuration
- Open VS Code and go to the Cline extension settings.
- Select OpenAI Compatible as the API provider.
- Configure the following settings:
| Setting | Value |
|---|---|
| Base URL | `http://localhost:12434/engines/v1` |
| API Key | `not-needed` (or any placeholder value) |
| Model ID | `ai/qwen2.5-coder` (or your preferred model) |
Important: The base URL must include `/engines/v1` at the end. Do not include a trailing slash.
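To verify these values independently of Cline, you can send a test request with any OpenAI-compatible client. A minimal sketch, assuming the official `openai` Python package is installed:

```python
# Exercises the same settings Cline uses: base URL, placeholder key, model ID.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:12434/engines/v1",  # must end with /engines/v1, no trailing slash
    api_key="not-needed",  # DMR ignores the key; any placeholder works
)

response = client.chat.completions.create(
    model="ai/qwen2.5-coder",
    messages=[{"role": "user", "content": "Write a one-line hello world in Python."}],
)
print(response.choices[0].message.content)
```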
Troubleshooting Cline
If Cline fails to connect:
- Verify DMR is running:

  ```console
  $ docker model status
  ```

- Test the endpoint directly:

  ```console
  $ curl http://localhost:12434/engines/v1/models
  ```

- Check that CORS is configured if running a web-based version:
  - In Docker Desktop Settings > AI, add your origin to CORS Allowed Origins.
Continue (VS Code / JetBrains)
Continue is an open-source AI code assistant that works with VS Code and JetBrains IDEs.
Configuration
Edit your Continue configuration file (`~/.continue/config.json`):

```json
{
  "models": [
    {
      "title": "Docker Model Runner",
      "provider": "openai",
      "model": "ai/qwen2.5-coder",
      "apiBase": "http://localhost:12434/engines/v1",
      "apiKey": "not-needed"
    }
  ]
}
```

Using Ollama provider
Continue also supports the Ollama provider, which works with DMR:
```json
{
  "models": [
    {
      "title": "Docker Model Runner (Ollama)",
      "provider": "ollama",
      "model": "ai/qwen2.5-coder",
      "apiBase": "http://localhost:12434"
    }
  ]
}
```

Cursor
Cursor is an AI-powered code editor.
Configuration
- Open Cursor Settings (Cmd/Ctrl + ,).
- Navigate to Models > OpenAI API Key.
- Configure:

| Setting | Value |
|---|---|
| OpenAI API Key | `not-needed` |
| Override OpenAI Base URL | `http://localhost:12434/engines/v1` |

- In the model drop-down, enter your model name: `ai/qwen2.5-coder`.
Note: Some Cursor features may require models with specific capabilities (e.g., function calling). Use capable models like `ai/qwen2.5-coder` or `ai/llama3.2` for best results.
Zed
Zed is a high-performance code editor with AI features.
Configuration
Edit your Zed settings (`~/.config/zed/settings.json`):

```json
{
  "language_models": {
    "openai": {
      "api_url": "http://localhost:12434/engines/v1",
      "available_models": [
        {
          "name": "ai/qwen2.5-coder",
          "display_name": "Qwen 2.5 Coder (DMR)",
          "max_tokens": 8192
        }
      ]
    }
  }
}
```

Open WebUI
Open WebUI provides a ChatGPT-like interface for local models.
See Open WebUI integration for detailed setup instructions.
Aider
Aider is an AI pair programming tool for the terminal.
Configuration
Set environment variables or use command-line flags:
```bash
export OPENAI_API_BASE=http://localhost:12434/engines/v1
export OPENAI_API_KEY=not-needed

aider --model openai/ai/qwen2.5-coder
```

Or in a single command:

```console
$ aider --openai-api-base http://localhost:12434/engines/v1 \
    --openai-api-key not-needed \
    --model openai/ai/qwen2.5-coder
```
LangChain
Python
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:12434/engines/v1",
    api_key="not-needed",
    model="ai/qwen2.5-coder"
)

response = llm.invoke("Write a hello world function in Python")
print(response.content)
```

JavaScript/TypeScript
```typescript
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  configuration: {
    baseURL: "http://localhost:12434/engines/v1",
  },
  apiKey: "not-needed",
  modelName: "ai/qwen2.5-coder",
});

const response = await model.invoke("Write a hello world function");
console.log(response.content);
```

LlamaIndex
```python
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    api_base="http://localhost:12434/engines/v1",
    api_key="not-needed",
    model="ai/qwen2.5-coder"
)

response = llm.complete("Write a hello world function")
print(response.text)
```

Common issues
"Connection refused" errors
- Ensure Docker Model Runner is enabled and running:

  ```console
  $ docker model status
  ```

- Verify TCP access is enabled:

  ```console
  $ curl http://localhost:12434/engines/v1/models
  ```

- Check if another service is using port 12434.
"Model not found" errors
- Verify the model is pulled:

  ```console
  $ docker model list
  ```

- Use the full model name including the namespace (e.g., `ai/qwen2.5-coder`, not just `qwen2.5-coder`).
Slow responses or timeouts
- On the first request, the model needs to load into memory, so it can take noticeably longer. Subsequent requests are faster.
- Consider using a smaller model or adjusting the context size:

  ```console
  $ docker model configure --context-size 4096 ai/qwen2.5-coder
  ```

- Check available system resources (RAM, GPU memory).
CORS errors (web-based tools)
If using browser-based tools, add the origin to CORS allowed origins:
- Docker Desktop: Settings > AI > CORS Allowed Origins
- Add your tool's URL (e.g., `http://localhost:3000`)
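To confirm an origin was actually added, one option is to request the models endpoint with an `Origin` header and inspect the response headers. A quick sketch using the Python standard library; `http://localhost:3000` is just the example origin from above:

```python
# Check the CORS headers DMR returns for a given browser origin.
import urllib.request

req = urllib.request.Request(
    "http://localhost:12434/engines/v1/models",
    headers={"Origin": "http://localhost:3000"},  # example origin; use your tool's URL
)
with urllib.request.urlopen(req) as resp:
    allowed = resp.headers.get("Access-Control-Allow-Origin")

# If the origin is allowed, the server should echo it (or "*") here.
print("Access-Control-Allow-Origin:", allowed)
```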
Recommended models by use case
| Use case | Recommended model | Notes |
|---|---|---|
| Code completion | `ai/qwen2.5-coder` | Optimized for coding tasks |
| General assistant | `ai/llama3.2` | Good balance of capabilities |
| Small/fast | `ai/smollm2` | Low resource usage |
| Embeddings | `ai/all-minilm` | For RAG and semantic search |
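For the embeddings use case, the same OpenAI-compatible API applies. A minimal sketch, assuming `ai/all-minilm` has been pulled, the `openai` Python package is installed, and the embeddings route mirrors the chat endpoints above:

```python
# Generate embeddings for RAG or semantic search through DMR.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:12434/engines/v1",
    api_key="not-needed",
)

result = client.embeddings.create(
    model="ai/all-minilm",
    input=["Docker Model Runner serves models locally.", "semantic search query"],
)
print(f"{len(result.data)} vectors, dimension {len(result.data[0].embedding)}")
```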
What's next
- API reference - Full API documentation
- Configuration options - Tune model behavior
- Open WebUI integration - Set up a web interface