IDE and tool integrations

Docker Model Runner can serve as a local backend for popular AI coding assistants and development tools. This guide shows how to configure common tools to use models running in DMR.

Prerequisites

Before configuring any tool:

  1. Enable Docker Model Runner in Docker Desktop or Docker Engine.
  2. Enable TCP host access:
    • Docker Desktop: Enable host-side TCP support in Settings > AI, or run:
      $ docker desktop enable model-runner --tcp 12434
      
    • Docker Engine: TCP is enabled by default on port 12434.
  3. Pull a model:
    $ docker model pull ai/qwen2.5-coder
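
Once the model is pulled, you can confirm the endpoint is reachable with a minimal OpenAI-style chat request (a quick smoke test; swap in whichever model you pulled):

$ curl http://localhost:12434/engines/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "ai/qwen2.5-coder", "messages": [{"role": "user", "content": "Say hello"}]}'

A JSON response with a choices array means DMR is up and serving the model.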
    

Cline (VS Code)

Cline is an AI coding assistant for VS Code.

Configuration

  1. Open VS Code and go to the Cline extension settings.
  2. Select OpenAI Compatible as the API provider.
  3. Configure the following settings:

    Setting     Value
    Base URL    http://localhost:12434/engines/v1
    API Key     not-needed (or any placeholder value)
    Model ID    ai/qwen2.5-coder (or your preferred model)

Important

The base URL must include /engines/v1 at the end. Do not include a trailing slash.

Troubleshooting Cline

If Cline fails to connect:

  1. Verify DMR is running:

    $ docker model status
    
  2. Test the endpoint directly:

    $ curl http://localhost:12434/engines/v1/models
    
  3. Check that CORS is configured if running a web-based version:

    • In Docker Desktop Settings > AI, add your origin to CORS Allowed Origins

Continue (VS Code / JetBrains)

Continue is an open-source AI code assistant that works with VS Code and JetBrains IDEs.

Configuration

Edit your Continue configuration file (~/.continue/config.json):

{
  "models": [
    {
      "title": "Docker Model Runner",
      "provider": "openai",
      "model": "ai/qwen2.5-coder",
      "apiBase": "http://localhost:12434/engines/v1",
      "apiKey": "not-needed"
    }
  ]
}

Using Ollama provider

Continue also supports the Ollama provider, which works with DMR:

{
  "models": [
    {
      "title": "Docker Model Runner (Ollama)",
      "provider": "ollama",
      "model": "ai/qwen2.5-coder",
      "apiBase": "http://localhost:12434"
    }
  ]
}

Cursor

Cursor is an AI-powered code editor.

Configuration

  1. Open Cursor Settings (Cmd/Ctrl + ,).

  2. Navigate to Models > OpenAI API Key.

  3. Configure:

    Setting                     Value
    OpenAI API Key              not-needed
    Override OpenAI Base URL    http://localhost:12434/engines/v1
  4. In the model drop-down, enter your model name: ai/qwen2.5-coder

Note

Some Cursor features may require models with specific capabilities (e.g., function calling). Use capable models like ai/qwen2.5-coder or ai/llama3.2 for best results.
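
To check whether a model actually handles function calling before relying on it in Cursor, you can send a standard OpenAI-style request with a tools definition to the DMR endpoint. This is a sketch: the get_weather tool is a made-up example, and it assumes DMR's OpenAI-compatible API accepts the standard tools parameter.

$ curl http://localhost:12434/engines/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
          "model": "ai/qwen2.5-coder",
          "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
          "tools": [{
            "type": "function",
            "function": {
              "name": "get_weather",
              "description": "Return the current weather for a city",
              "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"]
              }
            }
          }]
        }'

A model with function-calling support should reply with a tool_calls entry rather than plain text.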

Zed

Zed is a high-performance code editor with AI features.

Configuration

Edit your Zed settings (~/.config/zed/settings.json):

{
  "language_models": {
    "openai": {
      "api_url": "http://localhost:12434/engines/v1",
      "available_models": [
        {
          "name": "ai/qwen2.5-coder",
          "display_name": "Qwen 2.5 Coder (DMR)",
          "max_tokens": 8192
        }
      ]
    }
  }
}

Open WebUI

Open WebUI provides a ChatGPT-like interface for local models.

See Open WebUI integration for detailed setup instructions.
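
As a quick sketch (assuming the standard Open WebUI container image and its OPENAI_API_BASE_URL setting), you can point Open WebUI at DMR like this. Note that from inside a container, localhost refers to the container itself, so the host gateway address is used instead:

$ docker run -d -p 3000:8080 \
    -e OPENAI_API_BASE_URL=http://host.docker.internal:12434/engines/v1 \
    -e OPENAI_API_KEY=not-needed \
    ghcr.io/open-webui/open-webui:main

Then open http://localhost:3000 and select your pulled model from the model list.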

Aider

Aider is an AI pair programming tool for the terminal.

Configuration

Set environment variables or use command-line flags:

$ export OPENAI_API_BASE=http://localhost:12434/engines/v1
$ export OPENAI_API_KEY=not-needed

$ aider --model openai/ai/qwen2.5-coder

Or in a single command:

$ aider --openai-api-base http://localhost:12434/engines/v1 \
        --openai-api-key not-needed \
        --model openai/ai/qwen2.5-coder

LangChain

Python

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:12434/engines/v1",
    api_key="not-needed",
    model="ai/qwen2.5-coder"
)

response = llm.invoke("Write a hello world function in Python")
print(response.content)

JavaScript/TypeScript

import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  configuration: {
    baseURL: "http://localhost:12434/engines/v1",
  },
  apiKey: "not-needed",
  modelName: "ai/qwen2.5-coder",
});

const response = await model.invoke("Write a hello world function");
console.log(response.content);

LlamaIndex

from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    api_base="http://localhost:12434/engines/v1",
    api_key="not-needed",
    model="ai/qwen2.5-coder"
)

response = llm.complete("Write a hello world function")
print(response.text)

Common issues

"Connection refused" errors

  1. Ensure Docker Model Runner is enabled and running:

    $ docker model status
    
  2. Verify TCP access is enabled:

    $ curl http://localhost:12434/engines/v1/models
    
  3. Check if another service is using port 12434.

"Model not found" errors

  1. Verify the model is pulled:

    $ docker model list
    
  2. Use the full model name including namespace (e.g., ai/qwen2.5-coder, not just qwen2.5-coder).

Slow responses or timeouts

  1. On the first request, the model must be loaded into memory, so some delay is expected. Subsequent requests are faster.

  2. Consider using a smaller model or adjusting the context size:

    $ docker model configure --context-size 4096 ai/qwen2.5-coder
    
  3. Check available system resources (RAM, GPU memory).
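
If responses feel slow in chat-style tools, enabling streaming can also improve perceived latency, since tokens arrive as they are generated. Most OpenAI-compatible clients expose a streaming toggle; at the API level it is the standard stream flag, shown here as a sketch:

$ curl http://localhost:12434/engines/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "ai/qwen2.5-coder", "stream": true, "messages": [{"role": "user", "content": "Hello"}]}'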

CORS errors (web-based tools)

If using browser-based tools, add the origin to CORS allowed origins:

  1. Docker Desktop: Settings > AI > CORS Allowed Origins
  2. Add your tool's URL (e.g., http://localhost:3000)

Recommended models

Use case             Recommended model    Notes
Code completion      ai/qwen2.5-coder     Optimized for coding tasks
General assistant    ai/llama3.2          Good balance of capabilities
Small/fast           ai/smollm2           Low resource usage
Embeddings           ai/all-minilm        For RAG and semantic search
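
For example, to try the embeddings model, pull it and call the OpenAI-style embeddings endpoint. This is a sketch that assumes DMR exposes the standard /engines/v1/embeddings route as part of its OpenAI-compatible API:

$ docker model pull ai/all-minilm
$ curl http://localhost:12434/engines/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{"model": "ai/all-minilm", "input": "Docker Model Runner"}'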

What's next