Using Aider for connecting to llama.cpp

Aider was installed on my laptop using the aider-install script. This post, and probably other early posts, will be draft notes rather than complete docs. The aider-install script uses uv behind the scenes to handle Python installs. There are other ways to install that dovetail with existing tools (uv, mise) or Python installs.

llama.cpp server

Start the llama.cpp server with the Qwen3-Coder model. Script used at the time of writing:

 llama.cpp/build/bin/llama-server \
  --model ./Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf \
  --threads -1 \
  --ctx-size 32768 \
  -ot ".*expert.*=CPU"  \
  -ngl 20 \
  --temp 0.7 \
  --min-p 0.0 \
  --top-p 0.8 \
  --top-k 20 \
  --repeat-penalty 1.05 \
  --jinja \
  --port 8090 \
  --host 0.0.0.0 \
  --log-verbosity 2
#  --flash-attn \
#  --cache-type-k q8_0 \
#  --cache-type-v q8_0
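
Before pointing aider at the server, it's worth confirming it came up. The llama.cpp server exposes a /health endpoint and OpenAI-compatible routes under /v1; a quick check, using the host and port from the script above:

```shell
# Sanity-check the llama.cpp server (host/port from the launch script above).
BASE=http://192.168.1.31:8090

# /health returns {"status":"ok"} once the model has finished loading.
curl -s "$BASE/health"

# /v1/models lists the loaded model, same as any OpenAI-compatible endpoint.
curl -s "$BASE/v1/models"
```

If /health reports the model is still loading, give it a minute; a 30B GGUF takes a while to map in.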

Aider

Start aider in the appropriate directory. Script used at the time of writing:

aider \
--openai-api-base http://192.168.1.31:8090 \
--openai-api-key NONE \
--model openai/Qwen3-Coder \
--model-metadata-file aider-config.json \
--no-auto-commits \
--map-tokens 1024

Here is the current aider-config.json. I haven't tuned this, so it's probably off in places.

{
    "Qwen3-Coder": {
        "max_tokens": 60000,
        "max_input_tokens": 60000,
        "max_output_tokens": 60000,
        "input_cost_per_token": 0.00000014,
        "output_cost_per_token": 0.00000028,
        "litellm_provider": "openai",
        "mode": "chat"
    }
}
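
The metadata file just tells aider's litellm layer about token limits and (notional) costs; the actual traffic goes to the /v1/chat/completions endpoint. To verify the round trip aider will make, a hand-rolled request against the same endpoint (the "hello" prompt is just a placeholder):

```shell
# Exercise the same OpenAI-compatible endpoint aider uses.
BASE=http://192.168.1.31:8090
curl -s "$BASE/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen3-Coder",
        "messages": [{"role": "user", "content": "hello"}],
        "max_tokens": 64
      }'
```

If this returns a normal chat-completion JSON body, aider's --openai-api-base setting will work with the same URL.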

Mise

I'm using mise to manage tools and tasks. My ~/.config/mise/config.toml is below. This allows me to cd to the right place and run 'mise run adr'.

[tools]
node = "latest"
"npm:@anthropic-ai/claude-code" = "latest"
python = "3.12"
uv = "latest"

[tasks]
bundle='repomix --ignore "**/*.sql,**/*.md,**/*.json"'
adr='~/dev/chap/bin/aider.sh'
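
The adr task points at an aider.sh wrapper. I haven't shown that file, but a minimal sketch of what such a wrapper could look like, assuming it just bundles the aider invocation from earlier (paths and flags are the ones above, not necessarily the script's real contents):

```shell
#!/usr/bin/env bash
# Hypothetical wrapper for the 'mise run adr' task: starts aider against the
# local llama.cpp server using the flags shown earlier in this post.
set -euo pipefail

exec aider \
  --openai-api-base http://192.168.1.31:8090 \
  --openai-api-key NONE \
  --model openai/Qwen3-Coder \
  --model-metadata-file aider-config.json \
  --no-auto-commits \
  --map-tokens 1024
```

Keeping the invocation in a script means the mise task stays a one-liner and the flags live in version control next to aider-config.json.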