https://github.com/bjodah/local-aider

Proof-of-concept Aider with local (24 GB VRAM) QwQ + Qwen2.5-Coder using litellm-proxy / llama-swap / llama.cpp


Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.7%) to scientific vocabulary
Last synced: 6 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: bjodah
  • Language: Shell
  • Default Branch: main
  • Size: 3.83 MB
Statistics
  • Stars: 9
  • Watchers: 1
  • Forks: 1
  • Open Issues: 2
  • Releases: 0
Archived
Created 12 months ago · Last pushed 11 months ago
Metadata Files
  • Readme: README.md

local-aider

UPDATE 2025-04-07: Please take a look at llama-swap's example for a better approach to this. I have since changed my approach: instead of relying on the litellm_proxy/ prefix, I hijack the OpenAI base URL (with the minor inconvenience that aider needs to be restarted if we actually want to connect to an OpenAI model). My current setup using llama-swap is found here: https://github.com/bjodah/llm-multi-backend-container

Original README follows:

This repo is just an attempt to collect scripts and notes on how one can use aider with a local reasoning "architect" model (Qwen/QwQ-32B) and a non-reasoning "editor" model (Qwen/Qwen2.5-Coder-32B-Instruct) on a single consumer-grade GPU (tested on an RTX 3090).

I should mention that the simplest solution is probably to use the Ollama support in aider. The approach here, however, allows you (in principle) to experiment with different backends (such as vLLM, ExLlamaV2 + TabbyAPI, ...).

Usage

```console
$ mkdir brainstorming-repo
$ cd brainstorming-repo
$ git init .
$ ./bin/local-model-enablement-wrapper \
    aider \
    --architect --model litellm_proxy/local-qwq-32b \
    --editor-model litellm_proxy/local-qwen25-coder-32b
```

A less "magical" approach would be to launch the compose file manually.

In one terminal:

```console
$ podman compose up
[pod-llama-cpp-swap] | llama-swap listening on :8686
[pod-litellm-proxy]  | INFO:     Started server process [1]
[pod-litellm-proxy]  | INFO:     Waiting for application startup.
[pod-litellm-proxy]  | INFO:     Application startup complete.
[pod-litellm-proxy]  | INFO:     Uvicorn running on http://0.0.0.0:4000 (Press CTRL+C to quit)
```

...and then in another terminal, launch aider as usual, but make sure you export the relevant environment variables:

```console
$ env \
    LITELLM_PROXY_API_BASE="http://localhost:4000" \
    LITELLM_PROXY_API_KEY=sk-deadbeef0badcafe \
    aider \
    --architect --model litellm_proxy/local-qwq-32b \
    --editor-model litellm_proxy/local-qwen25-coder-32b
```
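The repo's actual litellm.yml is not reproduced on this page, but a minimal litellm-proxy config mapping the two aliases above to the llama-swap endpoint could look roughly like this (a sketch only — the model names and the :8686 port are taken from the commands and logs above, everything else is an assumption):

```yaml
# Hypothetical litellm.yml sketch: route both model aliases to the
# llama-swap server (listening on :8686 per the compose logs).
model_list:
  - model_name: local-qwq-32b
    litellm_params:
      model: openai/local-qwq-32b          # openai/ prefix => OpenAI-compatible backend
      api_base: http://localhost:8686/v1
      api_key: "none"                       # llama.cpp's server does not check the key
  - model_name: local-qwen25-coder-32b
    litellm_params:
      model: openai/local-qwen25-coder-32b
      api_base: http://localhost:8686/v1
      api_key: "none"
```

With a config along these lines, aider's litellm_proxy/local-qwq-32b model name resolves through the proxy to whichever model llama-swap currently has loaded.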

Customization

Everything in this repo is probably subject to customization. If you want to increase the verbosity of the logging (e.g. for troubleshooting), you can adjust these settings:

```console
$ grep -E '(logRequests|detailed_debug)' -R .
./compose.yml:            host-litellm.py --config /root/litellm.yml --detailed_debug
./config-llamacpp-container.yaml:logRequests: true
```

Challenges

  • The 32B-parameter models fit in 24 GB of VRAM, but only one at a time. Solution: llama-swap.
  • The easiest way to run the models is using llama.cpp's Docker image, but llama-swap has a problem stopping the container when unloading a model. Solution: run llama-swap inside llama.cpp's server container.
  • aider relies on litellm for routing model selection to different backends, and litellm relies on the 'openai/' prefix to indicate an OpenAI-compatible API endpoint. While litellm offers custom prompts as well as taking prompts from huggingface config files, there are two problems: the former has no effect when the prefix is openai/ (I submitted a PR to address this here), and the latter requires a huggingface/ prefix, which unfortunately changes the request format to that of huggingface's API, which is not OpenAI-compatible (as far as I can tell). Workaround: I use a patched litellm (from the PR) for now.
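To illustrate the llama-swap solution from the first bullet: llama-swap keeps at most one backend process alive, so requesting the other alias unloads the current model before starting the new one. A minimal config in llama-swap's format might look like this (the file paths, ports, and quantization choices are hypothetical, not the repo's actual values):

```yaml
# Hypothetical llama-swap config sketch: each alias maps to a
# llama-server command; llama-swap proxies requests to whichever
# one is running and swaps processes when the alias changes.
models:
  "local-qwq-32b":
    cmd: llama-server --port 9001 -m /models/QwQ-32B-Q4_K_M.gguf
    proxy: http://127.0.0.1:9001
  "local-qwen25-coder-32b":
    cmd: llama-server --port 9002 -m /models/Qwen2.5-Coder-32B-Instruct-Q4_K_M.gguf
    proxy: http://127.0.0.1:9002
```

This is what makes the architect/editor split workable on a single 24 GB GPU: aider alternates between the two model names, and llama-swap transparently swaps the corresponding 32B model in and out of VRAM.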

TODOs

  • [ ] The prompt template might not be working quite right: looking at the logs and responses, \n\n might not be correctly escaped; I see occurrences of "nn" and "nnnn".
  • [ ] litellm proxy does not seem to propagate request interruption.

Demo

Asciicast demo: view the full cast using the asciinema player here.

Miscellaneous

  • Aider needs to be informed about the context window size; you may copy/append .aider.model.metadata.json to your $HOME directory (or to the root of the git repo in which you intend to run aider).
  • The health-check query in the wrapper script adds some delay when launching; set LOCAL_AIDER_SKIP_HEALTH_CHECK=1 to skip it.
  • Best practice is to run aider in a sandboxed environment (executing LLM-generated code is risky). We can replace the aider call with e.g. "podman run ..." or "docker run ...". At this point, an alias might come in handy:

    ```console
    $ grep aider-local-qwq32 ~/.bashrc
    alias aider-local-qwq32="env LOCAL_AIDER_SKIP_HEALTH_CHECK=1 local-model-enablement-wrapper contaider --architect --model litellm_proxy/local-qwq-32b --editor-model litellm_proxy/local-qwen25-coder-32b"
    ```

    This alias uses a utility script (contaider) to launch aider in a container.
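The .aider.model.metadata.json mentioned in the first bullet above follows litellm's model-metadata schema, keyed by the model name aider uses. The repo's actual file is not shown on this page; an illustrative entry (the token limits here are assumptions, not the repo's values) might look like:

```json
{
  "litellm_proxy/local-qwq-32b": {
    "max_input_tokens": 32768,
    "max_output_tokens": 8192,
    "input_cost_per_token": 0.0,
    "output_cost_per_token": 0.0
  },
  "litellm_proxy/local-qwen25-coder-32b": {
    "max_input_tokens": 32768,
    "max_output_tokens": 8192,
    "input_cost_per_token": 0.0,
    "output_cost_per_token": 0.0
  }
}
```

Without such entries, aider cannot know the local models' context windows and may mis-estimate when to trim chat history.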

Owner

  • Name: Bjorn
  • Login: bjodah
  • Kind: user

GitHub Events

Total
  • Watch event: 9
  • Delete event: 1
  • Push event: 12
  • Pull request review comment event: 1
  • Pull request review event: 1
  • Pull request event: 5
  • Create event: 4
Last Year
  • Watch event: 9
  • Delete event: 1
  • Push event: 12
  • Pull request review comment event: 1
  • Pull request review event: 1
  • Pull request event: 5
  • Create event: 4