← Package directory
Available on winget

Install llama-swap

Model swapping for llama.cpp

Install with winget
winget install --id mostlygeek.llama-swap
Upgrade
winget upgrade --id mostlygeek.llama-swap
Uninstall
winget uninstall --id mostlygeek.llama-swap

About llama-swap

llama-swap is a light weight, transparent proxy server that provides automatic model swapping to llama.cpp's server.

What's new in 216

Changelog - 2982dd3 ui-svelte: update link to performance discussion thread

Read release notes

Version history

Version Updated Notes
216 Unknown Changelog - 2982dd3 ui-svelte: update link to performance discussion thread
214 Unknown Changelog - b2fcc2d ui-svelte: fix cached tokens total counting -1 sentinel (#760) - 6a9c4ef fix: use --loop instead of -loop for nvidia-smi (driver 540+ compat) (#759)
213 Unknown Changelog - 0c813e4 ui-svelte: package updates - fe71e8a proxy,ui-svelte: improve support for v1/messages and v1/responses (#758)
211 Unknown Changelog - c79114d proxy: fix logger not checking matrix for processes
210 Unknown Changelog - 430166d proxy: fix zero duration for non streaming responses (#723) - 5b4beac fix: ?no-history flag and improve /logs monitoring docs (#721)
209 Unknown Changelog - fd3c28f Refactor Activity Page (#710) - a846c4f config: remove hard cap on macro length (#718) - 5bae33a ui-svelte: default theme to user preferred color scheme (#712) - 8f4ff01 ui-svelte: make it easier to t...
208 Unknown Changelog - e8d4384 ui-svelte: support reasoning and reasoning_content (#708) - ce28485 ui-svelte: add prompt processing histogram (#705) - 3cd7837 fix: support architecture-specific download URLs in install script (#698...
207 Unknown Changelog - 0b31cca ui-svelte: fix histogram calculation (#695) - 5938dbe Push unified docker images on scheduled runs (#694)
205 Unknown Changelog - 66639e8 proxy: replace fsnotify with stat-poll watcher and add SIGHUP reload (#685) - 625b296 docker/unified: add uv via pip install (#681)
204 Unknown Changelog - 231e622 proxy: fix matrix race and process stop bug (#677) - 57ac666 .github/workflows: tweak push ghcr conditional (#676) - 6972830 .github/workflows: add toggle for pushing unified images to github (#672) -...
203 Unknown Changelog - 5e3c646 proxy: compress captures with zstd (#668) - c3f0d43 proxy: fix race conditions during swap (#667) - f6cf9f5 proxy: Refactor tests (#660) - 121fd93 Makefile: restore linux arm64 targets
202 Unknown Changelog - 17233e9 docs: update configuration.md for matrix - 4866d16 README.md: update to use matrix instead of groups - 35193f8 proxy: add swap matrix with solver-based model swapping (#646) - 40e39f7 ui-svelte: fix s...
201 Unknown Changelog - a9d840f proxy,proxy/config: restore timeouts to pre PR 619 (#648) - 7b2b827 docker/unified: derive rootless image from root container (#644)
199 Unknown Changelog - 8fabc75 docker/unified: vulkan build fixes (#600) - e5e7391 .github,docker/unified: include vulkan build (#599) - 2c282dc .github,docker/unified: improve caching and fix bugs (#598) - 916d13f .github/workflow...
198 Unknown Changelog - c3c258a proxy: fix metrics capture for v1/responses (#586) - 29a38fd ui-svelte: upgrade to vite 8 (#585) - d569681 Change model sorting to natural order (#582) - 24efdb7 config: add macro support for name and...
197 Unknown Changelog - cc77139 proxy,proxy/config: add global TTL feature (#554) - 390a35b ui-svelte: add copy button to markdown code blocks (#537) - 181f71c .github,docker: add cuda13 architecture support (#551)
196 Unknown Changelog - 49546e2 ui: fix text size svg - 2c07896 Update README with additional images - 175bb36 Revise README description for clarity and detail - aedb640 Enhance web UI section in README - 2f377f6 ui: add OGG audio f...
195 Unknown Changelog - 64e4c79 ui: add Rerank tab to playground (#536)
194 Unknown Changelog - 19fb5f3 proxy: implement setParamsByID filter (#535)
193 Unknown Changelog - b45102b ui: smart auto-scroll in LogPanel (#530)