← Package directory
Available on winget

Install llama-swap

Model swapping for llama.cpp

Install with winget
winget install --id mostlygeek.llama-swap
Upgrade
winget upgrade --id mostlygeek.llama-swap
Uninstall
winget uninstall --id mostlygeek.llama-swap

About llama-swap

llama-swap is a light weight, transparent proxy server that provides automatic model swapping to llama.cpp's server.

What's new in 235

Changelog - c59816b internal/server: show inflight activity requests (#895) - d8ed077 Add llama-tts binary (#894) - 0d2d851 Makefile: improve build version stamping (#891)

Read release notes

Version history

Version Updated Notes
235 Changelog - c59816b internal/server: show inflight activity requests (#895) - d8ed077 Add llama-tts binary (#894) - 0d2d851 Makefile: improve build version stamping (#891)
234 Unknown Changelog - 4a6b8a8 internal/router: reject concurrency excess before streaming (#889) - 3023ab4 add /props to modelGetRoutes (#886)
233 Unknown Bug and Performance improvements for the UI - fixed streaming logs blocking llama-server inference. Logs are pushed to connected UI clients via an async channel now so if they are slow it does not affect reading from std...
232 Unknown Changelog - 25b27cb ui-svelte: add rounded borders - 18cc854 update hero image
230 Unknown This release adds the -config-dir flag! Using a config-dir it’s now possible to split huge config.yaml files into fragments that get automatically merged together. You can have one model per file and combined with -watch...
229 Unknown Changelog - 316ad63 config,server: add upstream.ignorePaths (#869) - e37077a feat: hide performance menu item if disabled (#832) - eff9b60 server: capture failed (non-200) LLM requests (#862) - 9bcddad internal/server,ui...
228 Unknown Changelog - a15e479 proxy: meter /upstream requests via metrics middleware (#858)
226 Unknown Changelog - ed77385 ui: improve manual model load and cancel (#847)
224 Unknown Changelog - 0cfe5a6 Makefile,internal: fix websocket regression and other small things (#830) - 44e1501 internal/process,server: fix unload regression (#828) - 46cea36 proxy: remove legacy code. Thanks champ 🫡 (#822) - c...
223 Unknown Changelog - 29d3d9b perf: add macOS GPU monitoring via mactop and ioreg (#816)
222 Unknown Changelog - 9be9a87 internal/process: improve windows shutdown behaviour (#808)
219 Unknown Changelog - 4ca9c47 Makefile,internal/server: various release tweaks - 146a9ea ui-svelte: update build directory (#801)
218 Unknown Changelog - 02e015f Introduce new routing backend (#790) - 63bc266 Add new power draw column header for rocm-smi monitoring (#788)
217 Unknown Changelog - 636b53e Improve rocm-smi performance monitoring (#775) - 59cd3b6 Added Windows performance monitoring using nvidia-smi (#773) - 5d1e62d Disable auto review feature in coderabbit config - dbb869d Increase inac...
216 Unknown Changelog - 2982dd3 ui-svelte: update link to performance discussion thread
214 Unknown Changelog - b2fcc2d ui-svelte: fix cached tokens total counting -1 sentinel (#760) - 6a9c4ef fix: use --loop instead of -loop for nvidia-smi (driver 540+ compat) (#759)
213 Unknown Changelog - 0c813e4 ui-svelte: package updates - fe71e8a proxy,ui-svelte: improve support for v1/messages and v1/responses (#758)
211 Unknown Changelog - c79114d proxy: fix logger not checking matrix for processes
210 Unknown Changelog - 430166d proxy: fix zero duration for non streaming responses (#723) - 5b4beac fix: ?no-history flag and improve /logs monitoring docs (#721)
209 Unknown Changelog - fd3c28f Refactor Activity Page (#710) - a846c4f config: remove hard cap on macro length (#718) - 5bae33a ui-svelte: default theme to user preferred color scheme (#712) - 8f4ff01 ui-svelte: make it easier to t...