Available on winget

Install llamafile

Distribute and run LLMs with a single file.

Install with winget
winget install --id Mozilla.llamafile
Upgrade
winget upgrade --id Mozilla.llamafile
Uninstall
winget uninstall --id Mozilla.llamafile
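
The three commands above differ only in the action word, so they can be generated from one small helper. The following is a minimal sketch, not part of llamafile or winget: the function name `winget_cmd` and the variable `PACKAGE_ID` are hypothetical, and the added `--exact` flag (a standard winget option that restricts matching to the exact package ID) is an assumption about what you might want, not something this page prescribes. The script only prints the commands, so it is safe to try on a machine without winget.

```shell
#!/usr/bin/env sh
# Hypothetical helper: prints the winget command line for a given action.
# PACKAGE_ID is the ID used in the commands on this page.
PACKAGE_ID="Mozilla.llamafile"

winget_cmd() {
    # $1 must be one of: install, upgrade, uninstall
    case "$1" in
        install|upgrade|uninstall)
            # --exact is an assumption: it asks winget to match the ID exactly.
            printf 'winget %s --id %s --exact\n' "$1" "$PACKAGE_ID"
            ;;
        *)
            printf 'unknown action: %s\n' "$1" >&2
            return 1
            ;;
    esac
}

# Print the commands instead of running them.
winget_cmd install
winget_cmd upgrade
winget_cmd uninstall
```

To run a printed command, copy it into a Windows terminal where winget is available.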

About llamafile

llamafile lets you distribute and run LLMs with a single file. Our goal is to make open LLMs much more accessible to both developers and end users. We're doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation.

What's new in 0.10.1

What's Changed

Summary:
- Added support for vulkan dylibs
- Added windows build scripts → we now have cuda, rocm, vulkan both as .so and as .dll libraries
- Updated llama.cpp submodule to 5e9c63546 → we now have llama.cpp support for new models (e.g. gemma-4, bonsai, Qwen3.6) and functionalities (e.g. llama.cpp internal tools for agents)

Details:
- Fix README_0.10.0 by @aittalam in https://github.com/mozilla-ai/llamafile/pull/918
- Add support for vulkan dylibs by @aittalam in https://github.com/mozilla-ai/llamafile/pull/892
- Add tinyblasStrsmBatched kernel by @aittalam in https://github.com/mozilla-ai/llamafile/pull/923
- Fix: GGUF Q5_1 quant crashes llamafile on aarch64 cpu by @aittalam in https://github.com/mozilla-ai/llamafile/pull/928
- fix broken link 'feel free to choose' → example llamafiles by @bquast in https://github.com/mozilla-ai/llamafile/pull/927
- Fix block-size assumption by @aittalam in https://github.com/mozilla-ai/llamafile/pull/935
- Add windows build scripts for CUDA by @aittalam in https://github.com/mozilla-ai/llamafile/pull/924
- Fix whisperfile documentation link in README by @martin0258 in https://github.com/mozilla-ai/llamafile/pull/939
- Update llama.cpp submodule to 5e9c63546 by @aittalam in https://github.com/mozilla-ai/llamafile/pull/941
- Migrate docs from MkDocs/GitHub Pages to GitBook by @angpt in https://github.com/mozilla-ai/llamafile/pull/946
- Win build improvements by @aittalam in https://github.com/mozilla-ai/llamafile/pull/940
- Fix cuda: /lib/x86_64-linux-gnu/libstdc++.so.6: version GLIBCXX_3.4.32' not found` by @aittalam in https...


Version history

Version   Updated   Notes
0.10.1    Unknown   What's Changed Summary: - Added support for vulkan dylibs - Added windows build scripts → we now have cuda, rocm, vulkan both as .so and as .dll libraries - Updated llama.cpp submodule to 5e9c63546 → we now have llama.cp...
0.10.0    Unknown   llamafile versions starting from 0.10.0 use a new build system, aimed at keeping our code more easily aligned with the latest versions of llama.cpp. This means they support more recent models and functionalities, but at...
0.9.3     Unknown   Release notes
0.9.2     Unknown   Release notes
0.9.1     Unknown   Release notes
0.9.0     Unknown   Release notes
0.8.17    Unknown   Release notes
0.8.16    Unknown   Release notes
0.8.15    Unknown   Release notes
0.8.14    Unknown   This release introduces our new CLI chatbot interface. It supports multi-line input using triple quotes. It will syntax highlight Python, C, C++, Java, and JavaScript code. This chatbot is now the default mode of operati...