Available on winget

Install llama.cpp

LLM inference in C/C++

Install with winget
winget install --id ggml.llamacpp
Upgrade
winget upgrade --id ggml.llamacpp
Uninstall
winget uninstall --id ggml.llamacpp
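
Scripting the commands

For automated setups, the three commands above can be wrapped in a small script. The following is a minimal sketch, not part of llama.cpp or winget: the helper names `winget_command` and `run_winget` are hypothetical, and the only winget arguments used are the ones shown above. It assumes Python 3.9 or later and that `winget` is on PATH.

```python
import shutil
import subprocess

# Package ID taken from the commands listed above.
PACKAGE_ID = "ggml.llamacpp"

# The three actions documented on this page.
ACTIONS = ("install", "upgrade", "uninstall")


def winget_command(action: str, package_id: str = PACKAGE_ID) -> list[str]:
    """Build the argument list for one of the winget actions above (hypothetical helper)."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return ["winget", action, "--id", package_id]


def run_winget(action: str) -> int:
    """Run the command and return winget's exit code (hypothetical helper)."""
    if shutil.which("winget") is None:
        raise RuntimeError("winget was not found on PATH")
    return subprocess.run(winget_command(action), check=False).returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try anywhere.
    print(" ".join(winget_command("install")))
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is installed.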

What's new in b9222

hexagon: add support for TRI op (#22822)

- Hexagon: TRI HVX Kernel addition to ggml hexagon HTP ops and context
- addressed PR review comments for TRI op
- hexagon: clang format
- hex-unary: remove merge conflict markers
- hex-ggml: remove duplicate op cases (merge conflict)
- hex-ggml: fix editor config errors

Co-authored-by: Todor Boinovski <todorb@qti.qualcomm.com>
Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com>

macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Apple Silicon (arm64, KleidiAI enabled)
- macOS Intel (x64)
- iOS XCFramework

Linux:
- Ubuntu x64 (CPU)
- Ubuntu arm64 (CPU)
- Ubuntu s390x (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu arm64 (Vulkan)
- Ubuntu x64 (ROCm 7.2)
- Ubuntu x64 (OpenVINO)
- Ubuntu x64 (SYCL FP32)
- Ubuntu x64 (SYCL FP16)

Android:
- Android arm64 (CPU)

Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)

openEuler:
- openEuler x86 (310p)
- openEuler x86 (910b, ACL Graph)
- openEuler aarch64 (310p)
- openEuler aarch64 (910b, ACL Graph)

Version history

Version | Updated | Notes
b9222 | Unknown | hexagon: add support for TRI op (#22822) - Hexagon: TRI HVX Kernel addition to ggml hexagon HTP ops and context - addressed PR review comments for TRI op - hexagon: clang format - hex-unary: remove merge conflict markers...
b9204 | Unknown | feat: Support d_conv=15 for ssm-conv.cu (#23017) Branch: ModalityConditionalAdapters AI-usage: none Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> macOS/iOS: - macOS Apple Silicon (arm64) - macOS Apple Silicon (arm64, Kle...
b9190 | Unknown | server: (router) alloc tmp buffer on heap (#23159) macOS/iOS: - macOS Apple Silicon (arm64) - macOS Apple Silicon (arm64, KleidiAI enabled) - macOS Intel (x64) - iOS XCFramework Linux: - Ubuntu x64 (CPU) - Ubuntu arm64 (...
b9174 | Unknown | ui: Restructure repo to use tools/ui folder and ui / UI / llama-ui / LLAMA_UI naming (#23064) - webui: Move static build output from tools/server/public to build/ui directory - refactor: Move to tools/ui - refactor: rena...
b9159 | Unknown | ggml-hexagon: cpy: add contiguous fast-path in reshape copy (#23076) macOS/iOS: - macOS Apple Silicon (arm64) - macOS Apple Silicon (arm64, KleidiAI enabled) - macOS Intel (x64) - iOS XCFramework Linux: - Ubuntu x64 (CPU...
b9144 | Unknown | ggml-webgpu: only use subgroup-matrix path when head dims are divisible by sg_mat_k / sg_mat_n (#23020) macOS/iOS: - macOS Apple Silicon (arm64) - macOS Apple Silicon (arm64, KleidiAI enabled) - macOS Intel (x64) - iOS X...
b9128 | Unknown | hexagon: eliminate scalar VTCM loads via HVX splat helpers (#22993) - hexagon: add hvx_vec_repl helpers and use those for splat-from-vtcm usecase - hmx-mm: optimize per-group scale handling - hmx-fa: optimize slope load...
b9113 | Unknown | opencl: add q4_1 MoE for Adreno (#22856) - Q4_1 MoE CLC pass sanity check - remove unnecessary code - opencl: remove unnecessary asserts and reformat - opencl: fix supports_op for q4_1 moe - q4_1 moe is supported by Adre...
b9102 | Unknown | [SYCL] Add OP im2col_3d (#22903) - add im2col_3d - format code - update the ops.md macOS/iOS: - macOS Apple Silicon (arm64) - macOS Apple Silicon (arm64, KleidiAI enabled) - macOS Intel (x64) - iOS XCFramework Linux: - U...
b9093 | Unknown | model : add sarvam_moe architecture support (#20275) macOS/iOS: - macOS Apple Silicon (arm64) - macOS Apple Silicon (arm64, KleidiAI enabled) - macOS Intel (x64) - iOS XCFramework Linux: - Ubuntu x64 (CPU) - Ubuntu arm64...
b9085 | Unknown | Add flash attention MMA / Tiles to support MiMo-V2.5 (#22812) - mimo-v2.5: add flash attention mma/tiles for d_kq=192 d_v=128 - mimo-v2.5: follow (256, 256) fattn templates - mimo-v2.5: cleanup comments - mimo-v2.5:...
b9070 | Unknown | opencl: add q4_0 MoE GEMM for Adreno (#22731) - Q4_0 MoE CLC pass sanity check - release program - opencl: fix whitespace - opencl: remove unused cl_program - opencl: break #if block to make it more clear - opencl: adjus...
b9049 | Unknown | mtmd : support MiniCPM-V 4.6 (#22529) - Support MiniCPM-V 4.6 in new branch Signed-off-by: tc-mb <tianchi_cai@icloud.com> - fix code bug Signed-off-by: tc-mb <tianchi_cai@icloud.com> - fix pre-commit Signed-off-by: tc-mb tia...
b9038 | Unknown | ggml : use CL_DEVICE_GLOBAL_MEM_SIZE as memory estimate for OpenCL --fit (#22688) - ggml : report estimated OpenCL memory for --fit Signed-off-by: Florian Reinle <f.reinle@otec.de> - ggml : estimated OpenCL memory backend...
b9026 | Unknown | ggml : implement fast walsh-hadamard transform for kv rotation (#21352) (#22631) macOS/iOS: - macOS Apple Silicon (arm64) - macOS Apple Silicon (arm64, KleidiAI enabled) - macOS Intel (x64) - iOS XCFramework Linux: - Ubu...
b9014 | Unknown | ggml-webgpu: add layer norm ops (#22406) - shader(norm): add layer norm ops - shader(norm): stabilize floating point computation with Kahan summation and handle mixed types - shader(norm): remove the non-contiguous stride...
b9010 | Unknown | fix: CUDA device PCI bus ID de-dupe OOMing (ignoring other 3 gpus entirely) (#22533) - fix: CUDA device PCI bus ID detection for multi-GPU de-dupe - HIP, MUSA macros Co-authored-by: Johannes Gäßler <johannesg@5d6.de> macOS...
b9002 | Unknown | sync : ggml macOS/iOS: - macOS Apple Silicon (arm64) - macOS Apple Silicon (arm64, KleidiAI enabled) - macOS Intel (x64) - iOS XCFramework Linux: - Ubuntu x64 (CPU) - Ubuntu arm64 (CPU) - Ubuntu s390x (CPU) - Ubuntu x64...
b8994 | Unknown | ggml-webgpu: add the upscale shader (#22419) - shader(upscale): add the upscale shader with nearest, bilinear and bicubic implementations - shader(upscale): use macro macOS/iOS: - macOS Apple Silicon (arm64) - macOS Appl...
b8981 | Unknown | common : do not pass prompt tokens to reasoning budget sampler (#22488) macOS/iOS: - macOS Apple Silicon (arm64) - macOS Apple Silicon (arm64, KleidiAI enabled) - macOS Intel (x64) - iOS XCFramework Linux: - Ubuntu x64 (...