winget install --id imonoonoko.BitLlama
About BitLlama
BitLlama is a pure-Rust LLM inference engine featuring 1.58-bit ternary quantization, Test-Time Training (TTT), the Soul learning system, an MCP server/client, and private RAG. It supports Llama, Gemma, Mistral, Qwen, and BitNet models and includes an OpenAI-compatible API server.
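For readers unfamiliar with the term, "1.58-bit" refers to weights restricted to three values, {-1, 0, +1}, since log2(3) ≈ 1.58. The sketch below illustrates the absmean scheme commonly described for BitNet-style models: divide by the mean absolute weight, round, and clamp. It is an illustration only, not BitLlama's actual code; the function names `quantize_ternary` and `dequantize_ternary` and the single per-tensor scale are assumptions made for this example.

```rust
/// Quantize weights to {-1, 0, +1} using one per-tensor scale.
/// Hypothetical helper for illustration; not taken from BitLlama.
fn quantize_ternary(weights: &[f32]) -> (Vec<i8>, f32) {
    if weights.is_empty() {
        return (Vec::new(), 1.0);
    }
    // Scale is the mean absolute value of the tensor (absmean).
    let mean_abs = weights.iter().map(|w| w.abs()).sum::<f32>() / weights.len() as f32;
    // Guard against division by zero for an all-zero tensor.
    let scale = mean_abs.max(f32::EPSILON);
    // Round to the nearest integer, then clamp into the ternary range.
    let q: Vec<i8> = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-1.0, 1.0) as i8)
        .collect();
    (q, scale)
}

/// Recover approximate weights from ternary values and the scale.
fn dequantize_ternary(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}
```

Calling `quantize_ternary(&[0.9, -0.1, 0.0, -1.2, 0.4])` yields the ternary values `[1, 0, 0, -1, 1]` with a scale of about 0.52; small weights collapse to zero while large ones keep only their sign.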
What's new in 1.0.0
v1.0.0: Final Release

BitLlama v1.0.0. Development complete.

What is BitLlama?

A pure-Rust LLM inference engine with Soul learning and hierarchical memory.

- 7 model architectures: Llama-2/3, Gemma-2/3, Qwen2.5, Mistral, BitNet
- Soul learning: LoRA fine-tuning from conversations
- Memory system: 4-layer hierarchical memory + 7-stage Sleep consolidation
- Desktop GUI: Tauri 2.0 + Svelte 5, Japanese/English i18n
- Performance: 45.4 tok/s (7B), 90% of llama.cpp
- 1121 tests, quality score 9.0/10

Changes since v0.16.0

- CJK memory search fix (character bigram fallback for Japanese queries)
- Soul learning tests (warmup, chat template, VRAM guard)
- Chat template application fix for GGUF tokenizer fallback
- README/ROADMAP updated to reflect project completion

Install

- Homebrew: `brew tap imonoonoko/bitllama && brew install bitllama`
- winget: `winget install imonoonoko.BitLlama`
- Or download the binaries attached to the release.

Built with Rust by @imonoonoko

Full Changelog: v0.16.0...v1.0.0
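The CJK fix in these notes mentions a character bigram fallback. Japanese text has no spaces between words, so whitespace tokenization tends to treat a whole query as a single token that rarely matches stored text. Splitting into overlapping two-character windows allows partial matches. The following is a minimal sketch of that general idea, not BitLlama's implementation; `char_bigrams` and `bigram_overlap` are hypothetical names, and the overlap count is an assumed scoring rule chosen for simplicity.

```rust
use std::collections::HashSet;

/// Split text into overlapping two-character windows, ignoring whitespace.
/// Hypothetical helper for illustration; not taken from BitLlama.
fn char_bigrams(text: &str) -> Vec<String> {
    let chars: Vec<char> = text.chars().filter(|c| !c.is_whitespace()).collect();
    match chars.len() {
        0 => Vec::new(),
        // A single character cannot form a bigram, so return it as-is.
        1 => vec![chars[0].to_string()],
        _ => chars
            .windows(2)
            .map(|w| w.iter().collect::<String>())
            .collect(),
    }
}

/// Count how many of the query's bigrams also occur in the document.
/// An assumed, deliberately simple relevance score.
fn bigram_overlap(query: &str, document: &str) -> usize {
    let doc: HashSet<String> = char_bigrams(document).into_iter().collect();
    char_bigrams(query)
        .iter()
        .filter(|b| doc.contains(*b))
        .count()
}
```

With this sketch, the query `東京都` splits into `東京` and `京都`, both of which occur in `私は東京都に住んでいます`, so the overlap is 2. A document that shares no bigrams with the query scores 0.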
Version history
| Version | Updated | Notes |
|---|---|---|
| 1.0.0 | Unknown | Final release; development complete. Full notes under "What's new in 1.0.0". |
| 0.16.0 | Unknown | Full Changelog: v0.15.0...v0.16.0 |
| 0.15.0 | Unknown | Release notes |