A Mozilla project that packages a complete LLM and its runtime into a single executable file. Download one file, run it on Windows, Mac, or Linux with no installation, no dependencies, and no network connection required.
llamafile
docs.mozilla.ai/llamafile
Our take
The portable approach has a real advantage: copy one file to a USB drive, run it on any machine, and the model never touches the internet. Built on llama.cpp and Cosmopolitan Libc, it covers a wide range of hardware without configuration. The main limitation is the 4 GB cap that Windows places on executable files: larger models have to load their weights from a separate file, which slightly breaks the one-file promise. For air-gapped use, travel, or anyone who wants zero-setup local AI, llamafile is the most self-contained option available.
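To make the Windows limitation concrete, here is a minimal sketch of a pre-flight size check. It is not part of llamafile: the function names, the file name in the usage note, and the exact threshold (4 GiB, compared strictly) are assumptions chosen for illustration.

```python
from pathlib import Path

# Assumed threshold: the "4 GB" cap is treated here as 4 GiB (4 * 1024**3 bytes).
WINDOWS_EXE_LIMIT_BYTES = 4 * 1024**3


def fits_windows_single_file(size_bytes: int) -> bool:
    """Return True if a file of this size stays under the assumed 4 GiB cap."""
    return size_bytes < WINDOWS_EXE_LIMIT_BYTES


def check_llamafile(path: str) -> str:
    """Describe whether the file at `path` can ship as one file on Windows."""
    size = Path(path).stat().st_size
    gib = size / 1024**3
    if fits_windows_single_file(size):
        return f"{gib:.2f} GiB: small enough to run as a single file on Windows"
    return f"{gib:.2f} GiB: over the cap, keep the weights in a separate file on Windows"
```

Calling `check_llamafile("model.llamafile")` (a hypothetical file name) before copying a build to a USB drive shows whether the one-file workflow will hold on Windows machines or whether the weights need to travel alongside the executable.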
GitHub at a glance
mozilla-ai/llamafile
Stars: 25,065 (+60 this week)
Last commit: today (healthy)
Latest release: 0.10.3 (19d ago)
llamafile alternatives
Ollama: Ollama lets you download and run large language models locally via a simple CLI and REST API. Supports a growing library of open models including Llama, Mistral, and Gemma on Windows, Mac, and Linux with no data sent to the cloud.
Ensu: Offline AI chat from Ente that runs models entirely on your device, with no network calls required.
KoboldCpp: KoboldCpp is a self-contained local AI inference tool with a built-in web UI. It runs GGUF text, image, and speech models with no installation beyond a single binary, primarily aimed at creative writing and roleplay workflows.