llamafile

docs.mozilla.ai/llamafile

Open Source, No account, Windows, macOS, Linux, Free

A Mozilla project that packages a complete LLM and its runtime into a single executable file. Download one file and run it on Windows, macOS, or Linux with no installation, no dependencies, and no network connection required.

Our take

The portable approach has a real advantage: copy one file to a USB drive, run it on any machine, and the model never touches the internet. Because it is built on llama.cpp and Cosmopolitan Libc, the same file runs across a wide range of hardware without configuration. The main limitation is on Windows, which caps executables at 4 GB: larger models have to load their weights from a separate file, which slightly breaks the one-file promise. For air-gapped use, travel, or anyone who wants zero-setup local AI, llamafile is the most self-contained option available.
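The Windows size limit mentioned above is easy to check before copying a file to a Windows machine. The following is a minimal sketch, not part of llamafile itself: the function names are hypothetical, and it assumes the cap is exactly 4 GiB (4 * 1024**3 bytes), which may differ slightly from the limit Windows actually enforces.

```python
import os

# Assumption: the Windows executable cap is treated as 4 GiB.
WINDOWS_EXE_LIMIT_BYTES = 4 * 1024**3


def needs_external_weights(size_bytes: int, limit: int = WINDOWS_EXE_LIMIT_BYTES) -> bool:
    """Return True if a single-file build of this size would exceed the cap."""
    return size_bytes > limit


def file_needs_external_weights(path: str) -> bool:
    """Apply the same check to a file on disk (hypothetical helper)."""
    return needs_external_weights(os.path.getsize(path))
```

For example, `needs_external_weights(5 * 1024**3)` returns `True`, so under this assumption a 5 GiB single-file build would need its weights shipped separately on Windows, while a 2 GiB build would not.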

GitHub at a glance

mozilla-ai/llamafile

Stars: 25,065 (+60 this week)

Last commit: today (healthy)

Latest release: 0.10.3 (19d ago)

Listed in

Local AI

llamafile alternatives

Ollama: Download and run large language models locally via a simple CLI and REST API. Supports a growing library of open models including Llama, Mistral, and Gemma on Windows, Mac, and Linux with no data sent to the cloud.

Ensu: Offline AI chat from Ente that runs models entirely on your device, with no network calls required.

KoboldCpp: A self-contained local AI inference tool with a built-in web UI. It runs GGUF text, image, and speech models with no installation beyond a single binary and is primarily aimed at creative writing and roleplay workflows.