Ollama
Ollama lets you download and run large language models locally through a simple CLI and REST API. It supports a growing library of open models, including Llama, Mistral, and Gemma, on Windows, Mac, and Linux, with no data sent to the cloud.
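As a rough illustration of the REST API mentioned above, here is a minimal Python sketch that sends one prompt to a locally running Ollama server. It assumes the server is listening on its default address (localhost, port 11434) and that a model has already been pulled; the model name "llama3" and the helper names build_request and generate are placeholders chosen for this example, not part of Ollama itself.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        body = json.load(resp)
    return body["response"]


if __name__ == "__main__":
    # "llama3" is a placeholder; use a model you have already pulled.
    print(generate("llama3", "Why is the sky blue?"))
```

Because the request goes to localhost, the prompt and the reply stay on your machine. Setting "stream" to False asks for a single JSON reply instead of a stream of partial chunks, which keeps the example short.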
The Local AI tools we recommend, filtered to Windows.
A Mozilla project that packages a complete LLM and its runtime into a single executable file. Download one file, run it on Windows, Mac, or Linux with no installation, no dependencies, and no network connection required.
Offline AI chat from Ente that runs models entirely on your device, with no network calls required.
KoboldCpp is a self-contained local AI inference tool with a built-in web UI. It runs GGUF text, image, and speech models with no installation beyond a single binary, primarily aimed at creative writing and roleplay workflows.