AE Agent

Using local models (Ollama / LM Studio)

Run an AI model entirely on your own machine — free, private, and offline. AE Agent connects to Ollama, LM Studio, or any OpenAI-compatible endpoint running locally. Nothing leaves your computer.

Local models are great for privacy and zero cost, but tool-use quality varies by model. For Agent mode, use a capable recent model — llama3.1 or newer, or qwen2.5 or newer. Smaller or older models may struggle to call tools reliably.

Ollama

1

Install Ollama

Download and install Ollama for your OS from ollama.com/download. It runs a local server at http://localhost:11434.
2

Pull and run a model

From a terminal, pull a tool-capable model and start it:
ollama run llama3.1
The first run downloads the model. After that it stays available locally.
3

Connect it in AE Agent

Open Settings → Local AI, set the URL to:
http://localhost:11434
Click Detect models. AE Agent lists every model you've pulled. Pick one — it appears in the model picker under Local.

LM Studio

1

Install and load a model

Install LM Studio from lmstudio.ai, download a model from its library, and load it.
2

Start the local server

In LM Studio, open the Local Server tab and start it. It exposes an OpenAI-compatible endpoint at http://localhost:1234.
3

Connect it in AE Agent

In Settings → Local AI, set the URL to:
http://localhost:1234
Click Detect models and select your loaded model.

Custom OpenAI-compatible endpoints

Any server that speaks the OpenAI chat-completions API works the same way. Enter its base URL in Settings → Local AI and click Detect models. This covers self-hosted runtimes like llama.cpp's server, vLLM, and similar.

Default endpoints

Ollamahttp://localhost:11434
LM Studiohttp://localhost:1234
CustomYour OpenAI-compatible base URL
If Detect models finds nothing, confirm the local server is actually running and the URL/port matches. A model also has to be pulled (Ollama) or loaded (LM Studio) before it shows up.