AE Agent

Using local models (Ollama / LM Studio)

Run an AI model entirely on your own machine — free, private, and offline. AE Agent connects to Ollama, LM Studio, or any OpenAI-compatible endpoint running locally. Nothing leaves your computer.

Local models are great for privacy and zero cost, but tool-use quality varies by model. For Agent mode, use a capable recent model — llama3.1 or newer, or qwen2.5 or newer. Smaller or older models may struggle to call tools reliably.

Ollama

Install Ollama

Download and install Ollama for your OS from ollama.com/download. It runs a local server at http://localhost:11434.

Pull and run a model

From a terminal, pull a tool-capable model and start it:

ollama run llama3.1

The first run downloads the model. After that it stays available locally.

Connect it in AE Agent

Open Settings → Local AI, set the URL to:

http://localhost:11434

Click Detect models. AE Agent lists every model you've pulled. Pick one — it appears in the model picker under Local.

LM Studio

Install and load a model

Install LM Studio from lmstudio.ai, download a model from its library, and load it.

Start the local server

In LM Studio, open the Local Server tab and start it. It exposes an OpenAI-compatible endpoint at http://localhost:1234.

Connect it in AE Agent

In Settings → Local AI, set the URL to:

http://localhost:1234

Click Detect models and select your loaded model.

Custom OpenAI-compatible endpoints

Any server that speaks the OpenAI chat-completions API works the same way. Enter its base URL in Settings → Local AI and click Detect models. This covers self-hosted runtimes like llama.cpp's server, vLLM, and similar.

Default endpoints

Ollama	http://localhost:11434
LM Studio	http://localhost:1234
Custom	Your OpenAI-compatible base URL

If Detect models finds nothing, confirm the local server is actually running and the URL/port matches. A model also has to be pulled (Ollama) or loaded (LM Studio) before it shows up.

Connecting a provider Usage & cost