How to Use Ollama with Odysseus AI
Last updated: June 3, 2026
Odysseus is not an AI model.
It's a workspace and interface. Like a browser needs websites, Odysseus needs a model backend to do anything useful. This page explains your options.
Model Backends
Odysseus supports multiple inference backends. Pick one based on your hardware and priorities.
Ollama
Recommended for beginnersEasy local model serving. Install, pull a model, connect to Odysseus. Handles quantization and GPU offloading automatically.
vLLM
High-performance inference for NVIDIA GPUs. Best throughput for serving multiple users. Production-grade.
llama.cpp
CPU-optimized inference. Works without a GPU. Slower but runs on almost anything, including Raspberry Pi.
OpenRouter
No local hardware neededCloud API aggregator. Access 100+ models (Claude, GPT-4, Gemini, open-source) without local hardware. Pay per token.
OpenAI API
Use GPT-4o and other OpenAI models directly. Requires an API key.
Model Cookbook
Odysseus includes a built-in Model Cookbook with 270+ models. The Hardware Scanner detects your GPU, RAM, and storage, then recommends models that will actually run on your system. One-click download and serve, no terminal commands needed.
Find it in the Odysseus UI under Settings or the model selector.
Connecting Ollama to Odysseus
Step 1. Install Ollama
macOS/Windows: Download from ollama.com
Step 2. Pull a model
Takes a few minutes depending on model size and your connection.
Step 3. Make Ollama accessible to Docker
If Odysseus runs in Docker, Ollama needs to listen on all interfaces:
Skip this if both Odysseus and Ollama run natively (not in Docker).
Step 4. Add Ollama in Odysseus settings
Open Odysseus, go to Settings, and add a new model provider with the Ollama endpoint:
http://host.docker.internal:11434http://<host-ip>:11434http://localhost:11434Recommended Models by Hardware
8GB VRAM
Good for basic chat. Expect 10-20 tokens/sec.
16GB VRAM
Comfortable for daily use and code assistance.
24GB+ VRAM
Full capability with large quantized models.
No GPU
CPU inference is slow. Cloud APIs are the practical option.
See hardware requirements for a full breakdown with GPU model examples.
Using OpenRouter (Cloud Models)
OpenRouter is the easiest way to use Odysseus without local hardware. It aggregates 100+ models from multiple providers behind one API key.
Step 1. Create an OpenRouter account
Sign up at openrouter.ai and generate an API key.
Step 2. Add to Odysseus
In Odysseus Settings, add OpenRouter as a provider. Paste your API key. You'll get access to Claude, GPT-4, Gemini, Llama, Mistral, and many more.
Step 3. Pick a model and chat
Select any model from the model picker. Pricing is per-token and varies by model. Many open-source models have free tiers.
Model support changes frequently. Check the official GitHub repository for the latest supported backends and Cookbook updates.
Related Guides
Step-by-step installation for Docker, macOS, Windows, and Linux.
Fix common Odysseus issues: admin password, Docker errors, GPU problems.
GPU, RAM, and storage recommendations by model size.
True cost of running Odysseus locally vs. cloud AI subscriptions.