The Problem
I spent $43 on Anthropic API costs in two weeks. Almost all of it on Claude Opus 4.6 — the pink bars on my cost dashboard were terrifying. Opus is the smartest model available, but it’s also the most expensive by a massive margin. And here’s the thing: 90% of what I was using it for didn’t need that level of intelligence.
Morning briefings? Didn’t need Opus. Bookmark processing? Didn’t need Opus. Quick questions over Telegram? Definitely didn’t need Opus. I was taking an Uber to check my mailbox.
This guide shows you exactly how to set up multi-model routing in OpenClaw so you use cheap models for routine tasks and only switch to Opus when you actually need it. The result: the same functionality at roughly 2% of the cost.
What You’ll End Up With
• MiniMax M2.5 as your default model (near-Opus quality, 50–62x cheaper)
• Automatic fallback chain: if one model fails, the next one picks up
• Manual switching: type /model opus when you need the best, /model mini to go back
• Access to 300+ models through a single OpenRouter API key
• Monthly API costs of $5–15 instead of $80+

This guide assumes you already have OpenClaw running on a VPS with an Anthropic API key.
If you don’t, read my VPS setup guide first. Everything below builds on that foundation.
────────────────────────────────────────
Why Multi-Model Routing Matters
Right now, your OpenClaw sends every single message to one model. That’s like hiring a brain surgeon to do your laundry. The surgeon can do it, but it’s an absurd waste of money and talent.
Multi-model routing lets you assign different models to different tasks. Cheap models handle the routine stuff. Expensive models handle the hard stuff. And if any model goes down (rate limits, outages, errors), the system automatically falls through to the next model in your chain. You always get a response.
The Models Available Right Now
Here’s every model worth considering for OpenClaw in February 2026, ranked by cost:
The Cost Math
Let me make this concrete. Say you send 100 messages per day through OpenClaw, averaging 2,000 input tokens and 1,000 output tokens each. Here’s what that costs per month:
Read that last row. If you use MiniMax for 95% of your messages and only switch to Opus for the 5% that actually need it, your monthly cost drops from $202 to around $12. Most daily tasks don’t need Opus-level intelligence.
MiniMax M2.5 dropped on February 12, 2026 — four days ago. It scores 80.2% on SWE-Bench Verified, which is within 0.6% of Claude Opus 4.6. It runs at 80 tokens per second. It has a 204,800 token context window. And it costs $0.30 per million input tokens.
That’s 50x cheaper than Opus for input and 62x cheaper for output. For 90% of daily tasks, you will not notice a quality difference. This is the model that changes the economics of running an AI agent.
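The monthly numbers above are easy to sanity-check. A few lines of Python, using the workload described earlier (100 messages/day, 2,000 input and 1,000 output tokens each) and MiniMax M2.5's published pricing:

```python
# Sanity-check the monthly cost of an all-MiniMax workload.
# Assumes the workload from above: 100 messages/day,
# 2,000 input + 1,000 output tokens each, 30 days/month.
MSGS_PER_DAY = 100
DAYS = 30
IN_TOK, OUT_TOK = 2_000, 1_000

# MiniMax M2.5 pricing per million tokens
IN_PRICE, OUT_PRICE = 0.30, 1.20

monthly_in = MSGS_PER_DAY * DAYS * IN_TOK    # 6,000,000 input tokens
monthly_out = MSGS_PER_DAY * DAYS * OUT_TOK  # 3,000,000 output tokens

cost = monthly_in / 1e6 * IN_PRICE + monthly_out / 1e6 * OUT_PRICE
print(f"All-MiniMax month: ${cost:.2f}")  # $5.40
```

Run the same arithmetic with any model's per-million-token prices to compare.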
────────────────────────────────────────
Step 1: Create Your OpenRouter Account
OpenRouter is the key to everything. It’s a service that gives you access to 300+ AI models through a single API key. Instead of creating separate accounts with MiniMax, DeepSeek, Moonshot (Kimi), and others, you get one key that works with all of them.
Create the Account
At MiniMax M2.5 pricing ($0.30 input / $1.20 output per million tokens), $5 gets you roughly 3–4 million output tokens. That’s approximately 3,000–4,000 full conversations.
For context, $200 on Opus buys maybe 2,000 conversations. The same $5 on MiniMax buys more conversations than that entire $200 Opus spend did.
Generate Your API Key
Never share your OpenRouter API key. Never paste it in public. Never commit it to Git.
If it’s compromised, go to OpenRouter → Keys → delete the old key → create a new one.
────────────────────────────────────────
Step 2: Connect to Your VPS
Open Terminal on your Mac and SSH into your server:
ssh openclaw@YOUR_SERVER_IP

Replace YOUR_SERVER_IP with your Hetzner server’s IP address. Enter your password when prompted. You should now be logged in as the openclaw user.
Before we change anything, let’s stop the gateway cleanly:
openclaw gateway stop

This prevents any conflicts while we edit the configuration.
────────────────────────────────────────
Step 3: Edit Your OpenClaw Configuration
This is the core of the tutorial. We’re going to edit one file — openclaw.json — to add OpenRouter as a provider, add the models we want, and set up the routing logic.
Open the Config File
nano ~/.openclaw/openclaw.json

You’ll see a JSON file. It looks intimidating if you’ve never edited one before. Don’t worry — we’re only changing three sections.
JSON is just text organized with curly braces {} and square brackets [].
Every key has a value: "name": "value"
Sections are separated by commas. Missing a comma = the whole file breaks.
Use Ctrl+W to search for text in nano. This is how you’ll navigate the file.
ALWAYS make a backup before editing: cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.backup
Step 3a: Make a Backup First
Before you touch anything, back up the current config:
cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.backup
If anything goes wrong, you can restore it with:
cp ~/.openclaw/openclaw.json.backup ~/.openclaw/openclaw.json
Step 3b: Add OpenRouter as a Provider
In the config file, find the "providers" section. It probably looks something like this:
"providers": {
  "anthropic": {
    "apiKey": "sk-ant-api03-...",
    ...
  }
}
You need to add OpenRouter as a second provider. Place your cursor after the closing brace of the anthropic section (after the }), add a comma, and add the OpenRouter block. The result should look like this:
"providers": {
  "anthropic": {
    "apiKey": "sk-ant-api03-...",
    ...
  },
  "openrouter": {
    "baseUrl": "https://openrouter.ai/api/v1",
    "apiKey": "sk-or-v1-YOUR_OPENROUTER_KEY_HERE",
    "api": "openai-completions",
    "models": [
      {
        "id": "minimax/minimax-m2.5",
        "name": "MiniMax M2.5"
      },
      {
        "id": "moonshotai/kimi-k2.5",
        "name": "Kimi K2.5"
      },
      {
        "id": "deepseek/deepseek-chat-v3.2",
        "name": "DeepSeek V3.2"
      }
    ]
  }
}
The "api" field MUST be exactly "openai-completions" — not "openai-compatible", not "openai", not "openai-chat". This is the single most common config mistake. If you get it wrong, every request to OpenRouter will fail silently.
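A quick way to catch this mistake (along with any JSON syntax error) before restarting is a short Python check. This is an illustrative helper, not an OpenClaw command — the path and key names simply mirror the config shown above:

```python
import json
import os

# Illustrative config check (not part of OpenClaw): parse the config
# and verify the OpenRouter "api" field is exactly right.
def check_openrouter(path="~/.openclaw/openclaw.json"):
    with open(os.path.expanduser(path)) as f:
        cfg = json.load(f)  # raises with a line number if the JSON is broken
    api = cfg["providers"]["openrouter"]["api"]
    if api != "openai-completions":
        raise SystemExit(f'bad "api" value {api!r}; must be "openai-completions"')
    print("openrouter provider looks OK")
```

Run it after every config edit; a clean parse plus the right `"api"` value rules out the two most common failure modes.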
Step 3c: Set Up Model Routing
Now find the section where your model is configured. Look for something like "model" or "default" in the agents section. You need to set MiniMax as your primary model and create a fallback chain:
  "primary": "openrouter/minimax/minimax-m2.5",
  "fallbacks": [
    "openrouter/moonshotai/kimi-k2.5",
    "openrouter/deepseek/deepseek-chat-v3.2",
    "anthropic/claude-sonnet-4-20250514",
    "anthropic/claude-opus-4-6"
  ]
}
This tells OpenClaw: use MiniMax M2.5 by default. If MiniMax fails (rate limit, outage, error), try Kimi K2.5. If Kimi fails, try DeepSeek. Then Sonnet. Then Opus as an absolute last resort. You always get a response.
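Conceptually, that chain behaves like the loop below. This is a minimal sketch of the fallback behavior, not OpenClaw's actual implementation — `call_model` is a stand-in for whatever client performs the real request:

```python
# Minimal sketch of fallback-chain behavior (illustrative only).
PRIMARY = "openrouter/minimax/minimax-m2.5"
FALLBACKS = [
    "openrouter/moonshotai/kimi-k2.5",
    "openrouter/deepseek/deepseek-chat-v3.2",
    "anthropic/claude-sonnet-4-20250514",
    "anthropic/claude-opus-4-6",
]

def route(message, call_model):
    """Try the primary, then each fallback in order; first success wins."""
    errors = {}
    for model in [PRIMARY, *FALLBACKS]:
        try:
            return model, call_model(model, message)
        except Exception as exc:  # rate limit, outage, bad response, ...
            errors[model] = exc
    raise RuntimeError(f"all models failed: {errors}")
```

If MiniMax raises (say, a rate-limit error), the loop simply moves on to Kimi, and so on down the list — which is why the last entries should be the models you trust most, not the cheapest.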
Step 3d: Add Model Aliases
Aliases let you switch models quickly from Telegram or the TUI. Add this section:
"models": {
  "openrouter/minimax/minimax-m2.5": { "alias": "mini" },
  "openrouter/moonshotai/kimi-k2.5": { "alias": "kimi" },
  "openrouter/deepseek/deepseek-chat-v3.2": { "alias": "deep" },
  "anthropic/claude-sonnet-4-20250514": { "alias": "sonnet" },
  "anthropic/claude-opus-4-6": { "alias": "opus" }
}
Now you can type /model mini, /model opus, /model kimi etc. from Telegram to switch on the fly.
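Under the hood, an alias is just a lookup from the short name back to the full provider/model path. A minimal sketch of that mapping, using the aliases defined above (illustrative — not OpenClaw's own resolver):

```python
# Illustrative alias lookup, mirroring the "models" section above.
ALIASES = {
    "mini": "openrouter/minimax/minimax-m2.5",
    "kimi": "openrouter/moonshotai/kimi-k2.5",
    "deep": "openrouter/deepseek/deepseek-chat-v3.2",
    "sonnet": "anthropic/claude-sonnet-4-20250514",
    "opus": "anthropic/claude-opus-4-6",
}

def resolve(command: str) -> str:
    """Turn '/model mini' into the full provider/model path."""
    name = command.removeprefix("/model").strip()
    if name not in ALIASES:
        raise ValueError(f"unknown alias: {name!r}")
    return ALIASES[name]
```

This is also why the alias keys must use the full path including the provider prefix: the short name on the right has to map back to a model the router can actually find.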
Step 3e: Save the File
Save and exit nano:
Press Ctrl+X (to exit)
Press Y (to confirm save)
Press Enter (to confirm filename)
────────────────────────────────────────
Step 4: Validate and Restart
This is the step most tutorials skip, and it’s the one that saves you hours of debugging. Before restarting the gateway, validate your config:
openclaw doctor --fix

This command does three things: checks your JSON syntax for errors, validates all config keys against the current OpenClaw schema, and automatically removes any unrecognized keys that would crash the gateway.
If you see errors, the most common causes are a missing comma, a mismatched brace, or a wrong "api" value in the openrouter block (it must be exactly "openai-completions").
If Doctor Reports Errors
Don’t panic. Run this to see the exact error:
python3 -m json.tool ~/.openclaw/openclaw.json
Python’s JSON parser gives you the exact line number of the syntax error. Fix it in nano, then run doctor again.
If it’s too broken, restore your backup:
cp ~/.openclaw/openclaw.json.backup ~/.openclaw/openclaw.json

Once doctor passes with no errors, restart the gateway:
openclaw gateway restart

Wait 5–10 seconds for the gateway to fully initialize.
────────────────────────────────────────
Step 5: Test Everything
Test 1: Check Model Status
openclaw models status

You should see your new models listed with their providers. Confirm that the primary model shows as MiniMax M2.5 via OpenRouter.
Test 2: Send a Message via TUI
openclaw tui

Type "hi" and wait for a response. Check the status bar at the bottom — it should show MiniMax M2.5 as the active model. If you get a response, MiniMax is working.
Test 3: Test Model Switching
In the TUI or Telegram, try these commands:
/model opus → should switch to Claude Opus 4.6
/model mini → should switch back to MiniMax M2.5
/model kimi → should switch to Kimi K2.5
/model deep → should switch to DeepSeek V3.2
/model sonnet → should switch to Claude Sonnet 4.5

After each switch, send a quick message to confirm the model responds. Then switch back to mini — that’s your daily driver.
Test 4: Test via Telegram
Open Telegram, find your bot, and send "hi". Confirm it responds. Then try /model opus, send a message, and /model mini to switch back. If all of this works, you’re done.
What Success Looks Like
• Default messages go through MiniMax M2.5 (near-instant, cheap)
• /model opus switches to Opus for important tasks
• /model mini switches back to your cheap default
• If MiniMax has an outage, Kimi automatically takes over
• Your OpenRouter dashboard shows usage ticking up
• Your Anthropic dashboard shows usage dropping dramatically

────────────────────────────────────────
Step 6: How to Use This Day to Day
Now that everything is set up, here’s how to actually use multi-model routing in practice.
The 90/10 Rule
Use MiniMax (your default) for 90% of tasks. Only switch to Opus for the 10% that actually need it. Here’s how to decide:
• Morning briefings → MiniMax (default). Summarizing news doesn’t need frontier intelligence.
• Bookmark processing → MiniMax (default). Categorization and extraction are straightforward.
• Quick Telegram questions → MiniMax (default). Speed matters more than depth.
• Draft tweets and content → MiniMax (default). First drafts don’t need Opus quality.
• Trading journal entries → MiniMax (default). Logging trades is structured, not creative.
• Complex analysis → /model opus. Multi-step reasoning benefits from Opus.
• Creative writing → /model opus. Voice, nuance, and style need the best.
• Strategy documents → /model opus. Long-form structured thinking.
• Debugging config issues → /model opus. Complex troubleshooting.
• Final draft polish → /model opus. Switch to Opus for the last 10% of quality.

The Switching Workflow
In practice, my daily workflow looks like this:
The /new Command Still Matters
Even with cheap models, long conversation histories eat tokens. Type /new between unrelated conversations. It clears the session history (not your memory or personality) and keeps each message lean. This is especially important if you’re still on a low Anthropic tier and occasionally switch to Opus.
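To see why /new matters, consider how history compounds: if each request re-sends the full conversation so far, total input tokens grow roughly quadratically with conversation length. A back-of-envelope sketch, assuming ~2,000 tokens per turn (an illustrative figure):

```python
# Back-of-envelope: cumulative input tokens when every request
# re-sends the entire conversation history (~2,000 tokens/turn).
TOKENS_PER_TURN = 2_000

def total_input_tokens(turns: int) -> int:
    # Turn n re-sends turns 1..n, so the total is 2000 * (1 + 2 + ... + n).
    return TOKENS_PER_TURN * turns * (turns + 1) // 2

print(total_input_tokens(10))      # 110000 tokens for one 10-turn session
print(2 * total_input_tokens(5))   # 60000 for two 5-turn sessions split by /new
```

Splitting one long session into two with /new nearly halves the input tokens in this example, and the gap widens the longer you go without clearing.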
────────────────────────────────────────
Troubleshooting
"Error: Invalid API key" from OpenRouter
Your OpenRouter key is wrong or wasn’t saved properly. Open the config, find your openrouter provider block, and check the apiKey value. Make sure it starts with sk-or-v1- and has no extra spaces or line breaks.
"Error: Model not found"
The model ID is wrong. The exact IDs as of February 2026:
minimax/minimax-m2.5 → MiniMax M2.5
moonshotai/kimi-k2.5 → Kimi K2.5
deepseek/deepseek-chat-v3.2 → DeepSeek V3.2
anthropic/claude-opus-4-6 → Claude Opus 4.6
anthropic/claude-sonnet-4-20250514 → Claude Sonnet 4.5
Copy-paste these exactly. A single wrong character means the model won’t load.
Gateway Won’t Start After Config Change
Your JSON is broken. Check syntax:
python3 -m json.tool ~/.openclaw/openclaw.json
If it shows an error with a line number, open nano, go to that line (Ctrl+_ in nano, then type the line number), and fix it. Usually it’s a missing comma or mismatched brace.
If you can’t find the error, restore your backup:
cp ~/.openclaw/openclaw.json.backup ~/.openclaw/openclaw.json
openclaw gateway restart

Then try the edits again more carefully.
/model Command Not Switching
The aliases might not be configured correctly. Check that the "models" section in your config has the exact model path matching what’s in your providers section. The alias key must match the full model identifier including the provider prefix (e.g., "openrouter/minimax/minimax-m2.5", not just "minimax/minimax-m2.5").
MiniMax Responses Seem Lower Quality Than Expected
Two things to check. First, MiniMax M2.5 supports reasoning mode — make sure it’s enabled in your config if you want the model to think step-by-step before responding. Second, check that your SOUL.md isn’t too long. If your personality file is 3,000+ words, the model is spending most of its context window processing that instead of your actual question. Keep SOUL.md under 500 words.
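Checking the length takes one small script. This is an illustrative helper — adjust the path to wherever your SOUL.md actually lives:

```python
import os

# Illustrative check: warn if SOUL.md exceeds the ~500-word budget.
# The default path is an assumption; point it at your real file.
def soul_word_count(path="~/.openclaw/SOUL.md"):
    with open(os.path.expanduser(path)) as f:
        words = len(f.read().split())
    status = "OK" if words <= 500 else "too long, trim it"
    print(f"SOUL.md: {words} words ({status})")
    return words
```

Word count is a rough proxy for tokens, but it is good enough to tell a 360-word file from a 3,000-word one.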
────────────────────────────────────────
Monitor Your Costs
After switching, check both dashboards to confirm the savings:
OpenRouter Dashboard
Go to openrouter.ai → Activity. You’ll see every request, which model handled it, and the cost. This should be your primary dashboard now since most traffic goes through OpenRouter.
Anthropic Dashboard
Go to console.anthropic.com → Cost. Your daily bars should drop dramatically. You’ll still see Opus usage when you manually switch to it, but the volume should be 90% lower than before.
Expected Monthly Costs
────────────────────────────────────────
What to Add Next
Once you’re comfortable with this setup, there are a few more optimizations worth exploring:
Add More Models Over Time
OpenRouter gives you access to 300+ models. You can add new ones to your config anytime. Just add them to the models array in your openrouter provider section and create an alias. Some worth watching:
Task-Specific Model Assignment
The next level is configuring OpenClaw to automatically use different models for different types of tasks. For example: cron jobs (morning briefings) always use MiniMax, direct Telegram messages use MiniMax by default, and specific commands trigger Opus automatically. Check the OpenClaw docs for task-routing configuration — this varies by version.
SOUL.md Compression
If you haven’t already, compress your SOUL.md personality file. Mine went from 3,000 words to 360 words — an 88% reduction. Since SOUL.md is sent with every single message, this saves thousands of tokens per day regardless of which model you’re using. The savings compound with multi-model routing.
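The compounding is easy to quantify. Assuming roughly 1.3 tokens per English word and the 100-messages/day workload from earlier (both figures are illustrative assumptions):

```python
# Rough daily token savings from compressing SOUL.md, given that the
# file rides along with every message. Assumptions: ~1.3 tokens/word
# and 100 messages/day (both illustrative).
TOKENS_PER_WORD = 1.3
MSGS_PER_DAY = 100

saved_words = 3000 - 360  # before vs. after compression
saved_tokens_per_msg = saved_words * TOKENS_PER_WORD
daily_savings = saved_tokens_per_msg * MSGS_PER_DAY
print(f"~{daily_savings:,.0f} input tokens saved per day")
```

That is hundreds of thousands of input tokens per day that no model, cheap or expensive, has to process.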
────────────────────────────────────────
You don’t have to choose between quality and cost. Multi-model routing gives you both.
MiniMax M2.5 at $0.30 per million input tokens delivers 95% of Opus quality for 2% of Opus cost. Use it for everything by default. Switch to Opus for the 10% of tasks where that last 5% of quality actually matters.
Your $200/week becomes $5/week. Same agent. Same capabilities. Same 24/7 availability. Just smarter about which brain handles which task.
Stop paying Opus prices for bookmark processing. Set this up tonight.
────────────────────────────────────────
Quick Reference Card
Pin this somewhere. These are the commands you’ll use every day:
• /model mini → Switch to MiniMax M2.5 (cheap default)
• /model opus → Switch to Claude Opus 4.6 (best quality)
• /model sonnet → Switch to Claude Sonnet 4.5 (middle ground)
• /model kimi → Switch to Kimi K2.5 (cheap alternative)
• /model deep → Switch to DeepSeek V3.2 (cheapest)
• /new → Clear conversation history (saves tokens)
• openclaw models status → Check which model is active (via SSH)
• openclaw doctor --fix → Validate config after any changes
• openclaw gateway restart → Restart after config changes
• openclaw logs --follow → Watch live logs for debugging

From $200/week to $5/week. Same agent, 97% cheaper.
@astergod