$1 free credit on signup · No credit card · First call in 5 minutes

Automatically choose the right model for every LLM call

Costr estimates the complexity of each request, then chooses a lower-cost model that still meets the quality bar. Simple tasks no longer waste premium models, and complex tasks still have a high-quality fallback.

Auto Routing Result

Routing status: Quality passed
Complexity: Low
Baseline model: Premium ($25/MTok)
Costr selected: Economy ($3/MTok)
Estimated savings: 88%

Why LLM costs keep getting out of control

Simple tasks still use premium models

Classification, extraction, and light summaries often do not need expensive models, but many systems route everything to premium by default.

Multi-step agents multiply cost

One user request can trigger many model calls. If every step uses a premium model, cost compounds quickly.
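The compounding effect can be made concrete with a little arithmetic. The sketch below uses the reference prices shown on this page ($25/MTok for Premium, $3/MTok for Economy); the step count and tokens per step are hypothetical numbers chosen only for illustration, and `request_cost` is an illustrative helper, not part of Costr.

```python
# Illustrative arithmetic only. Prices are this page's reference tier prices
# in USD per million tokens; the agent shape below is a made-up example.
PREMIUM_PER_MTOK = 25.0
ECONOMY_PER_MTOK = 3.0


def request_cost(steps: int, tokens_per_step: int, price_per_mtok: float) -> float:
    """Cost in USD of `steps` model calls that each use `tokens_per_step` tokens."""
    return steps * tokens_per_step * price_per_mtok / 1_000_000


# Hypothetical agent: 8 model calls per user request, 2,000 tokens per call.
all_premium = request_cost(8, 2_000, PREMIUM_PER_MTOK)

# Same agent, assuming 6 of the 8 steps are simple enough for Economy.
mixed = request_cost(6, 2_000, ECONOMY_PER_MTOK) + request_cost(2, 2_000, PREMIUM_PER_MTOK)

print(f"all premium: ${all_premium:.3f} per user request")
print(f"mixed tiers: ${mixed:.3f} per user request")
```

Under these assumptions the per-request cost falls from $0.40 to about $0.14, and the gap grows with every additional step the agent takes.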

Manual model selection does not scale

As model catalogs grow, developers cannot keep hand-maintaining the best model for every task.

Cheap models carry quality risk

Blindly using low-cost models can hurt output quality, so routing needs quality floors and fallback.

How Costr automatically lowers cost

1. Receive request

Compatible with OpenAI / Anthropic SDKs. Only change base_url.

2. Estimate complexity

Classify whether the request is low, medium, or high complexity.

3. Select model tier

Low-complexity requests prefer Economy, normal requests use Standard, and complex requests use Premium.

4. Choose from model pool

Select a low-cost model that clears the quality threshold inside the matching tier pool.

Costr does not blindly pick cheap models. It chooses lower-cost models inside a quality threshold.
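The selection steps above can be sketched in a few lines. This is a minimal illustration, not Costr's actual implementation: the model names, prices, quality scores, and quality floors are all hypothetical, and the real scoring logic is not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    price_per_mtok: float  # USD per million tokens (hypothetical)
    quality: float         # hypothetical quality score in [0, 1]


# Assumed mapping from estimated complexity to tier, following step 3.
TIER_FOR_COMPLEXITY = {"low": "economy", "medium": "standard", "high": "premium"}

# Hypothetical quality floors per tier.
QUALITY_FLOOR = {"economy": 0.60, "standard": 0.75, "premium": 0.90}

# Hypothetical model pools; in practice these would be the enabled models per tier.
POOLS = {
    "economy": [Model("small-a", 0.5, 0.55), Model("small-b", 1.0, 0.65), Model("small-c", 2.0, 0.70)],
    "standard": [Model("mid-a", 6.0, 0.80), Model("mid-b", 9.0, 0.85)],
    "premium": [Model("large-a", 20.0, 0.95)],
}


def select_model(complexity: str) -> Model:
    """Pick the cheapest model in the matching tier that clears the quality floor."""
    tier = TIER_FOR_COMPLEXITY[complexity]
    floor = QUALITY_FLOOR[tier]
    candidates = [m for m in POOLS[tier] if m.quality >= floor]
    if not candidates:
        raise LookupError(f"no model in tier {tier!r} clears quality floor {floor}")
    return min(candidates, key=lambda m: m.price_per_mtok)


print(select_model("low").name)   # small-b: small-a is cheaper but below the floor
print(select_model("high").name)  # large-a
```

The point of the sketch is the order of operations: filter by quality first, then minimize price, so the cheapest model is never chosen if it falls below the floor.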

Core capabilities

Smart routing

Automatically selects Economy, Standard, or Premium based on request complexity.

Cost savings

Simple extraction, classification, and light summarization avoid wasting premium models.

Quality protection

Quality floors, automatic upgrades, and fallback keep savings from damaging critical outputs.

SDK compatible

Works with OpenAI and Anthropic SDKs. Usually you only change base_url and use model="auto".

Observable calls

Every request records complexity, tier, model, cost, savings, and status for debugging.

Configurable pools

Users choose which models are available in each tier; Costr scores within that pool.

Three tiers for different task complexity

Use Auto mode to let Costr choose a tier for each request, or specify a tier with the model parameter.

Economy

$3 / MTok

Best for:

Simple Q&A, extraction, classification, format conversion, and light summaries.

Standard

$12 / MTok

Best for:

Code generation, summarization, data analysis, medium reasoning, and business tasks.

Premium

$25 / MTok

Best for:

Complex reasoning, architecture, deep research, and high-risk decisions.

These are reference prices for tier-routed requests. The actual model is selected from the enabled pool for that tier.
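As a quick check on what these reference prices imply, the snippet below computes the percentage saved when a request that would have gone to Premium is routed to a cheaper tier. `savings_pct` is an illustrative helper, and the result assumes the same token count on both models.

```python
# Reference tier prices from this page, in USD per million tokens.
TIER_PRICE = {"economy": 3.0, "standard": 12.0, "premium": 25.0}


def savings_pct(baseline_tier: str, selected_tier: str) -> float:
    """Percentage saved versus the baseline tier for the same number of tokens."""
    baseline = TIER_PRICE[baseline_tier]
    selected = TIER_PRICE[selected_tier]
    return (baseline - selected) * 100 / baseline


print(savings_pct("premium", "economy"))   # 88.0
print(savings_pct("premium", "standard"))  # 52.0
```

Routing from Premium to Economy gives (25 - 3) / 25 = 88%, which matches the estimated savings shown in the routing example at the top of the page.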

Make your first call in 5 minutes

1. Register and claim free credit
2. Create an API key
3. Change the SDK base_url
4. Use model="auto"
5. Review results in the dashboard
example.py
from openai import OpenAI

# Initialize with Costr as the base URL
client = OpenAI(
  base_url="https://costr.gopluslabs.io/v1",
  api_key="cr-your-key"
)

# Use auto mode to trigger smart routing
response = client.chat.completions.create(
  model="auto",
  messages=[
    {"role": "user", "content": "Extract the invoice amount: total is $299"}
  ]
)

print(response.choices[0].message.content)

Savings without quality sacrifice

Quality floors

Harder tasks require stronger candidate models; weak candidates are filtered out.

Automatic fallback

If an upstream model fails, Costr retries available candidates to reduce single-model failures.
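The fallback idea can be sketched as a loop over candidates. This is a generic pattern under stated assumptions, not Costr's internal code: `call_with_fallback`, the candidate names, and the `fake_upstream` function are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def call_with_fallback(
    candidates: Sequence[str],
    call: Callable[[str], str],
) -> Tuple[str, str]:
    """Try each candidate model in order and return (model, output) for the first success."""
    errors: List[str] = []
    for model in candidates:
        try:
            return model, call(model)
        except Exception as exc:  # a real router would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all candidates failed: " + "; ".join(errors))


def fake_upstream(model: str) -> str:
    """Stand-in for an upstream provider call; the first model is simulated as down."""
    if model == "small-a":
        raise ConnectionError("upstream unavailable")
    return f"answer from {model}"


model, output = call_with_fallback(["small-a", "small-b"], fake_upstream)
print(model, "->", output)  # small-b -> answer from small-b
```

Ordering the candidate list by price keeps the retry as cheap as possible, while still guaranteeing that a single provider outage does not fail the request.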

Automatic upgrade

High-complexity or high-risk tasks route to stronger tiers when quality matters.

Visible results

The dashboard shows complexity, tier, model, cost, and savings for each call.

Ready to make every LLM call cheaper?

Start with a free demo and see how much your agent, support, coding, or document workflows can save.