$1 free credit on signup · No credit card · First call in 5 minutes

Automatically choose the right model for every LLM call

Costr evaluates each request's complexity, then chooses a lower-cost model that still meets the quality bar. Simple tasks no longer waste premium models, and complex tasks still have a high-quality fallback.

Start using | Free demo

Auto Routing Result

Routing status: quality passed
Complexity: Low
Baseline model: Premium ($25/MTok)
Costr selected: Economy ($3/MTok)
Estimated savings: 88%
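The 88% figure follows directly from the two per-token prices in the result card. A minimal sketch of that arithmetic, assuming savings are measured relative to the baseline price (the function name is illustrative, not part of Costr):

```python
def savings_pct(baseline_per_mtok: float, selected_per_mtok: float) -> float:
    """Percentage saved by the selected model relative to the baseline price."""
    return (baseline_per_mtok - selected_per_mtok) / baseline_per_mtok * 100


# Premium baseline at $25/MTok versus Economy at $3/MTok
print(round(savings_pct(25.0, 3.0)))  # 88
```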

Why LLM costs keep getting out of control

Simple tasks still use premium models

Classification, extraction, and light summaries often do not need expensive models, but many systems route everything to premium by default.

Multi-step agents multiply cost

One user request can trigger many model calls. If every step uses a premium model, cost compounds quickly.
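To make the compounding concrete, here is a rough sketch that compares an all-Premium agent run with a routed one, using the reference tier prices listed on this page. The step count, the tokens per step, and the tier mix are made-up assumptions for illustration, not measured Costr results.

```python
# Reference tier prices from this page, in dollars per million tokens
PRICE_PER_MTOK = {"economy": 3.0, "standard": 12.0, "premium": 25.0}


def run_cost(step_tiers, tokens_per_step):
    """Total dollar cost of an agent run, one tier label per step."""
    return sum(PRICE_PER_MTOK[tier] * tokens_per_step / 1_000_000 for tier in step_tiers)


# Hypothetical 10-step agent run, 2,000 tokens per step
all_premium = run_cost(["premium"] * 10, 2_000)
routed = run_cost(["economy"] * 7 + ["standard"] * 2 + ["premium"], 2_000)

print(f"all premium: ${all_premium:.2f}")  # all premium: $0.50
print(f"routed:      ${routed:.2f}")       # routed:      $0.14
```

Under these assumptions the routed run costs roughly a quarter of the all-Premium run; the real ratio depends on how many steps are simple enough for a lower tier.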

Manual model selection does not scale

As model catalogs grow, developers cannot keep hand-maintaining the best model for every task.

Cheap models have quality risk

Blindly using low-cost models can hurt output quality, so routing needs quality floors and fallback.

How Costr automatically lowers cost

Receive request

Compatible with OpenAI / Anthropic SDKs. Only change base_url.

Estimate complexity

Classify whether the request is low, medium, or high complexity.

Select model tier

Low-complexity requests prefer Economy, medium-complexity requests use Standard, and high-complexity requests use Premium.

Choose from model pool

Select a low-cost model that clears the quality threshold inside the matching tier pool.

Costr does not blindly pick the cheapest model. It chooses a lower-cost model only from candidates that meet the quality threshold.
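The tier and pool steps above can be sketched in a few lines. This is an illustration under stated assumptions, not Costr's actual implementation: the candidate names, quality scores, and floor values are hypothetical, and the complexity label is taken as an input because the page does not describe how it is estimated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str              # hypothetical model identifier
    price_per_mtok: float  # dollars per million tokens
    quality: float         # hypothetical quality score in [0, 1]


# Complexity-to-tier mapping as described on this page
TIER_FOR_COMPLEXITY = {"low": "economy", "medium": "standard", "high": "premium"}

# Hypothetical pools and quality floors; real values would come from configuration
POOLS = {
    "economy": [Candidate("econ-a", 2.0, 0.55), Candidate("econ-b", 3.0, 0.72)],
    "standard": [Candidate("std-a", 12.0, 0.85)],
    "premium": [Candidate("prem-a", 25.0, 0.95)],
}
QUALITY_FLOOR = {"economy": 0.60, "standard": 0.80, "premium": 0.90}


def select_model(complexity, pools, quality_floor):
    """Return (tier, candidate): the cheapest candidate in the tier that clears the floor."""
    tier = TIER_FOR_COMPLEXITY[complexity]
    eligible = [c for c in pools[tier] if c.quality >= quality_floor[tier]]
    if not eligible:
        raise LookupError(f"no candidate in the {tier} pool clears the quality floor")
    return tier, min(eligible, key=lambda c: c.price_per_mtok)


tier, model = select_model("low", POOLS, QUALITY_FLOOR)
print(tier, model.name)  # economy econ-b
```

Note that the cheapest Economy candidate (`econ-a`) is skipped because it falls below the floor, which is the difference between quality-bounded selection and simply picking the lowest price.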

Core capabilities

Smart routing

Automatically selects Economy, Standard, or Premium based on request complexity.

Cost savings

Simple extraction, classification, and light summarization avoid wasting premium models.

Quality protection

Quality floors, automatic upgrades, and fallback keep savings from damaging critical outputs.

SDK compatible

Works with OpenAI and Anthropic SDKs. Usually you only change base_url and use model="auto".

Observable calls

Every request records complexity, tier, model, cost, savings, and status for debugging.

Configurable pools

Users choose which models are available in each tier; Costr scores within that pool.

Route each workflow step by task difficulty

Pick a scenario or enter your own prompt in the free demo to compare a baseline model with Costr routing.

Customer service Agent

Intent detection, order extraction, and reply drafting do not all need premium models.

Best for: intent detection / extraction / reply drafting

Code Agent

Code explanation, test generation, debugging, and edits can route by difficulty.

Best for: code explanation / tests / debugging

Market research Agent

Search summaries, extraction, comparisons, and reports can reduce repeated-call cost.

Best for: extraction / source comparison / report summaries

Data extraction

Field extraction, classification, and format conversion usually fit low-cost models.

Best for: fields / classification / format conversion

Document summaries

Long summaries, meeting notes, and knowledge-base tasks can route by complexity.

Best for: summaries / action items / knowledge organization

Web3 security analysis

Token risk explanations, contract summaries, and wallet behavior analysis can route by risk.

Best for: risk explanation / security summaries / wallet behavior

Want to see what your workflow can save?

Pick a scenario or enter your own prompt in the free demo to compare a baseline model with Costr routing.

Open free demo

Three tiers for different task complexity

Use Auto mode to let Costr choose a tier for each request, or specify a tier with the model parameter.

Economy

$3 / MTok

Best for: simple Q&A, extraction, classification, format conversion, and light summaries.

Standard

$12 / MTok

Best for: code generation, summarization, data analysis, medium reasoning, and business tasks.

Premium

$25 / MTok

Best for: complex reasoning, architecture, deep research, and high-risk decisions.

These are reference prices for tier-routed requests. The actual model is selected from the enabled pool for that tier.

Make your first call in 5 minutes

1. Register and claim free credit
2. Create an API key
3. Change the SDK base_url
4. Use model="auto"
5. Review results in the dashboard

example.py

from openai import OpenAI

# Initialize with Costr as the base URL
client = OpenAI(
  base_url="https://costr.gopluslabs.io/v1",
  api_key="cr-your-key"
)

# Use auto mode to trigger smart routing
response = client.chat.completions.create(
  model="auto",
  messages=[
    {"role": "user", "content": "Extract the invoice amount: total is $299"}
  ]
)

print(response.choices[0].message.content)

Savings without quality sacrifice

Quality floors

Harder tasks require stronger candidate models; weak candidates are filtered out.

Automatic fallback

If an upstream model fails, Costr retries with other available candidates, so a single model's outage is less likely to fail the request.
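The fallback behavior described here resembles a simple try-next-candidate loop. The sketch below shows that general pattern only; the function names, the candidate names, and the simulated failure are hypothetical, and Costr's own retry policy may differ.

```python
from typing import Callable, Sequence, Tuple


def call_with_fallback(
    candidates: Sequence[str], call_model: Callable[[str], str]
) -> Tuple[str, str]:
    """Try each candidate in order and return (model_name, reply) from the first success."""
    failures = []
    for name in candidates:
        try:
            return name, call_model(name)
        except Exception as exc:  # a real client would catch narrower error types
            failures.append(f"{name}: {exc}")
    raise RuntimeError("all candidates failed: " + "; ".join(failures))


def flaky_upstream(name: str) -> str:
    """Stand-in for an upstream call; the first candidate is made to fail."""
    if name == "model-a":
        raise TimeoutError("upstream timed out")
    return f"response from {name}"


used, reply = call_with_fallback(["model-a", "model-b"], flaky_upstream)
print(used, "->", reply)  # model-b -> response from model-b
```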

Automatic upgrade

High-complexity or high-risk tasks route to stronger tiers when quality matters.

Visible results

The dashboard shows complexity, tier, model, cost, and savings for each call.

Ready to make every LLM call cheaper?

Start with a free demo and see how much your Agent, support, coding, or document workflows can save.

Start using | Free demo
