Automatically choose the right model for every LLM call
Costr evaluates request complexity, then chooses a lower-cost model that still meets the quality bar. Simple tasks no longer waste premium models, and complex tasks still have a high-quality fallback.
Why LLM costs keep getting out of control
Simple tasks still use premium models
Classification, extraction, and light summaries often do not need expensive models, but many systems route everything to premium by default.
Multi-step agents multiply cost
One user request can trigger many model calls. If every step uses a premium model, cost compounds quickly.
Manual model selection does not scale
As model catalogs grow, developers cannot keep hand-maintaining the best model for every task.
Cheap models carry quality risk
Blindly using low-cost models can hurt output quality, so routing needs quality floors and fallback.
How Costr automatically lowers cost
Receive request
Compatible with the OpenAI and Anthropic SDKs. You only need to change base_url.
Estimate complexity
Classify whether the request is low, medium, or high complexity.
Select model tier
Low-complexity requests prefer Economy, medium-complexity requests use Standard, and high-complexity requests use Premium.
Choose from model pool
Within the matching tier's pool, select a low-cost model that clears the quality threshold.
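The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Costr's actual implementation: the keyword heuristic, the model names, the quality scores, the prices, and the per-tier quality floors are all made-up placeholders.

```python
from dataclasses import dataclass

# Assumed mapping from complexity class to tier, following the tier names in the text.
TIER_FOR_COMPLEXITY = {"low": "economy", "medium": "standard", "high": "premium"}


@dataclass(frozen=True)
class Candidate:
    name: str           # placeholder model name, not a real catalog entry
    quality: float      # assumed quality score in the range 0..1
    cost_per_1k: float  # assumed price in USD per 1K tokens


# Hypothetical tier pools; in a real deployment these come from configuration.
POOLS = {
    "economy": [Candidate("econ-a", 0.62, 0.10), Candidate("econ-b", 0.70, 0.15)],
    "standard": [Candidate("std-a", 0.80, 0.60), Candidate("std-b", 0.84, 0.90)],
    "premium": [Candidate("prem-a", 0.93, 3.00), Candidate("prem-b", 0.95, 5.00)],
}

# Hypothetical quality floor for each tier.
QUALITY_FLOOR = {"economy": 0.65, "standard": 0.78, "premium": 0.90}

# Toy signal for hard requests; a real classifier would be far richer.
HARD_HINTS = ("architecture", "prove", "debug", "design", "research")


def estimate_complexity(prompt: str) -> str:
    """Classify a prompt as low, medium, or high complexity (toy heuristic)."""
    text = prompt.lower()
    if any(hint in text for hint in HARD_HINTS):
        return "high"
    if len(text.split()) > 60:
        return "medium"
    return "low"


def choose_model(prompt: str) -> Candidate:
    """Pick the cheapest candidate in the matching tier that clears the floor."""
    tier = TIER_FOR_COMPLEXITY[estimate_complexity(prompt)]
    eligible = [c for c in POOLS[tier] if c.quality >= QUALITY_FLOOR[tier]]
    if not eligible:
        raise LookupError(f"no candidate in tier {tier!r} clears the quality floor")
    return min(eligible, key=lambda c: c.cost_per_1k)
```

With these placeholder numbers, a short extraction prompt lands in the Economy pool, and the cheapest Economy candidate is skipped because it falls below the assumed floor.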
Core capabilities
Smart routing
Automatically selects Economy, Standard, or Premium based on request complexity.
Cost savings
Simple extraction, classification, and light summarization avoid wasting premium models.
Quality protection
Quality floors, automatic upgrades, and fallback keep savings from damaging critical outputs.
SDK compatible
Works with OpenAI and Anthropic SDKs. Usually you only change base_url and use model="auto".
Observable calls
Every request records complexity, tier, model, cost, savings, and status for debugging.
Configurable pools
Users choose which models are available in each tier; Costr scores within that pool.
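To make the "Observable calls" fields concrete, here is a hedged sketch of what one per-request record and a savings summary could look like. The class name, field names, and the definition of savings (baseline cost minus actual cost) are assumptions for illustration; the source only lists which fields are recorded.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class CallRecord:
    # Field names are hypothetical; they mirror the fields listed in the text.
    complexity: str           # "low", "medium", or "high"
    tier: str                 # "economy", "standard", or "premium"
    model: str                # model that served the request
    cost_usd: float           # what the routed call cost
    baseline_cost_usd: float  # assumed cost of the same call on a baseline model
    status: str               # "ok" or "error"

    @property
    def savings_usd(self) -> float:
        """Savings defined here as baseline cost minus actual cost (assumption)."""
        return self.baseline_cost_usd - self.cost_usd


def summarize(records: Iterable[CallRecord]) -> Dict[str, float]:
    """Aggregate cost and savings over successful calls only."""
    ok = [r for r in records if r.status == "ok"]
    cost = sum(r.cost_usd for r in ok)
    baseline = sum(r.baseline_cost_usd for r in ok)
    pct = 0.0 if baseline == 0 else 100.0 * (baseline - cost) / baseline
    return {
        "calls": len(ok),
        "cost_usd": cost,
        "savings_usd": baseline - cost,
        "savings_pct": pct,
    }
```

Keeping the baseline cost on each record makes the savings figure auditable per call rather than only in aggregate.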
Route each workflow step by task difficulty
Pick a scenario or enter your own prompt in the free demo to compare a baseline model with Costr routing.
Customer service Agent
Intent detection, order extraction, and reply drafting do not all need premium models.
Best for: intent detection / extraction / reply drafting
Code Agent
Code explanation, test generation, debugging, and edits can route by difficulty.
Best for: code explanation / tests / debugging
Market research Agent
Search summaries, extraction, comparisons, and reports can reduce repeated-call cost.
Best for: extraction / source comparison / report summaries
Data extraction
Field extraction, classification, and format conversion usually fit low-cost models.
Best for: fields / classification / format conversion
Document summaries
Long summaries, meeting notes, and knowledge-base tasks can route by complexity.
Best for: summaries / action items / knowledge organization
Web3 security analysis
Token risk explanations, contract summaries, and wallet-behavior analysis can route by risk.
Best for: risk explanation / security summaries / wallet behavior
Want to see what your workflow can save?
Pick a scenario or enter your own prompt in the free demo to compare a baseline model with Costr routing.
Three tiers for different task complexity
Use Auto mode to let Costr choose a tier for each request, or specify a tier with the model parameter.
Economy
Best for: simple Q&A, extraction, classification, format conversion, and light summaries.
Standard
Best for: code generation, summarization, data analysis, medium reasoning, and business tasks.
Premium
Best for: complex reasoning, architecture, deep research, and high-risk decisions.
These are reference prices for tier-routed requests. The actual model is selected from the enabled pool for that tier.
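The choice between Auto mode and an explicit tier can be pictured as a small resolution step on the model parameter. The sketch below is an assumption-heavy illustration: only model="auto" appears in the source, and the lowercase tier strings ("economy", "standard", "premium") are hypothetical values chosen for this example, not documented identifiers.

```python
# Hypothetical tier identifiers; the source does not state the exact strings.
TIERS = ("economy", "standard", "premium")


def resolve_tier(model: str, routed_tier: str) -> str:
    """Return the tier to use for a request.

    model       -- value of the request's model parameter
    routed_tier -- tier the router would pick on its own in Auto mode
    """
    name = model.strip().lower()
    if name == "auto":
        return routed_tier  # Auto mode: defer to the router's decision
    if name in TIERS:
        return name         # explicit tier: override the router
    raise ValueError(f"unknown model or tier: {model!r}")
```

An explicit tier is useful when a workflow step is known to be high-risk and should not depend on the complexity estimate.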
Make your first call in 5 minutes
```python
from openai import OpenAI

# Initialize with Costr as the base URL
client = OpenAI(
    base_url="https://costr.gopluslabs.io/v1",
    api_key="cr-your-key",
)

# Use auto mode to trigger smart routing
response = client.chat.completions.create(
    model="auto",
    messages=[
        {"role": "user", "content": "Extract the invoice amount: total is $299"}
    ],
)

print(response.choices[0].message.content)
```
Savings without quality sacrifice
Quality floors
Harder tasks require stronger candidate models; weak candidates are filtered out.
Automatic fallback
If an upstream model fails, Costr retries with other available candidates, so a single model outage does not fail the request.
Automatic upgrade
High-complexity or high-risk tasks route to stronger tiers when quality matters.
Visible results
The dashboard shows complexity, tier, model, cost, and savings for each call.
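Quality floors and automatic fallback combine naturally: filter out weak candidates first, then try the remaining ones in order until one succeeds. The following is a minimal sketch of that idea, not Costr's internal logic. The UpstreamError class, the send callback, and the candidate tuples are hypothetical names introduced for illustration, and candidates are assumed to arrive sorted cheapest first.

```python
from typing import Callable, List, Optional, Tuple


class UpstreamError(Exception):
    """Hypothetical error raised when a model provider call fails."""


def call_with_fallback(
    candidates: List[Tuple[str, float]],  # (model name, assumed quality score)
    quality_floor: float,
    send: Callable[[str], str],           # hypothetical function that calls one model
) -> Tuple[str, str]:
    """Return (model name, response) from the first eligible model that succeeds."""
    # Quality floor: candidates below the floor are never attempted.
    eligible = [name for name, quality in candidates if quality >= quality_floor]

    last_error: Optional[Exception] = None
    for name in eligible:
        try:
            return name, send(name)
        except UpstreamError as err:
            last_error = err  # remember the failure and try the next candidate

    raise RuntimeError("no eligible candidate succeeded") from last_error
```

Filtering before retrying means a fallback can never quietly downgrade a request to a model below the floor.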
Ready to make every LLM call cheaper?
Start with a free demo and see how much your Agent, support, coding, or document workflows can save.