API Reference
Endpoints
Auto Route
POST https://api.greatrouterai.com/v1/auto/route
Classifies your prompt, selects the optimal model, and proxies the request. This is the primary endpoint for most use cases.
Explicit Route
POST https://api.greatrouterai.com/v1/models/route
Routes to a model matching your explicit filters. Use this when you know the task type and want fine-grained control.
OpenAI-Compatible Chat Completions
POST https://api.greatrouterai.com/v1/chat/completions
Drop-in shim for OpenAI SDK clients. Set baseURL to https://api.greatrouterai.com/v1 and use model: "router" for auto-routing, or pass an explicit model id.
{
  "model": "router",
  "messages": [{ "role": "user", "content": "Write a haiku about routing" }]
}
The response follows the OpenAI chat.completion shape, with an added greatrouter metadata object (classified task, routing reason, fallback chain, estimated cost). Set stream: true for token-by-token SSE streaming on supported models. /v1/auto/route returns complete responses only.
Optional headers:
X-GR-Cache-TTL caches identical prompts for the given number of seconds (max 3600).
X-GR-Guardrails: moderate blocks obvious PII patterns before inference.
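As a rough illustration of the chat completions request and the optional headers, the sketch below builds the HTTP request with the standard library but does not send it. The helper name build_chat_request and its parameters are hypothetical. Sending the headers with any value other than those documented above is untested here.

```python
import json
import urllib.request

BASE_URL = "https://api.greatrouterai.com/v1"


def build_chat_request(api_key, prompt, cache_ttl=None, guardrails=False, stream=False):
    """Build (but do not send) a chat completions request using model "router"."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if cache_ttl is not None:
        # The reference states a maximum of 3600 seconds.
        if cache_ttl > 3600:
            raise ValueError("X-GR-Cache-TTL must not exceed 3600 seconds")
        headers["X-GR-Cache-TTL"] = str(cache_ttl)
    if guardrails:
        headers["X-GR-Guardrails"] = "moderate"
    body = {"model": "router", "messages": [{"role": "user", "content": prompt}]}
    if stream:
        body["stream"] = True
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_chat_request("pk_test_...", "Write a haiku about routing",
                         cache_ttl=600, guardrails=True)
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would perform the call. Keeping construction separate makes the payload easy to inspect first.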
Auto Suggest
POST https://api.greatrouterai.com/v1/auto/suggest
Returns model suggestions without actually routing or proxying a request. Useful for showing users what model would be selected. Each suggestion includes estimated_cost_dollars.
Price Comparison
POST https://api.greatrouterai.com/v1/models/price-comparison
Compares model prices for a given prompt or task. Returns ranked suggestions with cost estimates and price comparison metadata.
Preferences
GET https://api.greatrouterai.com/v1/preferences
PUT https://api.greatrouterai.com/v1/preferences/:key
DELETE https://api.greatrouterai.com/v1/preferences/:key
GET https://api.greatrouterai.com/v1/preferences/suggestions
Manage per-organization routing preferences (default optimization, excluded providers, preferred providers, etc.).
Model Health
GET https://api.greatrouterai.com/v1/model-health
Checks model health status (degradation, latency penalties). The models query parameter accepts comma-separated model IDs.
Activity Logs Export
GET https://api.greatrouterai.com/v1/org/logs?format=csv&limit=1000
GET https://api.greatrouterai.com/v1/org/logs?format=json&limit=1000
Authenticates with a dashboard session cookie or a Bearer API key. The CSV export includes routing reason and fallback chain columns.
Polar Webhook
POST https://api.greatrouterai.com/v1/webhooks/polar
Credits the wallet on checkout completion. Configure POLAR_WEBHOOK_SECRET and point Polar.sh at this URL.
Authentication
All protected requests require a Bearer API key in the Authorization header:
Authorization: Bearer pk_live_...
Test keys use the pk_test_ prefix (pk_test_{9 chars}_{32 hex chars}).
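The test key layout can be checked on the client before a request is made. The following is only a sketch: the source says "9 chars" without naming a character set, so the alphanumeric class is an assumption, as is accepting hex digits in either case. The helper names are hypothetical.

```python
import re

# Assumption: the 9-character segment is alphanumeric and the hex segment may be
# upper or lower case. The reference only states "9 chars" and "32 hex chars".
TEST_KEY_PATTERN = re.compile(r"pk_test_[A-Za-z0-9]{9}_[0-9a-fA-F]{32}")


def is_test_key(key: str) -> bool:
    """Return True when the key matches the documented test key layout."""
    return TEST_KEY_PATTERN.fullmatch(key) is not None


def auth_header(key: str) -> dict:
    """Build the Authorization header used by all protected requests."""
    return {"Authorization": f"Bearer {key}"}


sample = "pk_test_abc123xyz_" + "0123456789abcdef" * 2
print(is_test_key(sample))
print(auth_header(sample)["Authorization"][:14])
```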
Auto Route Request
| Field | Type | Default | Description |
|---|---|---|---|
| prompt | string | — (required) | Natural language description of what you want |
| input | object | — (required) | Model input (messages, parameters, etc.) |
| task | string | — | Hint: text, image, video, music, speech, code, web_search |
| content_mode | string | — | generate, edit, or combine |
| optimization | string | "balanced" | price-optimized, output-optimized, or balanced |
| mode | string | "auto" | Classification mode: auto or regex |
| session_id | string | auto | Session ID for context tracking |
| budget_dollars | number | — | Maximum estimated cost in USD; models exceeding this are filtered out |
| taxonomy | string | — | Catalog category (translation, llm, image-generation, …); auto-picks the best model in the category |
| provider | string | — | Provider slug within taxonomy (google, meta, …) |
| catalog_family | string | — | Catalog model line id; auto-picks the best routable variant |
| model | string | — | Explicit model id (provider/model); skips ranking |
| feedback | object | — | Feedback object with correct (boolean) and suggested_task (string) |
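To make the table concrete, here is a minimal sketch of assembling an Auto Route request body with client-side checks against the documented values. The function name build_auto_route_body is hypothetical, and whether the server rejects unknown task or optimization values the same way is an assumption.

```python
OPTIMIZATIONS = {"price-optimized", "output-optimized", "balanced"}
TASKS = {"text", "image", "video", "music", "speech", "code", "web_search"}


def build_auto_route_body(prompt, model_input, task=None,
                          optimization="balanced", budget_dollars=None):
    """Assemble a JSON-serializable body for POST /v1/auto/route."""
    if not prompt:
        raise ValueError("prompt is required")
    if model_input is None:
        raise ValueError("input is required")
    if optimization not in OPTIMIZATIONS:
        raise ValueError(f"unknown optimization: {optimization!r}")
    if task is not None and task not in TASKS:
        raise ValueError(f"unknown task hint: {task!r}")
    body = {"prompt": prompt, "input": model_input, "optimization": optimization}
    # Optional fields are left out entirely rather than sent as null.
    if task is not None:
        body["task"] = task
    if budget_dollars is not None:
        body["budget_dollars"] = budget_dollars
    return body


body = build_auto_route_body(
    "Generate an image of a cat",
    {"messages": [{"role": "user", "content": "Generate an image of a cat"}]},
    task="image",
    optimization="price-optimized",
    budget_dollars=0.05,
)
print(sorted(body))
```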
Auto Suggest Request
| Field | Type | Default | Description |
|---|---|---|---|
| prompt | string | — (required) | Natural language description |
| input | object | — | Optional model input for multimodal profiling and schema validation |
| task | string | — | Task hint |
| content_mode | string | — | generate, edit, or combine |
| optimization | string | "balanced" | Optimization mode |
| limit | integer | 5 | Number of suggestions (1–20) |
| session_id | string | auto | Session ID for context tracking |
| budget_dollars | number | — | Maximum estimated cost in USD; models exceeding this are filtered out |
| taxonomy | string | — | Catalog category filter |
| provider | string | — | Provider filter within taxonomy |
| catalog_family | string | — | Catalog model line filter |
| model | string | — | Explicit model id |
Price Comparison Request
| Field | Type | Default | Description |
|---|---|---|---|
| prompt | string | — | Optional prompt for classification |
| task | string | text | Task type hint |
| optimization | string | "balanced" | Optimization mode |
| budget_dollars | number | — | Maximum budget in USD |
Explicit Route Request
| Field | Type | Default | Description |
|---|---|---|---|
| query | string | — (required) | Task type or description to match |
| input | object | — (required) | Model input (messages, parameters, etc.) |
| maxCost | string | — | Price tier: economy, standard, balanced, premium, flagship |
| filter.task_type | string | — | Registry task type (e.g. "text_generation") |
| filter.capabilities | array | — | Required capabilities (e.g. ["vision", "reasoning"]) |
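One possible way to build an Explicit Route body is sketched below. It assumes the dotted names filter.task_type and filter.capabilities denote a nested filter object, which the table does not state outright. The function name is hypothetical.

```python
PRICE_TIERS = ("economy", "standard", "balanced", "premium", "flagship")


def build_explicit_route_body(query, model_input, max_cost=None,
                              task_type=None, capabilities=None):
    """Assemble a JSON-serializable body for POST /v1/models/route."""
    if not query:
        raise ValueError("query is required")
    if model_input is None:
        raise ValueError("input is required")
    body = {"query": query, "input": model_input}
    if max_cost is not None:
        if max_cost not in PRICE_TIERS:
            raise ValueError(f"unknown price tier: {max_cost!r}")
        body["maxCost"] = max_cost
    # Assumption: dotted field names map to a nested "filter" object.
    filter_obj = {}
    if task_type:
        filter_obj["task_type"] = task_type
    if capabilities:
        filter_obj["capabilities"] = list(capabilities)
    if filter_obj:
        body["filter"] = filter_obj
    return body


print(build_explicit_route_body(
    "summarize a document",
    {"messages": [{"role": "user", "content": "Summarize this."}]},
    max_cost="economy",
    task_type="text_generation",
    capabilities=["reasoning"],
))
```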
Preferences
GET /v1/preferences
Returns the routing preferences for the authenticated organization. On database errors, returns safe defaults (balanced optimization, empty exclusion lists) instead of failing the request.
PUT /v1/preferences/:key
Sets a routing preference. Keys: default_optimization, default_task, excluded_providers, excluded_models, preferred_providers, content_mode_default.
DELETE /v1/preferences/:key
Deletes a routing preference.
GET /v1/preferences/suggestions
Returns AI-suggested preferences based on routing feedback and usage patterns.
Model Health
GET /v1/model-health?models=model1,model2
Returns health status for specified models (or all if models param omitted). Includes degradation status and latency penalties.
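The comma-separated models parameter can be assembled with the standard library, as in the sketch below. Keeping slashes and commas unescaped mirrors the example URL above. Whether the server also accepts percent-encoded separators is not stated in the source. The function name is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.greatrouterai.com/v1"


def model_health_url(models=None):
    """Build the model health URL; omit the query to ask for all models."""
    url = f"{BASE_URL}/model-health"
    if models:
        # safe="/," leaves provider/model slashes and list commas literal.
        url += "?" + urlencode({"models": ",".join(models)}, safe="/,")
    return url


print(model_health_url(["openai/gpt-5", "anthropic/claude-sonnet-4"]))
print(model_health_url())
```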
Response
Auto Route Response
{
  "classification": {
    "task": "text",
    "confidence": 0.95,
    "reasoning": "Prompt is a general text question",
    "decision": "regex",
    "content_mode": "generate",
    "complexity_tier": "SIMPLE",
    "complexity_score": 0.18
  },
  "model": {
    "model": {
      "id": "meta/llama-3.3-70b-instruct-fp8-fast",
      "name": "LLaMA 3.3 70B Instruct",
      "provider": "meta",
      "task_type": "text_generation",
      "category": "llm",
      "capabilities": ["text_generation", "coding"],
      "pricing": { "input_per_million_tokens": 0.10, "output_per_million_tokens": 0.10 }
    },
    "cost_tier": "economy",
    "estimated_cost": 0.0001
  },
  "price_comparison": {
    "cheapest_model": "meta/llama-3.3-70b-instruct-fp8-fast",
    "best_value_model": "google/gemini-2.5-flash",
    "best_quality_model": "anthropic/claude-sonnet-4",
    "estimated_cost_dollars": 0.0001,
    "savings_vs_highest_pct": 85
  },
  "result": {
    "choices": [
      { "message": { "role": "assistant", "content": "The capital of France is Paris." } }
    ],
    "usage": { "prompt_tokens": 14, "completion_tokens": 8, "total_tokens": 22 }
  },
  "timing": { "classification_ms": 12, "routing_ms": 3, "inference_ms": 245, "total_ms": 260 },
  "context": { "session_prompts": 1, "contextual_boost": 0.15 }
}
Auto Suggest Response
{
  "classification": {
    "task": "image",
    "confidence": 0.92,
    "reasoning": "Prompt requests image generation",
    "decision": "regex",
    "content_mode": "generate",
    "complexity_tier": "MEDIUM",
    "complexity_score": 0.42
  },
  "optimization": "balanced",
  "suggestions": [
    {
      "model": { "id": "...", "name": "...", "provider": "...", "task_type": "..." },
      "cost_tier": "economy",
      "estimated_cost": 0.02,
      "estimated_cost_dollars": 0.02,
      "score": 0.85
    }
  ],
  "price_comparison": {
    "cheapest_model": "...",
    "best_value_model": "...",
    "best_quality_model": "...",
    "savings_vs_highest_pct": 75
  }
}
Price Comparison Response
{
  "classification": { "task": "text", "confidence": 0.9, "reasoning": "..." },
  "optimization": "balanced",
  "suggestions": [
    {
      "model_id": "meta/llama-3.3-70b-instruct-fp8-fast",
      "name": "LLaMA 3.3 70B Instruct",
      "provider": "meta",
      "task_type": "text_generation",
      "cost_tier": "economy",
      "score": 0.85,
      "estimated_cost_dollars": 0.0001
    }
  ],
  "price_comparison": {
    "cheapest_model": "...",
    "best_value_model": "...",
    "best_quality_model": "...",
    "savings_vs_highest_pct": 85
  }
}
Model Health Response
{
  "health": [
    {
      "model_id": "openai/gpt-5",
      "degraded": false,
      "latency_penalty": 0,
      "health_entry": {
        "failures": 0,
        "successes": 15,
        "consecutiveFailures": 0,
        "avgLatencyMs": 320
      }
    }
  ],
  "tenant_id": "org_abc123"
}
Explicit Route Response
{
  "model": {
    "id": "meta/llama-3.3-70b-instruct-fp8-fast",
    "name": "LLaMA 3.3 70B Instruct",
    "provider": "meta",
    "task_type": "text_generation",
    "category": "llm",
    "capabilities": ["text_generation", "coding"],
    "pricing": { "input_per_million_tokens": 0.10, "output_per_million_tokens": 0.10 }
  },
  "cost_tier": "economy",
  "estimated_cost": 0.0001,
  "result": { ... },
  "timing": { "routing_ms": 2, "inference_ms": 198, "total_ms": 200 }
}
List models
GET https://api.greatrouterai.com/v1/models
Query parameters:
| Parameter | Description |
|---|---|
| task_type | Filter by task type (e.g. text_generation) |
| provider | Filter by provider (e.g. openai) |
| category | Filter by category (e.g. llm, image) |
| capabilities | Comma-separated capabilities (e.g. vision,reasoning) |
| type | Filter by type: hosted or proxied |
| catalog_family | Filter by catalog model line id |
| catalog_visibility | Filter by catalog visibility tier |
| routable_only | true to exclude long-tail models |
Returns all matching models with metadata.
Model query
POST https://api.greatrouterai.com/v1/models/query
Paginated model search. The POST body accepts the list filters plus:
| Field | Type | Description |
|---|---|---|
| price_tier | string | economy, standard, balanced, premium, flagship |
| tags | string[] | Match models with any of these tags |
| text | string | Full-text search across name, description, provider |
| catalog_family | string | Filter by catalog model line id |
| catalog_visibility | string | Filter by visibility (featured, standard, long_tail, …) |
| routable_only | boolean | When true, excludes long-tail catalog models |
| limit | integer | Page size (max 500, default 100) |
| offset | integer | Pagination offset |
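The limit and offset fields suggest a simple paging loop. The sketch below walks pages until a short page comes back. The response shape of the query endpoint is not documented here, so fetch_page is a hypothetical caller-supplied function that posts the body and returns that page's models as a list.

```python
from typing import Callable, Iterator

MAX_LIMIT = 500  # documented maximum page size


def iter_models(fetch_page: Callable[[dict], list], filters: dict,
                limit: int = 100) -> Iterator[dict]:
    """Yield models page by page, advancing offset until a short page arrives."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    offset = 0
    while True:
        page = fetch_page({**filters, "limit": limit, "offset": offset})
        yield from page
        if len(page) < limit:
            break
        offset += limit


# Stand-in for a real HTTP call, backed by an in-memory catalog of 250 entries.
catalog = [{"id": f"model-{i}"} for i in range(250)]


def fake_fetch(body):
    return catalog[body["offset"]:body["offset"] + body["limit"]]


print(len(list(iter_models(fake_fetch, {"price_tier": "economy"}))))  # 250
```

Stopping on a short page costs one extra request when the total is an exact multiple of the page size, but it avoids relying on a total count field that may not exist.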
Catalog browse
Public catalog taxonomy endpoints for browsing the model catalog by model line:
GET https://api.greatrouterai.com/v1/catalog/taxonomy
GET https://api.greatrouterai.com/v1/catalog/families
GET https://api.greatrouterai.com/v1/catalog/families/:id
GET /v1/catalog/families query parameters:
| Parameter | Description |
|---|---|
| taxonomy | Filter by taxonomy branch |
| offset | Pagination offset (default 0) |
| limit | Page size (default 24, max 100) |
| routable_only | true to hide long-tail model lines |
| q | Text search across model line names |
Get a specific model
GET https://api.greatrouterai.com/v1/models/:provider/:model
Returns full model metadata. Model IDs use the format provider/model-name.
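Because model IDs combine provider and model name, a client has to split them to fill the :provider and :model path segments. A minimal sketch follows, with a hypothetical helper name. The optional suffix is meant for sub-resources such as /input-schema.

```python
from urllib.parse import quote

BASE_URL = "https://api.greatrouterai.com/v1"


def model_url(model_id: str, suffix: str = "") -> str:
    """Turn "provider/model-name" into the matching /v1/models/... URL."""
    provider, sep, name = model_id.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected provider/model-name, got {model_id!r}")
    # Percent-encode each segment so unusual characters cannot alter the path.
    return f"{BASE_URL}/models/{quote(provider, safe='')}/{quote(name, safe='')}{suffix}"


print(model_url("meta/llama-3.3-70b-instruct-fp8-fast"))
print(model_url("meta/llama-3.3-70b-instruct-fp8-fast", "/input-schema"))
```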
When available, the response includes model intelligence fields:
| Field | Description |
|---|---|
| provenance | Creator, license, and source metadata from the catalog intelligence index |
| benchmarks | Published benchmark scores (when indexed) |
| performance_timeseries | Rolling latency/success windows from production routing (requires DB; omitted when unavailable) |
| catalog_family | Model line id for hierarchical routing (taxonomy/provider--family) |
| catalog_visibility | featured, standard, or long_tail |
| price_tier | economy, standard, balanced, premium, or flagship |
Recommend models
POST https://api.greatrouterai.com/v1/models/recommend
Returns semantic or task-based model recommendations. The task field accepts router tasks (text, image, …), not only registry task_type strings. When prompt is provided, ranking uses semantic search over model descriptions.
| Field | Type | Description |
|---|---|---|
| task | string | Router task hint (text, image, video, …) |
| prompt | string | Optional prompt for semantic ranking |
| limit | integer | Max results (default 5) |
| optimization | string | price-optimized, balanced, or output-optimized |
Other model endpoints
| Endpoint | Description |
|---|---|
| GET /v1/models/:provider/:model/input-schema | Get the input schema for a model |
| GET /v1/models/:provider/:model/output-schema | Get the output schema for a model |
| POST /v1/models/:provider/:model/validate-input | Validate input against a model’s schema |
| POST /v1/models/:provider/:model/validate-output | Validate output against a model’s schema |
| GET /v1/models/tags | List all available tags |
| GET /v1/models/price-tiers | List all price tiers |
Feedback
POST https://api.greatrouterai.com/v1/auto/feedback
Submit feedback on routing accuracy to improve the classifier.
| Field | Type | Description |
|---|---|---|
| prompt | string | The original prompt |
| classified_task | string | The task the router classified |
| correct | boolean | Whether the classification was correct |
| confidence | number | Confidence score (0–1) |
| suggested_task | string | (optional) What the correct task should have been |
Money conventions
All API-facing fields use USD dollars (e.g. budget_dollars, estimated_cost_dollars). Internal services and database columns use cents (e.g. budget_cents, estimated_cost_cents, cost_cents). The API layer converts between the two representations automatically.
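The API performs this conversion itself, but client code that stores costs in cents may want the same mapping. The sketch below uses Decimal so that sub-cent estimates such as 0.0001 dollars survive the round trip. The function names are hypothetical, and treating cents as possibly fractional is an assumption about the internal representation.

```python
from decimal import Decimal


def dollars_to_cents(dollars) -> Decimal:
    """Convert a USD amount (e.g. budget_dollars) to cents without float drift."""
    # str() first, so 0.05 becomes Decimal("0.05") rather than its binary expansion.
    return Decimal(str(dollars)) * 100


def cents_to_dollars(cents) -> Decimal:
    """Convert a cents amount (e.g. cost_cents) back to USD."""
    return Decimal(str(cents)) / 100


print(dollars_to_cents(0.05))     # 5.00
print(dollars_to_cents(0.0001))   # 0.0100
print(cents_to_dollars(5))        # 0.05
```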
SDKs
GreatRouter provides official SDKs for TypeScript and Python.
TypeScript
npm install @greatrouter/sdk

import { GreatRouter } from '@greatrouter/sdk'

const client = new GreatRouter('pk_live_...')

// Auto route
const result = await client.autoRoute({
  prompt: 'Generate an image of a cat',
  input: { messages: [{ role: 'user', content: 'Generate an image of a cat' }] },
  task: 'image',
  optimization: 'price-optimized',
  budget_dollars: 0.05,
})

// Auto suggest (no inference)
const suggestions = await client.autoSuggest({
  prompt: 'Write a short poem',
  task: 'text',
  limit: 5,
  budget_dollars: 0.01,
})

// Compare prices
const comparison = await client.priceComparison({
  task: 'text',
  optimization: 'balanced',
})

// List models
const models = await client.listModels({ task_type: 'text_generation' })

// Get model details
const model = await client.getModel('meta', 'llama-3.3-70b-instruct-fp8-fast')

// Preferences
const prefs = await client.getPreferences()
await client.setPreference({ key: 'default_optimization', value: 'price-optimized' })
const preferenceSuggestions = await client.getPreferenceSuggestions()

// Model health
const health = await client.getModelHealth(['openai/gpt-5', 'anthropic/claude-sonnet-4'])
Python
pip install greatrouter

from greatrouter import GreatRouter

client = GreatRouter("pk_live_...")

# Auto route
result = client.auto_route({
    "prompt": "Generate an image of a cat",
    "input": {"messages": [{"role": "user", "content": "Generate an image of a cat"}]},
    "task": "image",
    "optimization": "price-optimized",
    "budget_dollars": 0.05,
})

# Auto suggest (no inference)
suggestions = client.auto_suggest({
    "prompt": "Write a short poem",
    "task": "text",
    "limit": 5,
    "budget_dollars": 0.01,
})

# Compare prices
comparison = client.price_comparison({
    "task": "text",
    "optimization": "balanced",
})

# List models
models = client.list_models({"task_type": "text_generation"})

# Get model details
model = client.get_model("meta", "llama-3.3-70b-instruct-fp8-fast")

# Preferences
prefs = client.get_preferences()
client.set_preference("default_optimization", "price-optimized")
preference_suggestions = client.get_preference_suggestions()

# Model health
health = client.get_model_health(["openai/gpt-5", "anthropic/claude-sonnet-4"])
For async usage, use AsyncGreatRouter with httpx:
import asyncio
from greatrouter import AsyncGreatRouter

async def main():
    async with AsyncGreatRouter("pk_live_...") as client:
        result = await client.auto_route({...})

asyncio.run(main())
SDK Error Handling
Both SDKs raise typed errors:
| Error | When |
|---|---|
| AuthenticationError | 401: invalid or missing API key |
| RateLimitError | 429: rate limit exceeded |
| ValidationError | 400: invalid request body |
| GreatRouterError | Any other non-2xx response |
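For clients that call the HTTP API directly, the same mapping can be reproduced by hand. The classes below are local stand-ins that mirror the SDK error names. They are not imported from the SDK, and making the specific errors subclasses of GreatRouterError is an assumption.

```python
class GreatRouterError(Exception):
    """Stand-in base error carrying the HTTP status code."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status


class AuthenticationError(GreatRouterError):
    pass


class RateLimitError(GreatRouterError):
    pass


class ValidationError(GreatRouterError):
    pass


STATUS_TO_ERROR = {400: ValidationError, 401: AuthenticationError, 429: RateLimitError}


def error_for_status(status: int, message: str = ""):
    """Return None for 2xx responses, otherwise the matching error instance."""
    if 200 <= status < 300:
        return None
    return STATUS_TO_ERROR.get(status, GreatRouterError)(status, message)


print(type(error_for_status(401, "bad key")).__name__)  # AuthenticationError
print(type(error_for_status(502)).__name__)             # GreatRouterError
```

With a shared base class, one except GreatRouterError clause catches every API failure while more specific handlers stay available.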
Error codes
| Code | Meaning |
|---|---|
| 400 | Invalid request body or parameters |
| 401 | Authentication failed; check your API key |
| 404 | Unknown model or endpoint |
| 429 | Rate limit exceeded |
| 500 | Internal server error; try again later |
| 502 | Upstream provider error; model temporarily unavailable |
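The 429, 500, and 502 codes describe conditions that may clear on their own, so a client might retry them. The sketch below is one possible policy and is not documented API behavior: the exponential backoff, the delay values, and the choice of retryable codes are all assumptions. The send argument is a hypothetical caller-supplied function returning (status_code, body).

```python
import time

# Assumption: these codes are treated as transient; 400, 401, and 404 are not retried.
RETRYABLE_STATUSES = {429, 500, 502}


def call_with_retry(send, max_attempts=4, base_delay=0.5, max_delay=8.0,
                    sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts - 1:
            return status, body
        # Exponential backoff: base_delay, then 2x, 4x, ... capped at max_delay.
        sleep(min(max_delay, base_delay * 2 ** attempt))


# Simulated responses instead of real HTTP calls; waits are recorded, not slept.
responses = iter([(502, "bad gateway"), (429, "slow down"), (200, "ok")])
waits = []
print(call_with_retry(lambda: next(responses), sleep=waits.append))  # (200, 'ok')
print(waits)  # [0.5, 1.0]
```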