API Reference


Endpoints

Auto Route

POST https://api.greatrouterai.com/v1/auto/route

Classifies your prompt, selects the optimal model, and proxies the request. This is the primary endpoint for most use cases.

Explicit Route

POST https://api.greatrouterai.com/v1/models/route

Routes to a model matching your explicit filters. Use this when you know the task type and want fine-grained control.

OpenAI-Compatible Chat Completions

POST https://api.greatrouterai.com/v1/chat/completions

Drop-in shim for OpenAI SDK clients. Set baseURL to https://api.greatrouterai.com/v1 and use model: "router" for auto-routing, or pass an explicit model id.

{
  "model": "router",
  "messages": [{ "role": "user", "content": "Write a haiku about routing" }]
}

The response follows the OpenAI chat.completion shape, with an additional greatrouter metadata object (classified task, routing reason, fallback chain, estimated cost). Set stream: true for token-by-token SSE streaming on supported models. Streaming is available on this endpoint only; /v1/auto/route returns complete responses only.

Optional headers:

X-GR-Cache-TTL — cache identical prompts (seconds, max 3600)
X-GR-Guardrails: moderate — block obvious PII patterns before inference
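
As a minimal sketch of the request described above, the following Python builds the URL, headers, and JSON body for the OpenAI-compatible endpoint without sending anything. The helper name build_chat_request and its parameters are illustrative and not part of any GreatRouter SDK, and sending X-GR-Cache-TTL as a plain integer string is an assumption.

```python
import json

BASE_URL = "https://api.greatrouterai.com/v1"
MAX_CACHE_TTL_SECONDS = 3600  # documented maximum for X-GR-Cache-TTL


def build_chat_request(api_key, user_text, cache_ttl=None, guardrails=False, stream=False):
    """Return (url, headers, body_bytes) for POST /v1/chat/completions.

    Hypothetical helper: nothing is sent over the network here.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if cache_ttl is not None:
        if cache_ttl > MAX_CACHE_TTL_SECONDS:
            raise ValueError("X-GR-Cache-TTL is capped at 3600 seconds")
        # Assumption: the header value is the TTL written as a decimal string.
        headers["X-GR-Cache-TTL"] = str(cache_ttl)
    if guardrails:
        headers["X-GR-Guardrails"] = "moderate"

    body = {
        "model": "router",  # "router" selects auto-routing
        "messages": [{"role": "user", "content": user_text}],
    }
    if stream:
        body["stream"] = True  # SSE streaming on supported models
    return f"{BASE_URL}/chat/completions", headers, json.dumps(body).encode("utf-8")
```

The returned triple can be handed to any HTTP client, for example urllib.request.Request(url, data=body, headers=headers, method="POST").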

Auto Suggest

POST https://api.greatrouterai.com/v1/auto/suggest

Returns model suggestions without actually routing or proxying a request. Useful for showing users what model would be selected. Each suggestion includes estimated_cost_dollars.

Price Comparison

POST https://api.greatrouterai.com/v1/models/price-comparison

Compare model prices for a given prompt/task. Returns ranked suggestions with cost estimates and price comparison metadata.

Preferences

GET    https://api.greatrouterai.com/v1/preferences
PUT    https://api.greatrouterai.com/v1/preferences/:key
DELETE https://api.greatrouterai.com/v1/preferences/:key
GET    https://api.greatrouterai.com/v1/preferences/suggestions

Manage per-organization routing preferences (default optimization, excluded providers, preferred providers, etc.).

Model Health

GET https://api.greatrouterai.com/v1/model-health

Checks model health status (degradation, latency penalties). The models query parameter accepts a comma-separated list of model IDs.

Activity Logs Export

GET https://api.greatrouterai.com/v1/org/logs?format=csv&limit=1000
GET https://api.greatrouterai.com/v1/org/logs?format=json&limit=1000

Authenticate with a dashboard session cookie or a Bearer API key. The CSV export includes routing reason and fallback chain columns.

Polar Webhook

POST https://api.greatrouterai.com/v1/webhooks/polar

Credits the wallet when a Polar checkout completes. Configure POLAR_WEBHOOK_SECRET and point Polar.sh to this URL.

Authentication

All protected requests require a Bearer API key in the Authorization header:

Authorization: Bearer pk_live_...

Test keys use the pk_test_ prefix (pk_test_{9 chars}_{32 hex chars}).
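
A client can check the key shape before sending a request. The sketch below is one possible check, not an official validator: the reference only gives the overall test-key layout, so the character classes (alphanumeric for the 9-character segment, case-insensitive hex for the last segment) are assumptions, as are the helper names.

```python
import re

# Assumptions: the 9-character segment is alphanumeric and the hex segment
# may be upper- or lowercase. The reference only states the overall shape.
TEST_KEY_PATTERN = re.compile(r"pk_test_[A-Za-z0-9]{9}_[0-9a-fA-F]{32}")


def is_test_key(api_key: str) -> bool:
    """True when the key matches pk_test_{9 chars}_{32 hex chars}."""
    return TEST_KEY_PATTERN.fullmatch(api_key) is not None


def auth_header(api_key: str) -> dict:
    """Build the Authorization header used by all protected requests."""
    return {"Authorization": f"Bearer {api_key}"}
```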

Auto Route Request

Field	Type	Default	Description
`prompt`	string	— (required)	Natural language description of what you want
`input`	object	— (required)	Model input (messages, parameters, etc.)
`task`	string	—	Hint: `text`, `image`, `video`, `music`, `speech`, `code`, `web_search`
`content_mode`	string	—	`generate`, `edit`, or `combine`
`optimization`	string	`"balanced"`	`price-optimized`, `output-optimized`, or `balanced`
`mode`	string	`"auto"`	Classification mode: `auto` or `regex`
`session_id`	string	auto	Session ID for context tracking
`budget_dollars`	number	—	Maximum estimated cost in USD; models exceeding this are filtered out
`taxonomy`	string	—	Catalog category (`translation`, `llm`, `image-generation`, …) — auto-pick best model in category
`provider`	string	—	Provider slug within taxonomy (`google`, `meta`, …)
`catalog_family`	string	—	Catalog model line id — auto-pick best routable variant
`model`	string	—	Explicit model id (`provider/model`) — skips ranking
`feedback`	object	—	Feedback object with `correct` (boolean) and `suggested_task` (string)
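
The field table above can be mirrored in a small client-side check. This is a sketch under the assumption that the listed values are the complete sets the server accepts; build_auto_route_payload is a hypothetical helper, not an SDK function.

```python
# Allowed values copied from the Auto Route Request table.
ENUM_FIELDS = {
    "task": {"text", "image", "video", "music", "speech", "code", "web_search"},
    "content_mode": {"generate", "edit", "combine"},
    "optimization": {"price-optimized", "output-optimized", "balanced"},
    "mode": {"auto", "regex"},
}


def build_auto_route_payload(prompt, model_input, **options):
    """Assemble a JSON-serializable body for POST /v1/auto/route."""
    if not prompt:
        raise ValueError("prompt is required")
    if not isinstance(model_input, dict):
        raise ValueError("input is required and must be an object")
    for field, allowed in ENUM_FIELDS.items():
        value = options.get(field)
        if value is not None and value not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")
    payload = {"prompt": prompt, "input": model_input}
    # Unset options are omitted so the server-side defaults apply.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```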

Auto Suggest Request

Field	Type	Default	Description
`prompt`	string	— (required)	Natural language description
`input`	object	—	Optional model input for multimodal profiling and schema validation
`task`	string	—	Task hint
`content_mode`	string	—	`generate`, `edit`, or `combine`
`optimization`	string	`"balanced"`	Optimization mode
`limit`	integer	`5`	Number of suggestions (1–20)
`session_id`	string	auto	Session ID for context tracking
`budget_dollars`	number	—	Maximum estimated cost in USD; models exceeding this are filtered out
`taxonomy`	string	—	Catalog category filter
`provider`	string	—	Provider filter within taxonomy
`catalog_family`	string	—	Catalog model line filter
`model`	string	—	Explicit model id

Price Comparison Request

Field	Type	Default	Description
`prompt`	string	—	Optional prompt for classification
`task`	string	`text`	Task type hint
`optimization`	string	`"balanced"`	Optimization mode
`budget_dollars`	number	—	Maximum budget in USD

Explicit Route Request

Field	Type	Default	Description
`query`	string	— (required)	Task type or description to match
`input`	object	— (required)	Model input (messages, parameters, etc.)
`maxCost`	string	—	Price tier: `economy`, `standard`, `balanced`, `premium`, `flagship`
`filter.task_type`	string	—	Registry task type (e.g. `"text_generation"`)
`filter.capabilities`	array	—	Required capabilities (e.g. `["vision", "reasoning"]`)
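
A possible request body for the explicit route is sketched below. It assumes the dotted names filter.task_type and filter.capabilities denote keys of a nested filter object, which the table does not state outright; the helper name is hypothetical.

```python
PRICE_TIERS = ("economy", "standard", "balanced", "premium", "flagship")


def build_explicit_route_payload(query, model_input, max_cost=None,
                                 task_type=None, capabilities=None):
    """Assemble a body for POST /v1/models/route."""
    if max_cost is not None and max_cost not in PRICE_TIERS:
        raise ValueError(f"maxCost must be one of {PRICE_TIERS}")
    payload = {"query": query, "input": model_input}
    if max_cost is not None:
        payload["maxCost"] = max_cost
    # Assumption: "filter.x" in the table means {"filter": {"x": ...}}.
    filters = {}
    if task_type is not None:
        filters["task_type"] = task_type
    if capabilities:
        filters["capabilities"] = list(capabilities)
    if filters:
        payload["filter"] = filters
    return payload
```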

Preferences

GET /v1/preferences

Returns the routing preferences for the authenticated organization. On database errors, returns safe defaults (balanced optimization, empty exclusion lists) instead of failing the request.

PUT /v1/preferences/:key

Sets a routing preference. Keys: default_optimization, default_task, excluded_providers, excluded_models, preferred_providers, content_mode_default.

DELETE /v1/preferences/:key

Deletes a routing preference.

GET /v1/preferences/suggestions

Returns AI-suggested preferences based on routing feedback and usage patterns.

Model Health

GET /v1/model-health?models=model1,model2

Returns health status for the specified models, or for all models if the models parameter is omitted. The response includes degradation status and latency penalties.
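
Building the query string by hand is easy to get wrong because model IDs contain a slash. The following sketch uses the standard library to produce the documented comma-separated form; model_health_url is an illustrative name, not an SDK function.

```python
from urllib.parse import urlencode


def model_health_url(model_ids=None, base_url="https://api.greatrouterai.com/v1"):
    """Return the GET URL for /v1/model-health."""
    url = f"{base_url}/model-health"
    if not model_ids:
        return url  # omit the parameter to request all models
    # Keep "," and "/" unescaped so the query matches the documented form.
    query = urlencode({"models": ",".join(model_ids)}, safe=",/")
    return f"{url}?{query}"
```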

Response

Auto Route Response

{
  "classification": {
    "task": "text",
    "confidence": 0.95,
    "reasoning": "Prompt is a general text question",
    "decision": "regex",
    "content_mode": "generate",
    "complexity_tier": "SIMPLE",
    "complexity_score": 0.18
  },
  "model": {
    "model": {
      "id": "meta/llama-3.3-70b-instruct-fp8-fast",
      "name": "LLaMA 3.3 70B Instruct",
      "provider": "meta",
      "task_type": "text_generation",
      "category": "llm",
      "capabilities": ["text_generation", "coding"],
      "pricing": { "input_per_million_tokens": 0.10, "output_per_million_tokens": 0.10 }
    },
    "cost_tier": "economy",
    "estimated_cost": 0.0001
  },
  "price_comparison": {
    "cheapest_model": "meta/llama-3.3-70b-instruct-fp8-fast",
    "best_value_model": "google/gemini-2.5-flash",
    "best_quality_model": "anthropic/claude-sonnet-4",
    "estimated_cost_dollars": 0.0001,
    "savings_vs_highest_pct": 85
  },
  "result": {
    "choices": [
      {
        "message": {
          "role": "assistant",
          "content": "The capital of France is Paris."
        }
      }
    ],
    "usage": {
      "prompt_tokens": 14,
      "completion_tokens": 8,
      "total_tokens": 22
    }
  },
  "timing": {
    "classification_ms": 12,
    "routing_ms": 3,
    "inference_ms": 245,
    "total_ms": 260
  },
  "context": {
    "session_prompts": 1,
    "contextual_boost": 0.15
  }
}
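
Note that the selected model sits one level down, under model.model. The sketch below pulls the commonly needed values out of a response shaped like the example above; it assumes a text result with a choices array, and the function name and the overhead_ms field are illustrative.

```python
def summarize_auto_route(response: dict) -> dict:
    """Extract the routed model, answer text, cost, and routing overhead."""
    selected = response["model"]["model"]  # model object is nested under "model"
    choices = response["result"]["choices"]
    timing = response["timing"]
    return {
        "model_id": selected["id"],
        "task": response["classification"]["task"],
        "text": choices[0]["message"]["content"] if choices else None,
        "estimated_cost": response["model"]["estimated_cost"],
        # Time spent outside inference (classification plus routing).
        "overhead_ms": timing["total_ms"] - timing["inference_ms"],
    }
```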

Auto Suggest Response

{
  "classification": {
    "task": "image",
    "confidence": 0.92,
    "reasoning": "Prompt requests image generation",
    "decision": "regex",
    "content_mode": "generate",
    "complexity_tier": "MEDIUM",
    "complexity_score": 0.42
  },
  "optimization": "balanced",
  "suggestions": [
    {
      "model": { "id": "...", "name": "...", "provider": "...", "task_type": "..." },
      "cost_tier": "economy",
      "estimated_cost": 0.02,
      "estimated_cost_dollars": 0.02,
      "score": 0.85
    }
  ],
  "price_comparison": {
    "cheapest_model": "...",
    "best_value_model": "...",
    "best_quality_model": "...",
    "savings_vs_highest_pct": 75
  }
}
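
Because this endpoint performs no inference, a client can inspect the suggestions before committing to a model. As a sketch, the hypothetical helper below picks the lowest-cost suggestion that fits a budget, relying only on the estimated_cost_dollars field shown above.

```python
def cheapest_within_budget(response: dict, budget_dollars=None):
    """Return the cheapest suggestion at or under the budget, or None."""
    suggestions = response.get("suggestions", [])
    affordable = [
        s for s in suggestions
        if budget_dollars is None or s["estimated_cost_dollars"] <= budget_dollars
    ]
    return min(affordable, key=lambda s: s["estimated_cost_dollars"], default=None)
```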

Price Comparison Response

{
  "classification": { "task": "text", "confidence": 0.9, "reasoning": "..." },
  "optimization": "balanced",
  "suggestions": [
    {
      "model_id": "meta/llama-3.3-70b-instruct-fp8-fast",
      "name": "LLaMA 3.3 70B Instruct",
      "provider": "meta",
      "task_type": "text_generation",
      "cost_tier": "economy",
      "score": 0.85,
      "estimated_cost_dollars": 0.0001
    }
  ],
  "price_comparison": {
    "cheapest_model": "...",
    "best_value_model": "...",
    "best_quality_model": "...",
    "savings_vs_highest_pct": 85
  }
}

Model Health Response

{
  "health": [
    {
      "model_id": "openai/gpt-5",
      "degraded": false,
      "latency_penalty": 0,
      "health_entry": { "failures": 0, "successes": 15, "consecutiveFailures": 0, "avgLatencyMs": 320 }
    }
  ],
  "tenant_id": "org_abc123"
}
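
One way to act on this response is to flag models that are marked degraded or that fail often. The sketch below assumes failures plus successes is the total number of attempts in the tracked window; the 0.2 threshold and the function name are arbitrary illustrations.

```python
def unhealthy_models(response: dict, max_failure_rate: float = 0.2):
    """Return (model_id, failure_rate) pairs worth avoiding."""
    flagged = []
    for item in response["health"]:
        entry = item["health_entry"]
        attempts = entry["failures"] + entry["successes"]
        failure_rate = entry["failures"] / attempts if attempts else 0.0
        if item["degraded"] or failure_rate > max_failure_rate:
            flagged.append((item["model_id"], round(failure_rate, 3)))
    return flagged
```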

Explicit Route Response

{
  "model": {
    "id": "meta/llama-3.3-70b-instruct-fp8-fast",
    "name": "LLaMA 3.3 70B Instruct",
    "provider": "meta",
    "task_type": "text_generation",
    "category": "llm",
    "capabilities": ["text_generation", "coding"],
    "pricing": { "input_per_million_tokens": 0.10, "output_per_million_tokens": 0.10 }
  },
  "cost_tier": "economy",
  "estimated_cost": 0.0001,
  "result": { ... },
  "timing": {
    "routing_ms": 2,
    "inference_ms": 198,
    "total_ms": 200
  }
}

List models

GET https://api.greatrouterai.com/v1/models

Query parameters:

Parameter	Description
`task_type`	Filter by task type (e.g. `text_generation`)
`provider`	Filter by provider (e.g. `openai`)
`category`	Filter by category (e.g. `llm`, `image`)
`capabilities`	Comma-separated capabilities (e.g. `vision,reasoning`)
`type`	Filter by type: `hosted` or `proxied`
`catalog_family`	Filter by catalog model line id
`catalog_visibility`	Filter by catalog visibility tier
`routable_only`	`true` to exclude long-tail models

Returns all matching models with metadata.

Model query

POST https://api.greatrouterai.com/v1/models/query

Paginated model search. The POST body accepts the same filters as List models, plus:

Field	Type	Description
`price_tier`	string	`economy`, `standard`, `balanced`, `premium`, `flagship`
`tags`	string[]	Match models with any of these tags
`text`	string	Full-text search across name, description, provider
`catalog_family`	string	Filter by catalog model line id
`catalog_visibility`	string	Filter by visibility (`featured`, `standard`, `long_tail`, …)
`routable_only`	boolean	When `true`, excludes long-tail catalog models
`limit`	integer	Page size (max 500, default 100)
`offset`	integer	Pagination offset
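
The limit and offset fields support a simple paging loop, sketched below. The response shape of this endpoint is not shown in this reference, so the loop takes a caller-supplied fetch_page callable (hypothetical) that performs the POST and returns the list of models for one page. The lower bound of 1 on limit is an assumption.

```python
def iter_models(fetch_page, limit=100):
    """Yield every model by walking limit/offset pages.

    fetch_page(limit=..., offset=...) must return the list of models
    on that page; how it calls POST /v1/models/query is up to the caller.
    """
    if not 1 <= limit <= 500:  # 500 is the documented maximum page size
        raise ValueError("limit must be between 1 and 500")
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page
        if len(page) < limit:  # a short page means nothing is left
            break
        offset += limit
```

Stopping on a short page costs one extra request when the total is an exact multiple of limit, but it needs no total count from the server.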

Catalog browse

Public catalog taxonomy endpoints for browsing the model catalog by model line:

GET https://api.greatrouterai.com/v1/catalog/taxonomy
GET https://api.greatrouterai.com/v1/catalog/families
GET https://api.greatrouterai.com/v1/catalog/families/:id

GET /v1/catalog/families query parameters:

Parameter	Description
`taxonomy`	Filter by taxonomy branch
`offset`	Pagination offset (default 0)
`limit`	Page size (default 24, max 100)
`routable_only`	`true` to hide long-tail model lines
`q`	Text search across model line names

Get a specific model

GET https://api.greatrouterai.com/v1/models/:provider/:model

Returns full model metadata. Model IDs use the format provider/model-name.
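
Since the path has separate :provider and :model segments, a provider/model-name ID has to be split before it goes into the URL. The helper below is a sketch that assumes the first slash separates the provider from the model name; model_url is an illustrative name.

```python
from urllib.parse import quote


def model_url(model_id: str, base_url: str = "https://api.greatrouterai.com/v1") -> str:
    """Turn a provider/model-name ID into the GET /v1/models/:provider/:model URL."""
    provider, sep, name = model_id.partition("/")
    if not sep or not provider or not name:
        raise ValueError("model id must look like provider/model-name")
    # Percent-encode each path segment separately.
    return f"{base_url}/models/{quote(provider, safe='')}/{quote(name, safe='')}"
```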

When available, the response includes model intelligence fields:

Field	Description
`provenance`	Creator, license, and source metadata from the catalog intelligence index
`benchmarks`	Published benchmark scores (when indexed)
`performance_timeseries`	Rolling latency/success windows from production routing (requires DB; omitted when unavailable)
`catalog_family`	Model line id for hierarchical routing (`taxonomy/provider--family`)
`catalog_visibility`	`featured`, `standard`, or `long_tail`
`price_tier`	`economy`, `standard`, `balanced`, `premium`, or `flagship`

Recommend models

POST https://api.greatrouterai.com/v1/models/recommend

Returns semantic or task-based model recommendations. The task field accepts router tasks (text, image, …) in addition to registry task_type strings. When prompt is provided, results are ranked by semantic search over model descriptions.

Field	Type	Description
`task`	string	Router task hint (`text`, `image`, `video`, …)
`prompt`	string	Optional prompt for semantic ranking
`limit`	integer	Max results (default 5)
`optimization`	string	`price-optimized`, `balanced`, or `output-optimized`

Other model endpoints

Endpoint	Description
`GET /v1/models/:provider/:model/input-schema`	Get the input schema for a model
`GET /v1/models/:provider/:model/output-schema`	Get the output schema for a model
`POST /v1/models/:provider/:model/validate-input`	Validate input against a model’s schema
`POST /v1/models/:provider/:model/validate-output`	Validate output against a model’s schema
`GET /v1/models/tags`	List all available tags
`GET /v1/models/price-tiers`	List all price tiers

Feedback

POST https://api.greatrouterai.com/v1/auto/feedback

Submit feedback on routing accuracy to improve the classifier.

Field	Type	Description
`prompt`	string	The original prompt
`classified_task`	string	The task the router classified
`correct`	boolean	Whether the classification was correct
`confidence`	number	Confidence score (0–1)
`suggested_task`	string	(optional) What the correct task should have been

Money conventions

All API-facing money fields are in US dollars (e.g. budget_dollars, estimated_cost_dollars). Internal services and database columns use cents (e.g. budget_cents, estimated_cost_cents, cost_cents). The API layer converts between the two representations automatically.
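
Clients never need to do this conversion themselves, but the relationship is easy to illustrate. The sketch below uses Decimal to avoid binary floating-point drift; it is not the API layer's actual code, and it assumes cent values may be fractional, since the example costs in this reference are well below one cent.

```python
from decimal import Decimal


def dollars_to_cents(dollars) -> Decimal:
    """Convert a dollar amount (API-facing) to cents (internal)."""
    return Decimal(str(dollars)) * 100


def cents_to_dollars(cents) -> Decimal:
    """Convert a cent amount (internal) to dollars (API-facing)."""
    return Decimal(str(cents)) / 100
```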

SDKs

GreatRouter provides official SDKs for TypeScript and Python.

TypeScript

npm install @greatrouter/sdk

import { GreatRouter } from '@greatrouter/sdk'

const client = new GreatRouter('pk_live_...')

// Auto route
const result = await client.autoRoute({
  prompt: 'Generate an image of a cat',
  input: { messages: [{ role: 'user', content: 'Generate an image of a cat' }] },
  task: 'image',
  optimization: 'price-optimized',
  budget_dollars: 0.05,
})

// Auto suggest (no inference)
const suggestions = await client.autoSuggest({
  prompt: 'Write a short poem',
  task: 'text',
  limit: 5,
  budget_dollars: 0.01,
})

// Compare prices
const comparison = await client.priceComparison({
  task: 'text',
  optimization: 'balanced',
})

// List models
const models = await client.listModels({ task_type: 'text_generation' })

// Get model details
const model = await client.getModel('meta', 'llama-3.3-70b-instruct-fp8-fast')

// Preferences
const prefs = await client.getPreferences()
await client.setPreference({ key: 'default_optimization', value: 'price-optimized' })
// Named differently from `suggestions` above to avoid redeclaring a const
const preferenceSuggestions = await client.getPreferenceSuggestions()

// Model health
const health = await client.getModelHealth(['openai/gpt-5', 'anthropic/claude-sonnet-4'])

Python

pip install greatrouter

from greatrouter import GreatRouter

client = GreatRouter("pk_live_...")

# Auto route
result = client.auto_route({
    "prompt": "Generate an image of a cat",
    "input": {"messages": [{"role": "user", "content": "Generate an image of a cat"}]},
    "task": "image",
    "optimization": "price-optimized",
    "budget_dollars": 0.05,
})

# Auto suggest (no inference)
suggestions = client.auto_suggest({
    "prompt": "Write a short poem",
    "task": "text",
    "limit": 5,
    "budget_dollars": 0.01,
})

# Compare prices
comparison = client.price_comparison({
    "task": "text",
    "optimization": "balanced",
})

# List models
models = client.list_models({"task_type": "text_generation"})

# Get model details
model = client.get_model("meta", "llama-3.3-70b-instruct-fp8-fast")

# Preferences
prefs = client.get_preferences()
client.set_preference("default_optimization", "price-optimized")
preference_suggestions = client.get_preference_suggestions()

# Model health
health = client.get_model_health(["openai/gpt-5", "anthropic/claude-sonnet-4"])

For async usage, use AsyncGreatRouter with httpx:

import asyncio
from greatrouter import AsyncGreatRouter

async def main():
    async with AsyncGreatRouter("pk_live_...") as client:
        result = await client.auto_route({...})  # same payload as the sync client

asyncio.run(main())

SDK Error Handling

Both SDKs raise typed errors:

Error	When
`AuthenticationError`	401 — invalid or missing API key
`RateLimitError`	429 — rate limit exceeded
`ValidationError`	400 — invalid request body
`GreatRouterError`	Any other non-2xx response

Error codes

Code	Meaning
`400`	Invalid request body or parameters
`401`	Authentication failed — check your API key
`404`	Unknown model or endpoint
`429`	Rate limit exceeded
`500`	Internal server error — try again later
`502`	Upstream provider error — model temporarily unavailable
