# API Reference
BrainstormRouter exposes an OpenAI-compatible API at `https://api.brainstormrouter.com/v1`. All endpoints accept and return JSON in the same format as the OpenAI API.
## Authentication

All requests require a Bearer token in the `Authorization` header:

```
Authorization: Bearer br-your-api-key
```
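As a minimal sketch, the same header can be attached from any HTTP client. Here is a standard-library Python example (the key value is a placeholder; `authed_request` is an illustrative helper, not part of any SDK):

```python
import urllib.request

API_KEY = "br-your-api-key"  # placeholder; substitute your real key
BASE_URL = "https://api.brainstormrouter.com/v1"

def authed_request(path, data=None):
    """Build a request with the Bearer token and JSON content type attached."""
    req = urllib.request.Request(BASE_URL + path, data=data)
    req.add_header("Authorization", "Bearer " + API_KEY)
    req.add_header("Content-Type", "application/json")
    return req

req = authed_request("/models")
```

Sending the request is then a matter of passing it to `urllib.request.urlopen`.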
## Endpoints

### POST /v1/chat/completions

Create a chat completion. Supports streaming.

```json
{
  "model": "auto",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"}
  ],
  "temperature": 0.7,
  "max_tokens": 4096,
  "stream": true,
  "tools": [],
  "tool_choice": "auto"
}
```
Model values: Use `auto` for intelligent routing, or specify a model directly: `claude-opus-4-6`, `claude-sonnet-4-6`, `gpt-5.4`, `gemini-3.1-pro`, `gemini-3.1-flash`, `deepseek-v3`, `kimi-k2.5`, etc.
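A minimal sketch of calling this endpoint from Python using only the standard library (the helper names are illustrative, and the payload mirrors the request body shown above):

```python
import json
import urllib.request

def build_chat_payload(user_message, model="auto", stream=False):
    """Assemble a chat.completions request body as shown above."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
        "max_tokens": 4096,
        "stream": stream,
    }

def chat(api_key, user_message):
    """POST the payload and return the parsed JSON response."""
    req = urllib.request.Request(
        "https://api.brainstormrouter.com/v1/chat/completions",
        data=json.dumps(build_chat_payload(user_message)).encode(),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Because the format is OpenAI-compatible, existing OpenAI client libraries pointed at the base URL above should also work.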
Response (non-streaming):

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "claude-opus-4-6",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Hello! How can I help?"},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  }
}
```
Streaming: Set `"stream": true` to receive Server-Sent Events. Each event contains a `delta` object with incremental content.
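For illustration, the streamed deltas can be reassembled by concatenating each chunk's `content`; a self-contained sketch assuming OpenAI-style `data:` lines and the `[DONE]` sentinel:

```python
import json

def collect_stream(sse_lines):
    """Reassemble assistant text from 'data: {...}' SSE lines."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":  # OpenAI-style end-of-stream sentinel
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))  # first delta may carry only the role
    return "".join(parts)

events = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print(collect_stream(events))  # → Hello!
```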
### GET /v1/models

List all available models with metadata.

```bash
curl https://api.brainstormrouter.com/v1/models \
  -H "Authorization: Bearer br-your-api-key"
```
Returns model IDs, providers, context window sizes, pricing, and capability scores.
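The metadata makes it easy to select a model client-side; a sketch that picks the cheapest model meeting a context-window requirement (the field names and prices below are invented for illustration — the actual response schema may differ):

```python
# Hypothetical response excerpt; field names and prices are illustrative only.
sample = {
    "data": [
        {"id": "claude-sonnet-4-6", "context_window": 200000, "input_price_per_mtok": 3.0},
        {"id": "gemini-3.1-flash", "context_window": 1000000, "input_price_per_mtok": 0.3},
    ]
}

def cheapest_with_context(models, min_context):
    """Return the id of the lowest-priced model with at least min_context tokens."""
    candidates = [m for m in models["data"] if m["context_window"] >= min_context]
    return min(candidates, key=lambda m: m["input_price_per_mtok"])["id"]

print(cheapest_with_context(sample, 500000))  # → gemini-3.1-flash
```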
### POST /v1/embeddings

Generate embeddings using the best available embedding model.

```json
{
  "model": "auto",
  "input": "The quick brown fox jumps over the lazy dog"
}
```
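Embedding vectors are typically compared with cosine similarity; a self-contained sketch (the vectors here are tiny stand-ins for real embedding output):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

v1 = [0.1, 0.3, 0.5]
v2 = [0.2, 0.1, 0.4]
print(round(cosine_similarity(v1, v2), 3))
```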
### GET /v1/intelligence/recommendations

Get model recommendations for a task description.

```bash
curl "https://api.brainstormrouter.com/v1/intelligence/recommendations?task=code+review" \
  -H "Authorization: Bearer br-your-api-key"
```
### GET /v1/intelligence/rankings

Current model rankings by task type, updated in real time by Thompson sampling.
### POST /v1/intelligence/cost-forecast

Forecast costs for a planned workload.

```json
{
  "tasks": [
    {"type": "code_generation", "estimated_tokens": 50000},
    {"type": "review", "estimated_tokens": 20000}
  ],
  "strategy": "combined"
}
```
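For intuition, a forecast of this shape reduces to summing tokens × per-token price for each task; a local sketch with made-up prices (the real endpoint uses live pricing and routing strategy):

```python
# Prices per 1M tokens are invented for illustration only.
PRICE_PER_MTOK = {"code_generation": 15.0, "review": 3.0}

def forecast_cost(tasks):
    """Estimate total cost in dollars for a list of tasks as in the request above."""
    total = 0.0
    for t in tasks:
        total += t["estimated_tokens"] / 1_000_000 * PRICE_PER_MTOK[t["type"]]
    return round(total, 4)

tasks = [
    {"type": "code_generation", "estimated_tokens": 50000},
    {"type": "review", "estimated_tokens": 20000},
]
print(forecast_cost(tasks))  # → 0.81
```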
### GET /v1/intelligence/health

Provider health status with latency percentiles, error rates, and circuit breaker states.
### GET /v1/intelligence/patterns

Community usage patterns aggregated across all users (anonymized). Shows which models are trending for specific task types.
## Rate Limits

| Plan | Requests/min | Tokens/min |
|------|-------------|------------|
| Free | 20 | 100K |
| Pro | 200 | 2M |
| Team | 1000 | 10M |
| Enterprise | Custom | Custom |

Rate limit headers are included in every response: `X-RateLimit-Remaining` and `X-RateLimit-Reset`.
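A hedged sketch of honoring these headers client-side (it assumes `X-RateLimit-Reset` counts seconds until the window resets — check your actual response headers to confirm the unit; the function name is illustrative):

```python
import time

def wait_if_throttled(headers, sleep=time.sleep):
    """Pause until the rate-limit window resets when no requests remain.

    Returns True if a wait occurred, False otherwise. The sleep function is
    injectable so the behavior can be tested without actually sleeping.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining <= 0:
        reset_in = float(headers.get("X-RateLimit-Reset", 1))
        sleep(reset_in)
        return True
    return False
```

Calling this after every response gives a simple, conservative throttle; production clients usually add jitter and a retry cap on top.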
## Error Handling

Errors follow the OpenAI error format:

```json
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_error",
    "code": 429
  }
}
```
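A sketch of surfacing these errors in a client (the exception class and helper are illustrative, not part of any SDK):

```python
import json

class BrainstormRouterError(Exception):
    """Illustrative wrapper for the OpenAI-style error payload."""
    def __init__(self, message, error_type, code):
        super().__init__("[%s] %s: %s" % (code, error_type, message))
        self.error_type = error_type
        self.code = code

def raise_for_error(body):
    """Parse a response body; raise if it carries an error object."""
    payload = json.loads(body)
    if "error" in payload:
        err = payload["error"]
        raise BrainstormRouterError(err["message"], err["type"], err["code"])
    return payload
```

Because the shape matches the OpenAI format, error-handling code written for OpenAI clients should carry over with little change.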