Providers
Kyber supports multiple LLM providers through LiteLLM. You can pin a specific provider so that Kyber uses only that provider’s key and does not fall back to another one when several are configured.
Supported providers
| Provider | Config key | Get an API key |
|---|---|---|
| OpenRouter | openrouter | openrouter.ai/keys |
| Anthropic | anthropic | console.anthropic.com |
| OpenAI | openai | platform.openai.com |
| Google Gemini | gemini | aistudio.google.com |
| DeepSeek | deepseek | platform.deepseek.com |
| Groq | groq | console.groq.com |
| Zhipu | zhipu | open.bigmodel.cn |
| vLLM / Custom | vllm | Self-hosted |
Configuration
Set your provider and API key in ~/.kyber/config.json:
{
  "agents": {
    "defaults": {
      "provider": "openrouter",
      "model": "google/gemini-3-flash-preview"
    }
  },
  "providers": {
    "openrouter": {
      "apiKey": "sk-or-v1-your-key-here"
    }
  }
}
Using OpenRouter (recommended)
OpenRouter gives you access to models from every major provider through a single API key. This is the default and easiest option.
{
  "agents": {
    "defaults": {
      "provider": "openrouter",
      "model": "anthropic/claude-sonnet-4-20250514"
    }
  },
  "providers": {
    "openrouter": { "apiKey": "sk-or-v1-xxx" }
  }
}
You can use any model available on OpenRouter by setting the model field to the OpenRouter model ID.
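Switching models only means changing one field, so it can be scripted. The following is a minimal sketch, assuming the config layout shown above; `set_default_model` is a hypothetical helper written for illustration and is not part of Kyber.

```python
import json
from pathlib import Path


def set_default_model(config_path: Path, model_id: str) -> None:
    """Point agents.defaults at OpenRouter and the given OpenRouter model ID.

    Hypothetical helper: assumes the JSON layout shown in this page.
    """
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # Create the nested objects if they are missing, then update in place
    # so other settings (such as the providers block) are left untouched.
    defaults = config.setdefault("agents", {}).setdefault("defaults", {})
    defaults["provider"] = "openrouter"
    defaults["model"] = model_id
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
```

For example, `set_default_model(Path.home() / ".kyber" / "config.json", "anthropic/claude-sonnet-4-20250514")` would rewrite the file with that model selected.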
Using a provider directly
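To call a provider’s API directly instead of routing through OpenRouter (lower latency, no intermediary), set provider to that provider’s config key and add its API key: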
{
  "agents": {
    "defaults": {
      "provider": "deepseek",
      "model": "deepseek-chat"
    }
  },
  "providers": {
    "deepseek": { "apiKey": "sk-xxx" }
  }
}
Using a local model (vLLM)
If you’re running a local model via vLLM or any other OpenAI-compatible endpoint, use the vllm provider and point apiBase at your server:
{
  "agents": {
    "defaults": {
      "provider": "vllm",
      "model": "meta-llama/Llama-3-8b"
    }
  },
  "providers": {
    "vllm": {
      "apiKey": "none",
      "apiBase": "http://localhost:8000/v1"
    }
  }
}
Provider fallback
When provider is set explicitly, Kyber only uses that provider’s API key. If you leave provider empty, Kyber checks keys in this order: OpenRouter → DeepSeek → Anthropic → OpenAI → Gemini → Zhipu → Groq → vLLM, and uses the first one it finds.
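The selection rule above can be sketched in a few lines of Python. This is an illustration of the documented order, not Kyber’s actual implementation: `resolve_provider` and `FALLBACK_ORDER` are hypothetical names, and returning `None` when a pinned provider has no key is an assumption about behavior the text does not specify.

```python
from typing import Optional

# Key-lookup order quoted from the paragraph above, written as config keys.
FALLBACK_ORDER = [
    "openrouter", "deepseek", "anthropic", "openai",
    "gemini", "zhipu", "groq", "vllm",
]


def resolve_provider(config: dict) -> Optional[str]:
    """Return the config key of the provider to use, or None if no key is found."""
    providers = config.get("providers", {})
    pinned = config.get("agents", {}).get("defaults", {}).get("provider")

    if pinned:
        # Pinned: only this provider's key is considered; no fallback.
        # Returning None for a missing key is an assumption of this sketch.
        has_key = bool(providers.get(pinned, {}).get("apiKey"))
        return pinned if has_key else None

    # Not pinned: the first provider in the documented order with a key wins.
    for name in FALLBACK_ORDER:
        if providers.get(name, {}).get("apiKey"):
            return name
    return None
```

With provider left empty and keys configured for both Groq and Anthropic, this sketch picks `anthropic`, because Anthropic comes before Groq in the order. Pinning `"provider": "groq"` makes it pick `groq` regardless of which other keys exist.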
Retries
Kyber’s provider layer automatically retries transient errors, such as rate limits, timeouts, and malformed upstream responses, using exponential backoff.
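For readers unfamiliar with the technique, here is a generic sketch of retrying with exponential backoff. It is not Kyber’s code: the `TransientError` class, the `call_with_retries` helper, the attempt count, the delay values, and the jitter are all assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (rate limit, timeout, bad response)."""


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientError with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (base, 2*base, 4*base, ...), capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Random jitter spreads out retries from concurrent callers.
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")
```

Errors other than `TransientError` propagate immediately, which mirrors the usual design choice: only failures that are likely to succeed on a later attempt are worth waiting for.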