Skip to content

Provider Routing

The provider routing layer abstracts model selection behind a unified interface. Operators specify a model name; the router resolves the provider, authenticates, and routes the request -- all transparently.

Provider Discovery

On startup, Forge scans configuration files and environment variables to detect available providers. Each provider declares its capabilities: supported models, rate limits, and feature flags (tool use, vision, extended context).

Supported Providers

ProviderRoutingBest For
OpenCode GoDirect APIPrimary execution engine
AnthropicDirect APIClaude models (Sonnet, Opus)
OpenAICompatibility shimGPT models
Ollama CloudCompatibility shimHosted open-weight models
OpenRouterCompatibility shimMulti-provider aggregation
LM StudioCompatibility shimFully local execution
Cloudflare AICompatibility shimEdge-deployed models

OpenAI Compatibility Shim

Non-Anthropic providers use an OpenAI-compatible API shim. The shim translates between Anthropic's native message format (used internally by the agent) and OpenAI's chat completion format. This enables drop-in use of any provider that supports the OpenAI API contract without rewriting agent instructions.

Transformations handled by the shim:

  • Message role mapping (user / assistant to user / assistant)
  • Tool definition format conversion
  • Streaming response normalization
  • Token usage reporting

Credential Pooling

Multi-account support distributes load across API keys to avoid rate limits.

StrategyBehavior
fill_firstUse first account until rate limited, then rotate
round_robinCycle through accounts on each request
randomSelect account at random
least_usedTrack token usage per account, use the lowest

Credentials are stored in environment variables with indexed suffixes (PROVIDER_API_KEY_0, PROVIDER_API_KEY_1). The pooler rotates through available keys based on the configured strategy.

Model Validation

Before routing a request, the system validates:

  1. The requested model exists in the selected provider's catalog
  2. The provider supports required features (tool use, vision, etc.)
  3. The account has not exceeded its rate limit

If validation fails, the router falls back to the next available provider or notifies the operator.

Model Family Detection

The router detects model families to gate features. It identifies whether a model supports:

  • Tool calling / function calling
  • Image input (vision)
  • Extended context windows (>128K)
  • Parallel tool execution

Feature availability is surfaced to the agent so it can adapt its behavior -- for example, falling back to single-tool calls when the model does not support parallel execution.

Released under the MIT License.