Provider Routing

The provider routing layer abstracts model selection behind a unified interface. Operators specify a model name; the router resolves the provider, authenticates, and routes the request -- all transparently.

Provider Discovery

On startup, Forge scans configuration files and environment variables to detect available providers. Each provider declares its capabilities: supported models, rate limits, and feature flags (tool use, vision, extended context).

Supported Providers

Provider	Routing	Best For
OpenCode Go	Direct API	Primary execution engine
Anthropic	Direct API	Claude models (Sonnet, Opus)
OpenAI	Compatibility shim	GPT models
Ollama Cloud	Compatibility shim	Hosted open-weight models
OpenRouter	Compatibility shim	Multi-provider aggregation
LM Studio	Compatibility shim	Fully local execution
Cloudflare AI	Compatibility shim	Edge-deployed models

OpenAI Compatibility Shim

Non-Anthropic providers use an OpenAI-compatible API shim. The shim translates between Anthropic's native message format (used internally by the agent) and OpenAI's chat completion format. This enables drop-in use of any provider that supports the OpenAI API contract without rewriting agent instructions.

Transformations handled by the shim:

Message role mapping (user / assistant to user / assistant)
Tool definition format conversion
Streaming response normalization
Token usage reporting

Credential Pooling

Multi-account support distributes load across API keys to avoid rate limits.

Strategy	Behavior
`fill_first`	Use first account until rate limited, then rotate
`round_robin`	Cycle through accounts on each request
`random`	Select account at random
`least_used`	Track token usage per account, use the lowest

Credentials are stored in environment variables with indexed suffixes (PROVIDER_API_KEY_0, PROVIDER_API_KEY_1). The pooler rotates through available keys based on the configured strategy.

Model Validation

Before routing a request, the system validates:

The requested model exists in the selected provider's catalog
The provider supports required features (tool use, vision, etc.)
The account has not exceeded its rate limit

If validation fails, the router falls back to the next available provider or notifies the operator.

Model Family Detection

The router detects model families to gate features. It identifies whether a model supports:

Tool calling / function calling
Image input (vision)
Extended context windows (>128K)
Parallel tool execution

Feature availability is surfaced to the agent so it can adapt its behavior -- for example, falling back to single-tool calls when the model does not support parallel execution.

Provider Routing ​

Provider Discovery ​

Supported Providers ​

OpenAI Compatibility Shim ​

Credential Pooling ​

Model Validation ​

Model Family Detection ​