Provider Reasoning Behavior
The Reasoning / Thinking guide describes how LLM-Rosetta normalizes reasoning parameters through its IR layer. This page documents how each provider actually behaves — which effort values it accepts, how it handles disabled reasoning, what thinking_type semantics it enforces, and how reasoning metadata survives cross-provider round-trips.
All data below comes from empirical probing against live provider APIs, not documentation alone. See Probing methodology for details.
Shim configuration reference
Each provider's shim YAML encodes the behaviors documented here. See Provider Shims for the shim architecture and provider.yaml format.
Effort Value Acceptance
Which reasoning_effort / output_config.effort values each provider accepts. Values not listed are rejected with a 400 error unless otherwise noted.
Official Upstream APIs
Probed 2026-06-10 against official endpoints with real API keys (#185).
| Provider | Endpoint | none | minimal | low | medium | high | xhigh | max |
|---|---|---|---|---|---|---|---|---|
| OpenAI | Chat (reasoning_effort) | — | — | ✅ | ✅ | ✅ | — | — |
| OpenAI | Responses (reasoning.effort) | — | — | ✅ | ✅ | ✅ | — | — |
| Anthropic | Messages (output_config.effort) | — | — | ✅ | ✅ | ✅ | ✅ | ✅ |
| MiniMax | Anthropic (output_config.effort) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| MiniMax | Chat (reasoning_effort) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| OpenRouter | Chat (reasoning_effort) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Volcengine | Chat (reasoning_effort) | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| DeepSeek | Chat (reasoning_effort) | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ |
Note
Google GenAI does not use effort strings. It controls reasoning via thinking_config.thinking_budget (integer token count) and thinking_config.thinking_level ("minimal" / "low" / "medium" / "high").
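For pre-flight validation, the acceptance table can be transcribed into a lookup structure. The following is a minimal sketch, not LLM-Rosetta's actual API: `ACCEPTED_EFFORTS` and `is_effort_accepted` are hypothetical names, the lowercase provider/endpoint keys are chosen only for this example, and cells marked "—" are treated the same as "❌" (not accepted), which is an assumption.

```python
# Hypothetical lookup table transcribed from the acceptance table above.
# Assumption: both "—" and "❌" cells are treated as "not accepted".
ALL_EFFORTS = {"none", "minimal", "low", "medium", "high", "xhigh", "max"}

ACCEPTED_EFFORTS = {
    ("openai", "chat"): {"low", "medium", "high"},
    ("openai", "responses"): {"low", "medium", "high"},
    ("anthropic", "messages"): {"low", "medium", "high", "xhigh", "max"},
    ("minimax", "anthropic"): set(ALL_EFFORTS),
    ("minimax", "chat"): set(ALL_EFFORTS),
    ("openrouter", "chat"): ALL_EFFORTS - {"max"},
    ("volcengine", "chat"): {"minimal", "low", "medium", "high"},
    ("deepseek", "chat"): {"low", "medium", "high", "xhigh", "max"},
}


def is_effort_accepted(provider: str, endpoint: str, effort: str) -> bool:
    """Return True if the table lists `effort` as accepted for this endpoint."""
    return effort in ACCEPTED_EFFORTS.get((provider, endpoint), set())
```

A caller could use such a check to fail fast locally instead of waiting for the provider's 400 response.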
Argo Gateway
Probed 2026-05-23 and 2026-06-09 (#220).
Argo proxies requests to upstream providers but applies its own validation. The Anthropic endpoint uses output_config.effort:
| Model | effort alone | adaptive + effort | effort=low |
|---|---|---|---|
| haiku45 / sonnet45 / opus41 / opus45 | ✅ | ❌ 400 | ✅ |
| sonnet46 / opus46 | ✅ | ✅ | ✅ |
| opus47 | ✅ | ✅ | ✅ |
Only sonnet46, opus46, and opus47 support thinking + output_config coexistence.
Disabled Behavior
How each provider disables reasoning. The shim disabled field controls which strategy LLM-Rosetta uses.
| Provider | Strategy | Shim disabled | Behavior |
|---|---|---|---|
| OpenAI (Chat / Responses) | Omit thinking field | omit | No thinking / reasoning object sent |
| Anthropic | Explicit disable | thinking_disabled | thinking: { type: "disabled" } |
| Google GenAI | Zero budget | thinking_budget_zero | thinking_config: { thinking_budget: 0 } |
| DeepSeek | Explicit disable | thinking_disabled | thinking: { type: "disabled" } |
| Volcengine (Chat) | Explicit disable | thinking_disabled | thinking: { type: "disabled" } |
| Volcengine (Responses) | Omit thinking field | omit | No thinking / reasoning object sent |
| MiniMax (Anthropic) | Explicit disable | thinking_disabled | thinking: { type: "disabled" } |
| MiniMax (Chat) | Explicit disable | thinking_disabled | thinking: { type: "disabled" } |
| OpenRouter | Omit | omit | No thinking object sent |
| Argo (Anthropic) | Explicit disable | thinking_disabled | thinking: { type: "disabled" } |
| Argo (OpenAI Chat) | Omit | omit | No thinking object sent |
Volcengine and DeepSeek reject none
Sending reasoning_effort: "none" to Volcengine or DeepSeek returns a 400 error. To disable reasoning on these providers, use thinking.type: "disabled" — which is what the shim's disabled: thinking_disabled strategy does automatically when the IR mode is disabled.
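The three strategies can be sketched as a small function that returns the request fragment to merge into the outbound payload. This is an illustrative sketch only: `disabled_payload` is a hypothetical helper, not a function from LLM-Rosetta, and the fragments simply mirror the "Behavior" column above.

```python
def disabled_payload(strategy: str) -> dict:
    """Return the payload fragment that disables reasoning for a given shim strategy.

    Hypothetical helper; the strategy names match the shim `disabled` values.
    """
    if strategy == "omit":
        # Nothing is sent: no thinking / reasoning object at all.
        return {}
    if strategy == "thinking_disabled":
        return {"thinking": {"type": "disabled"}}
    if strategy == "thinking_budget_zero":
        return {"thinking_config": {"thinking_budget": 0}}
    raise ValueError(f"unknown disabled strategy: {strategy!r}")
```

Returning a fragment rather than mutating the request keeps the strategy choice separate from payload assembly, which is one reasonable design among several.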
Thinking Type
Whether the provider uses thinking.type = "enabled" (requires budget_tokens) or "adaptive" (model decides). Some providers and models only accept one or the other.
Provider-Level Defaults

| Provider | thinking_type | Notes |
|---|---|---|
| OpenAI | — | No thinking.type field; reasoning is implicit on reasoning models |
| Anthropic | (both) | Accepts enabled and adaptive; enabled requires budget_tokens |
| Google GenAI | — | Uses thinking_budget integer, not type strings |
| DeepSeek | — | Uses thinking.type but converter handles auto → adaptive mapping |
| Volcengine | enabled | Rejects adaptive; shim forces thinking_type: enabled |
| MiniMax (Chat) | adaptive | Rejects enabled; shim forces thinking_type: adaptive |
| Argo (Anthropic) | enabled | Default; but see model-level overrides below |
Argo Anthropic Model-Level Overrides
Probed 2026-05-23 (#220). Argo's Anthropic endpoint has model-specific thinking behavior:
| Model | enabled (no budget) | enabled + budget_tokens | adaptive | Notes |
|---|---|---|---|---|
| haiku45 / sonnet45 | ❌ budget required | ✅ | ❌ | enabled + budget only |
| opus41 / opus45 | ❌ budget required | ✅ | ❌ | enabled + budget only |
| sonnet46 / opus46 | ❌ budget required | ✅ | ✅ | Both modes accepted |
| opus47 | ❌ type not accepted | ❌ type not accepted | ✅ | adaptive only |
opus47 shim override
The argo--anthropic shim declares thinking_type: enabled at the provider level and adds a model_overrides entry that sets thinking_type: adaptive for claudeopus47. See PR #256.
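One way to picture how a per-model override is resolved is as a shallow merge of the model's entry over the provider-level fields. The sketch below assumes that merge semantics; `effective_reasoning_config` is a hypothetical name, and the real shim loader may resolve overrides differently.

```python
def effective_reasoning_config(reasoning: dict, model_id: str) -> dict:
    """Merge a model's override entry over the provider-level reasoning fields.

    Assumption: overrides are a shallow merge keyed by upstream model ID.
    """
    base = {k: v for k, v in reasoning.items() if k != "model_overrides"}
    overrides = reasoning.get("model_overrides", {}).get(model_id, {})
    return {**base, **overrides}


# Example mirroring the argo--anthropic shim described above.
argo_anthropic = {
    "disabled": "thinking_disabled",
    "thinking_type": "enabled",
    "model_overrides": {"claudeopus47": {"thinking_type": "adaptive"}},
}
```

With this data, `claudeopus47` resolves to `adaptive` while every other model keeps the provider-level `enabled`.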
Automatic Fallback
When a shim declares thinking_type: enabled but the request has no budget_tokens (required by Anthropic for type: "enabled"), the converter automatically falls back to type: "adaptive". This prevents invalid payloads without requiring the client to know about provider constraints.
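The fallback rule can be expressed in a few lines. This is a sketch under stated assumptions rather than the converter's real code: `resolve_thinking` is a hypothetical function, and it only models the case described above (forced `enabled` without a budget becomes `adaptive`).

```python
from typing import Optional


def resolve_thinking(forced_type: str, budget_tokens: Optional[int]) -> dict:
    """Build the outbound `thinking` object for a shim-forced thinking_type.

    Hypothetical helper illustrating the automatic fallback.
    """
    if forced_type == "enabled":
        if budget_tokens is None:
            # type "enabled" requires budget_tokens, so fall back to adaptive.
            return {"type": "adaptive"}
        return {"type": "enabled", "budget_tokens": budget_tokens}
    if forced_type == "adaptive":
        return {"type": "adaptive"}
    raise ValueError(f"unsupported thinking_type: {forced_type!r}")
```

For example, `resolve_thinking("enabled", None)` yields an adaptive payload, while `resolve_thinking("enabled", 2048)` keeps `enabled` with its budget.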
Provider Metadata Round-Trip
Reasoning blocks carry provider-specific metadata that must survive cross-provider conversions. LLM-Rosetta preserves these in the IR provider_metadata field on content parts.
| Metadata Field | Origin Provider | Carried On | Purpose |
|---|---|---|---|
| signature | Anthropic | ReasoningPart | Cryptographic signature for thinking block replay |
| thoughtSignature | Google GenAI | provider_metadata.google | Equivalent of Anthropic signature for Gemini 2.5+ |
| encrypted_content | Anthropic | provider_metadata.anthropic | Encrypted reasoning content (opaque to converters) |
| reasoning_details | OpenAI | provider_metadata.openai | Extended reasoning metadata |
Cross-Provider Signature Handling
When routing between providers (e.g. Anthropic client → Google backend), signatures from the source provider must be preserved so that:
- The source client can replay conversation history with valid signatures
- The target provider receives its own signature format (if applicable)
LLM-Rosetta achieves this by storing signatures in provider_metadata on each content part during IR conversion. The Anthropic converter serializes provider_metadata as _provider_metadata on thinking, tool_use, tool_result, and text blocks (PR #257, PR #263).
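The round-trip described above can be illustrated with a pair of conversion functions. This is a simplified sketch: the IR part is modeled as a plain dict with `text`, `signature`, and `provider_metadata` keys, which is an assumption about its shape, and both function names are hypothetical. Only the `_provider_metadata` key on the Anthropic block comes from the text above.

```python
def reasoning_part_to_thinking_block(part: dict) -> dict:
    """Serialize a (simplified, assumed) IR reasoning part to an Anthropic thinking block."""
    block = {"type": "thinking", "thinking": part["text"]}
    if part.get("signature"):
        block["signature"] = part["signature"]
    if part.get("provider_metadata"):
        # Foreign-provider metadata rides along under a private key.
        block["_provider_metadata"] = part["provider_metadata"]
    return block


def thinking_block_to_reasoning_part(block: dict) -> dict:
    """Parse an Anthropic thinking block back into the simplified IR part."""
    part = {"type": "reasoning", "text": block["thinking"]}
    if block.get("signature"):
        part["signature"] = block["signature"]
    if block.get("_provider_metadata"):
        part["provider_metadata"] = block["_provider_metadata"]
    return part
```

Under these assumptions, a part carrying a Google `thoughtSignature` in `provider_metadata.google` comes back unchanged after passing through the Anthropic block format.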
Unsigned Reasoning Blocks
Some clients send conversation history containing thinking blocks without valid signatures. This happens when:
- The client strips or never received signatures (e.g. Claude CLI with signature: "")
- The conversation crossed providers and the original signature is not applicable
The Problem
Providers that validate signatures (like Argo's Anthropic endpoint) reject these blocks:
messages.61.content.0.thinking.signature: Field required
messages.9.content.0: Invalid `signature` in `thinking` block
Shim Policy: unsigned_reasoning_blocks
ReasoningCapability supports an unsigned_reasoning_blocks field with two values:
| Value | Behavior |
|---|---|
| as_is (default) | Forward unsigned reasoning blocks as-is. The target provider decides whether to accept or reject them. |
| preserve | Remove unsigned reasoning blocks from outbound messages. The reasoning content is preserved in provider_metadata.anthropic.unsigned_reasoning_blocks on the IR part for downstream consumers. |
Currently only argo--anthropic uses preserve. Direct Anthropic API calls use as_is because the official API handles unsigned blocks differently from Argo's proxy layer.
When preserve filters all reasoning parts from an assistant message, the message is skipped entirely (with a warning) rather than sending an empty content: [] array.
See #268 and PR #269 for the implementation.
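The filtering behavior of the two policies can be sketched as follows. This is not the implementation from PR #269: `apply_unsigned_policy` and `is_unsigned_thinking` are hypothetical names, messages are modeled as Anthropic-style dicts, and a block counts as unsigned only when its `signature` is missing or empty (an invalid but non-empty signature cannot be detected locally). Stashing the removed content in `provider_metadata` is omitted for brevity.

```python
import logging

logger = logging.getLogger(__name__)


def is_unsigned_thinking(block: dict) -> bool:
    """Assumption: 'unsigned' means a thinking block with no or empty signature."""
    return block.get("type") == "thinking" and not block.get("signature")


def apply_unsigned_policy(messages: list, policy: str = "as_is") -> list:
    """Apply the unsigned_reasoning_blocks policy to outbound messages."""
    if policy == "as_is":
        return list(messages)  # forward untouched; the provider decides
    if policy != "preserve":
        raise ValueError(f"unknown policy: {policy!r}")
    result = []
    for message in messages:
        content = message.get("content")
        if message.get("role") != "assistant" or not isinstance(content, list):
            result.append(message)
            continue
        kept = [b for b in content if not is_unsigned_thinking(b)]
        if not kept and len(kept) != len(content):
            # Everything was filtered out: skip instead of sending content: [].
            logger.warning("skipping assistant message left empty by filtering")
            continue
        result.append({**message, "content": kept})
    return result
```

Signed thinking blocks and all non-assistant messages pass through unchanged under either policy.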
Shim Configuration Reference
How ReasoningCapability fields in provider.yaml map to the behaviors documented above.
    reasoning:
      disabled: thinking_disabled          # "omit" | "thinking_disabled" | "thinking_budget_zero"
      effort_field: output_config.effort   # Where effort value is placed
      thinking_type: enabled               # Force thinking.type: "enabled" | "adaptive"
      max_effort: high                     # Cap IR effort at this level
      unsigned_reasoning_blocks: preserve  # "as_is" | "preserve"
      effort_map:                          # IR effort → provider effort string
        minimal: low
        low: low
        medium: medium
        high: high
        xhigh: xhigh
        max: max
      model_overrides:                     # Per-model overrides (keyed by upstream model ID)
        claudeopus47:
          thinking_type: adaptive
Field Reference

| Field | Type | Default | Description |
|---|---|---|---|
| disabled | string | "omit" | Strategy when IR mode is disabled. omit: don't send thinking field. thinking_disabled: send type: "disabled". thinking_budget_zero: send thinking_budget: 0. |
| effort_field | string | "reasoning_effort" | Request field path for effort value. "none" to suppress effort emission. |
| thinking_type | string | (none) | Force outbound thinking.type. Applied after converter's own mapping. |
| max_effort | string | (none) | Highest IR effort level to emit. Higher levels are clamped to this value. |
| unsigned_reasoning_blocks | string | "as_is" | Policy for outbound reasoning blocks without valid signatures. |
| effort_map | map | (identity) | Maps IR effort levels to provider-specific strings. Unmapped levels are dropped with a warning. |
| model_overrides | map | (none) | Per-model overrides for any of the above fields. Keyed by the upstream model ID used in the shim's model_id_field. |
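The interaction between max_effort and effort_map can be sketched as a clamp followed by a lookup. The ordering of the two steps is an assumption (the field reference does not state it), as are the name `map_effort` and the `EFFORT_ORDER` list, which follows the column order of the acceptance table.

```python
import logging
from typing import Dict, Optional

logger = logging.getLogger(__name__)

# Assumed ordering of IR effort levels, lowest to highest.
EFFORT_ORDER = ["minimal", "low", "medium", "high", "xhigh", "max"]


def map_effort(
    ir_effort: str,
    max_effort: Optional[str] = None,
    effort_map: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Clamp an IR effort to max_effort, then translate it via effort_map.

    Returns None when the (clamped) level has no mapping, i.e. it is dropped.
    """
    if ir_effort not in EFFORT_ORDER:
        raise ValueError(f"unknown IR effort: {ir_effort!r}")
    level = ir_effort
    if max_effort is not None:
        if EFFORT_ORDER.index(level) > EFFORT_ORDER.index(max_effort):
            level = max_effort  # clamp higher levels down to the cap
    if effort_map is None:
        return level  # identity mapping by default
    if level not in effort_map:
        logger.warning("dropping unmapped effort level %r", level)
        return None
    return effort_map[level]
```

With the example configuration above (`max_effort: high` and `minimal: low` in the map), an IR effort of `max` would be emitted as `high` and `minimal` as `low`.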
Provider Shim Summary

| Provider | disabled | effort_field | thinking_type | max_effort | unsigned_reasoning_blocks |
|---|---|---|---|---|---|
| openai | omit | reasoning_effort | — | high | as_is |
| openai_responses | omit | reasoning.effort | — | high | as_is |
| anthropic | thinking_disabled | output_config.effort | — | — | as_is |
| Google GenAI | thinking_budget_zero | none | — | — | as_is |
| deepseek | thinking_disabled | reasoning_effort | — | — | as_is |
| volcengine (Chat) | thinking_disabled | reasoning_effort | enabled | high | as_is |
| volcengine (Responses) | omit | reasoning.effort | — | — | as_is |
| minimax (Anthropic) | thinking_disabled | output_config.effort | — | — | as_is |
| minimax (Chat) | thinking_disabled | reasoning_effort | adaptive | — | as_is |
| openrouter | omit | reasoning_effort | — | xhigh | as_is |
| argo (Anthropic) | thinking_disabled | output_config.effort | enabled | — | preserve |
| argo (OpenAI Chat) | omit | reasoning_effort | — | — | as_is |
Probing Methodology
All behavior data on this page was obtained by sending real API requests to live provider endpoints and observing accept/reject responses. This approach catches undocumented constraints that provider documentation may not cover.
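The accept/reject loop behind this kind of probing can be sketched with an injected transport, so the logic can be exercised without network access. This is not the project's probing script: `probe_efforts`, `classify`, and the `send` callable (which takes a payload dict and returns an HTTP status code) are all hypothetical, and only top-level fields such as `reasoning_effort` are handled, not dotted paths like `output_config.effort`.

```python
from typing import Callable, Dict

EFFORT_VALUES = ["none", "minimal", "low", "medium", "high", "xhigh", "max"]


def classify(status: int) -> str:
    """Map an HTTP status code to a probe verdict."""
    if 200 <= status < 300:
        return "accepted"
    if status == 400:
        return "rejected"
    # Rate limits, auth failures, and server errors say nothing about validity.
    return "inconclusive"


def probe_efforts(
    send: Callable[[dict], int],
    base_payload: dict,
    field: str = "reasoning_effort",
) -> Dict[str, str]:
    """Send one request per effort value and record the verdict for each."""
    results = {}
    for effort in EFFORT_VALUES:
        payload = {**base_payload, field: effort}
        results[effort] = classify(send(payload))
    return results
```

Treating anything other than a 2xx or 400 as inconclusive avoids recording a transient failure as a provider constraint.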
Sources
- Argo parameter probing (#220): 24 non-embedding models probed across both OpenAI Chat and Anthropic endpoints. Covered sampling, thinking modes, effort values, tool schemas, and field constraints.
- Official API effort probing (#185 comment): Effort value acceptance tested against OpenAI, Anthropic (via OpenRouter), DeepSeek, Volcengine, and MiniMax.
- Unsigned reasoning block discovery (#268): A dev-test replay of a Claude CLI session revealed Argo Anthropic rejecting thinking blocks with empty signatures.
Discrepancies Found
Provider documentation does not always match actual API behavior. Notable examples from Argo probing:
| What docs say | What actually happens |
|---|---|
| o-series models reject temperature / top_p | Accepted and silently ignored |
| gpt5 / gpt5mini / gpt5nano accept temperature / top_p | Only accept temperature=1.0; any other value → 400 |
| opus47 supports thinking.type: "enabled" | Only accepts adaptive |
These discrepancies are why empirical probing is essential — and why shim configuration exists to encode the actual behavior rather than the documented behavior.