Provider Reasoning Behavior¶

The Reasoning / Thinking guide describes how LLM-Rosetta normalizes reasoning parameters through its IR layer. This page documents how each provider actually behaves — which effort values it accepts, how it handles disabled reasoning, what thinking_type semantics it enforces, and how reasoning metadata survives cross-provider round-trips.

All data below comes from empirical probing against live provider APIs, not documentation alone. See Probing methodology for details.

Shim configuration reference

Each provider's shim YAML encodes the behaviors documented here. See Provider Shims for the shim architecture and provider.yaml format.

Effort Value Acceptance¶

Which reasoning_effort / output_config.effort values each provider accepts. Values not listed are rejected with a 400 error unless otherwise noted.

Official Upstream APIs¶

Probed 2026-06-10 against official endpoints with real API keys (#185).

Provider	Endpoint	`none`	`minimal`	`low`	`medium`	`high`	`xhigh`	`max`
OpenAI	Chat (`reasoning_effort`)	—	—	✅	✅	✅	—	—
OpenAI	Responses (`reasoning.effort`)	—	—	✅	✅	✅	—	—
Anthropic	Messages (`output_config.effort`)	—	—	✅	✅	✅	✅	✅
MiniMax	Anthropic (`output_config.effort`)	✅	✅	✅	✅	✅	✅	✅
MiniMax	Chat (`reasoning_effort`)	✅	✅	✅	✅	✅	✅	✅
OpenRouter	Chat (`reasoning_effort`)	✅	✅	✅	✅	✅	✅	❌
Volcengine	Chat (`reasoning_effort`)	❌	✅	✅	✅	✅	❌	❌
DeepSeek	Chat (`reasoning_effort`)	❌	❌	✅	✅	✅	✅	✅

Note

Google GenAI does not use effort strings. It controls reasoning via thinking_config.thinking_budget (integer token count) and thinking_config.thinking_level ("minimal" / "low" / "medium" / "high").
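
The acceptance table above can be encoded as a small lookup for pre-flight checks. This is a minimal sketch, not LLM-Rosetta's implementation: the names `ACCEPTED` and `effort_status` are hypothetical, and "—" cells are kept as `None` because the table records no accept/reject result for them.

```python
# Effort acceptance transcribed from the table above.
# Y = accepted, N = rejected, U = "—" (no result recorded in the table).
Y, N, U = True, False, None

EFFORT_LEVELS = ["none", "minimal", "low", "medium", "high", "xhigh", "max"]

# Keys are (provider, endpoint); hypothetical identifiers for this sketch.
ACCEPTED = {
    ("openai", "chat"):       [U, U, Y, Y, Y, U, U],
    ("openai", "responses"):  [U, U, Y, Y, Y, U, U],
    ("anthropic", "messages"): [U, U, Y, Y, Y, Y, Y],
    ("minimax", "anthropic"): [Y, Y, Y, Y, Y, Y, Y],
    ("minimax", "chat"):      [Y, Y, Y, Y, Y, Y, Y],
    ("openrouter", "chat"):   [Y, Y, Y, Y, Y, Y, N],
    ("volcengine", "chat"):   [N, Y, Y, Y, Y, N, N],
    ("deepseek", "chat"):     [N, N, Y, Y, Y, Y, Y],
}


def effort_status(provider, endpoint, effort):
    """Return True (accepted), False (rejected), or None (not recorded)."""
    row = ACCEPTED[(provider, endpoint)]
    return row[EFFORT_LEVELS.index(effort)]
```

A caller might treat `False` as "do not send" and `None` as "unknown, probe before relying on it".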

Argo Gateway¶

Probed 2026-05-23 and 2026-06-09 (#220).

Argo proxies requests to upstream providers but applies its own validation. The Anthropic endpoint uses output_config.effort:

Model	`effort` alone	`adaptive` + `effort`	`effort=low`
haiku45 / sonnet45 / opus41 / opus45	✅	❌ 400	✅
sonnet46 / opus46	✅	✅	✅
opus47	✅	✅	✅

Only sonnet46, opus46, and opus47 support thinking + output_config coexistence.
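
As an illustration of the coexistence rule, the sketch below builds the reasoning-related request fields for Argo's Anthropic endpoint. The function name and the choice to silently drop `thinking` for older models are assumptions made for this example; the model names are the short forms used in the table, not necessarily the upstream model IDs.

```python
# Models that accepted `thinking` together with `output_config` when probed.
ARGO_COEXISTENCE_MODELS = {"sonnet46", "opus46", "opus47"}


def argo_anthropic_reasoning_fields(model, effort, want_adaptive=False):
    """Build reasoning fields that avoid the 400 seen on older models.

    `effort` alone was accepted by every probed model, so it is always sent.
    `thinking: adaptive` is only added where coexistence was accepted.
    """
    fields = {"output_config": {"effort": effort}}
    if want_adaptive and model in ARGO_COEXISTENCE_MODELS:
        fields["thinking"] = {"type": "adaptive"}
    return fields
```

Dropping the `thinking` object rather than raising is one possible design; a stricter caller could raise instead so the mismatch is visible.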

Disabled Behavior¶

How each provider disables reasoning. The shim disabled field controls which strategy LLM-Rosetta uses.

Provider	Strategy	Shim `disabled`	Behavior
OpenAI (Chat / Responses)	Omit thinking field	`omit`	No `thinking` / `reasoning` object sent
Anthropic	Explicit disable	`thinking_disabled`	`thinking: { type: "disabled" }`
Google GenAI	Zero budget	`thinking_budget_zero`	`thinking_config: { thinking_budget: 0 }`
DeepSeek	Explicit disable	`thinking_disabled`	`thinking: { type: "disabled" }`
Volcengine (Chat)	Explicit disable	`thinking_disabled`	`thinking: { type: "disabled" }`
Volcengine (Responses)	Omit thinking field	`omit`	No `thinking` / `reasoning` object sent
MiniMax (Anthropic)	Explicit disable	`thinking_disabled`	`thinking: { type: "disabled" }`
MiniMax (Chat)	Explicit disable	`thinking_disabled`	`thinking: { type: "disabled" }`
OpenRouter	Omit thinking field	`omit`	No `thinking` object sent
Argo (Anthropic)	Explicit disable	`thinking_disabled`	`thinking: { type: "disabled" }`
Argo (OpenAI Chat)	Omit thinking field	`omit`	No `thinking` object sent

Volcengine and DeepSeek reject none

Sending reasoning_effort: "none" to Volcengine or DeepSeek returns a 400 error. To disable reasoning on these providers, use thinking.type: "disabled" — which is what the shim's disabled: thinking_disabled strategy does automatically when the IR mode is disabled.
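
The three strategies in the table can be sketched as a single function over an outbound payload dict. This is an illustrative sketch only: `apply_disabled_strategy` is a hypothetical name, and the payload is treated as a plain dictionary rather than LLM-Rosetta's actual request type.

```python
def apply_disabled_strategy(strategy, payload):
    """Return a copy of `payload` with reasoning disabled per `strategy`."""
    out = dict(payload)
    if strategy == "omit":
        # Send no thinking / reasoning object at all.
        out.pop("thinking", None)
        out.pop("reasoning", None)
    elif strategy == "thinking_disabled":
        out["thinking"] = {"type": "disabled"}
    elif strategy == "thinking_budget_zero":
        out["thinking_config"] = {"thinking_budget": 0}
    else:
        raise ValueError(f"unknown disabled strategy: {strategy!r}")
    return out
```

For example, `apply_disabled_strategy("thinking_disabled", {"model": "m"})` yields a payload carrying `thinking: {"type": "disabled"}`, which is the form Volcengine (Chat) and DeepSeek expect instead of `reasoning_effort: "none"`.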

Thinking Type¶

Whether the provider uses thinking.type = "enabled" (requires budget_tokens) or "adaptive" (model decides). Some providers and models only accept one or the other.

Provider-Level Defaults¶

Provider	`thinking_type`	Notes
OpenAI	—	No `thinking.type` field; reasoning is implicit on reasoning models
Anthropic	(both)	Accepts `enabled` and `adaptive`; `enabled` requires `budget_tokens`
Google GenAI	—	Uses `thinking_budget` integer, not type strings
DeepSeek	—	Uses `thinking.type`, but the converter handles the `auto → adaptive` mapping
Volcengine	`enabled`	Rejects `adaptive`; shim forces `thinking_type: enabled`
MiniMax (Chat)	`adaptive`	Rejects `enabled`; shim forces `thinking_type: adaptive`
Argo (Anthropic)	`enabled`	Default; but see model-level overrides below

Argo Anthropic Model-Level Overrides¶

Probed 2026-05-23 (#220). Argo's Anthropic endpoint has model-specific thinking behavior:

Model	`enabled` (no budget)	`enabled` + `budget_tokens`	`adaptive`	Notes
haiku45 / sonnet45	❌ budget required	✅	❌	`enabled` + budget only
opus41 / opus45	❌ budget required	✅	❌	`enabled` + budget only
sonnet46 / opus46	❌ budget required	✅	✅	Both modes accepted
opus47	❌ type not accepted	❌ type not accepted	✅	`adaptive` only

opus47 shim override

The argo--anthropic shim declares thinking_type: enabled at the provider level, with a model_overrides entry for claudeopus47: thinking_type: adaptive. See PR #256.
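
One way to picture how a model-level override takes precedence over the provider-level value is a shallow merge. The sketch below is an assumption about the resolution order, not the shim loader's actual code; `resolve_reasoning_config` is a hypothetical helper, and the second model ID in the usage note is a placeholder.

```python
def resolve_reasoning_config(shim_reasoning, model_id):
    """Merge provider-level reasoning settings with a model's overrides."""
    base = {k: v for k, v in shim_reasoning.items() if k != "model_overrides"}
    overrides = shim_reasoning.get("model_overrides", {}).get(model_id, {})
    # Model-level values win over provider-level values.
    return {**base, **overrides}


# Settings mirroring the argo--anthropic description above.
ARGO_ANTHROPIC_REASONING = {
    "thinking_type": "enabled",
    "model_overrides": {"claudeopus47": {"thinking_type": "adaptive"}},
}
```

With this data, `resolve_reasoning_config(ARGO_ANTHROPIC_REASONING, "claudeopus47")` resolves `thinking_type` to `adaptive`, while any model without an override entry keeps `enabled`.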

Automatic Fallback¶

When a shim declares thinking_type: enabled but the request has no budget_tokens (required by Anthropic for type: "enabled"), the converter automatically falls back to type: "adaptive". This prevents invalid payloads without requiring the client to know about provider constraints.
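
The fallback rule can be sketched as follows. The function name `build_thinking` and its signature are hypothetical; only the rule itself (no `budget_tokens` with `enabled` means fall back to `adaptive`) comes from the description above.

```python
def build_thinking(thinking_type, budget_tokens=None):
    """Build the outbound `thinking` object, applying the fallback rule."""
    if thinking_type == "enabled":
        if budget_tokens is None:
            # `enabled` without a budget would be an invalid payload,
            # so fall back to letting the model decide.
            return {"type": "adaptive"}
        return {"type": "enabled", "budget_tokens": budget_tokens}
    return {"type": thinking_type}
```

So `build_thinking("enabled")` produces `{"type": "adaptive"}`, while `build_thinking("enabled", 2048)` keeps the explicit budget.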

Provider Metadata Round-Trip¶

Reasoning blocks carry provider-specific metadata that must survive cross-provider conversions. LLM-Rosetta preserves these in the IR provider_metadata field on content parts.

Metadata Field	Origin Provider	Carried On	Purpose
`signature`	Anthropic	`ReasoningPart`	Cryptographic signature for thinking block replay
`thoughtSignature`	Google GenAI	`provider_metadata.google`	Equivalent of Anthropic signature for Gemini 2.5+
`encrypted_content`	Anthropic	`provider_metadata.anthropic`	Encrypted reasoning content (opaque to converters)
`reasoning_details`	OpenAI	`provider_metadata.openai`	Extended reasoning metadata

Cross-Provider Signature Handling¶

When routing between providers (e.g. Anthropic client → Google backend), signatures from the source provider must be preserved so that:

The source client can replay conversation history with valid signatures
The target provider receives its own signature format (if applicable)

LLM-Rosetta achieves this by storing signatures in provider_metadata on each content part during IR conversion. The Anthropic converter serializes provider_metadata as _provider_metadata on thinking, tool_use, tool_result, and text blocks (PR #257, PR #263).
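
To make the round-trip concrete, here is a rough sketch of carrying `provider_metadata` through an Anthropic `thinking` block as `_provider_metadata`. The IR part is modelled as a plain dict with hypothetical keys (`text`, `signature`, `provider_metadata`); the real `ReasoningPart` type and converter functions may look different.

```python
import copy


def reasoning_part_to_anthropic(part):
    """Serialize a (hypothetical) IR reasoning part to a thinking block."""
    block = {"type": "thinking", "thinking": part["text"]}
    if "signature" in part:
        block["signature"] = part["signature"]
    if part.get("provider_metadata"):
        # Foreign-provider metadata rides along under a private key.
        block["_provider_metadata"] = copy.deepcopy(part["provider_metadata"])
    return block


def anthropic_to_reasoning_part(block):
    """Restore the IR part, including any carried provider metadata."""
    part = {"type": "reasoning", "text": block["thinking"]}
    if "signature" in block:
        part["signature"] = block["signature"]
    if "_provider_metadata" in block:
        part["provider_metadata"] = copy.deepcopy(block["_provider_metadata"])
    return part
```

In this sketch a Google `thoughtSignature` stored under `provider_metadata["google"]` comes back unchanged after a serialize/restore cycle, which is the property the section describes.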

Unsigned Reasoning Blocks¶

Some clients send conversation history containing thinking blocks without valid signatures. This happens when:

The client strips or never received signatures (e.g. Claude CLI with signature: "")
The conversation crossed providers and the original signature is not applicable

The Problem¶

Providers that validate signatures (like Argo's Anthropic endpoint) reject these blocks:

messages.61.content.0.thinking.signature: Field required
messages.9.content.0: Invalid `signature` in `thinking` block

Shim Policy: `unsigned_reasoning_blocks`¶

ReasoningCapability supports an unsigned_reasoning_blocks field with two values:

Value	Behavior
`as_is` (default)	Forward unsigned reasoning blocks as-is. The target provider decides whether to accept or reject them.
`preserve`	Remove unsigned reasoning blocks from outbound messages. The reasoning content is preserved in `provider_metadata.anthropic.unsigned_reasoning_blocks` on the IR part for downstream consumers.

Currently only argo--anthropic uses preserve. Direct Anthropic API calls use as_is because the official API handles unsigned blocks differently from Argo's proxy layer.

When preserve removes the unsigned reasoning blocks from an assistant message and nothing else remains (the message contained only such blocks), the message is skipped entirely, with a warning, rather than sent with an empty content: [] array.

See #268 and PR #269 for the implementation.
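
The two policies can be sketched over Anthropic-style message dicts. This is an approximation under stated assumptions: a block counts as unsigned when its `signature` is missing or empty (real signature validation is not attempted), the removed blocks are returned to the caller instead of being attached to an IR part, and `apply_unsigned_policy` is a hypothetical name.

```python
import logging

logger = logging.getLogger(__name__)


def is_unsigned(block):
    """Assumption: missing or empty `signature` means unsigned."""
    return block.get("type") == "thinking" and not block.get("signature")


def apply_unsigned_policy(messages, policy="as_is"):
    """Return (outbound_messages, removed_blocks) for the given policy."""
    if policy == "as_is":
        return list(messages), []
    if policy != "preserve":
        raise ValueError(f"unknown policy: {policy!r}")

    outbound, removed = [], []
    for message in messages:
        content = message.get("content")
        if message.get("role") != "assistant" or not isinstance(content, list):
            outbound.append(message)
            continue
        kept = [b for b in content if not is_unsigned(b)]
        removed.extend(b for b in content if is_unsigned(b))
        if not kept:
            # Avoid sending an assistant message with `content: []`.
            logger.warning("skipping assistant message left empty by filtering")
            continue
        outbound.append({**message, "content": kept})
    return outbound, removed
```

Returning the removed blocks keeps the reasoning text available to the caller, which stands in here for the `provider_metadata.anthropic.unsigned_reasoning_blocks` storage described above.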

Shim Configuration Reference¶

How ReasoningCapability fields in provider.yaml map to the behaviors documented above.

reasoning:
  disabled: thinking_disabled  # "omit" | "thinking_disabled" | "thinking_budget_zero"
  effort_field: output_config.effort  # Where effort value is placed
  thinking_type: enabled       # Force thinking.type: "enabled" | "adaptive"
  max_effort: high             # Cap IR effort at this level
  unsigned_reasoning_blocks: preserve  # "as_is" | "preserve"
  effort_map:                  # IR effort → provider effort string
    minimal: low
    low: low
    medium: medium
    high: high
    xhigh: xhigh
    max: max
  model_overrides:             # Per-model overrides (keyed by upstream model ID)
    claudeopus47:
      thinking_type: adaptive

Field Reference¶

Field	Type	Default	Description
`disabled`	string	`"omit"`	Strategy when IR mode is `disabled`. `omit`: don't send thinking field. `thinking_disabled`: send `type: "disabled"`. `thinking_budget_zero`: send `thinking_budget: 0`.
`effort_field`	string	`"reasoning_effort"`	Request field path for effort value. Set to `"none"` to suppress effort emission.
`thinking_type`	string	(none)	Force outbound `thinking.type`. Applied after converter's own mapping.
`max_effort`	string	(none)	Highest IR effort level to emit. Higher levels are clamped to this value.
`unsigned_reasoning_blocks`	string	`"as_is"`	Policy for outbound reasoning blocks without valid signatures.
`effort_map`	map	(identity)	Maps IR effort levels to provider-specific strings. Unmapped levels are dropped with a warning.
`model_overrides`	map	(none)	Per-model overrides for any of the above fields. Keyed by the upstream model ID used in the shim's `model_id_field`.
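
The interaction between `max_effort` and `effort_map` can be sketched as below. The order of operations (clamp the IR level first, then map it) is an assumption drawn from the field descriptions, and `resolve_effort` is a hypothetical helper rather than part of LLM-Rosetta's API. The IR level order follows the keys in the example `effort_map`.

```python
import logging

logger = logging.getLogger(__name__)

# IR effort levels from lowest to highest.
IR_EFFORT_ORDER = ["minimal", "low", "medium", "high", "xhigh", "max"]


def resolve_effort(ir_effort, max_effort=None, effort_map=None):
    """Clamp an IR effort level, then map it to a provider string.

    Returns None when the level has no entry in `effort_map`.
    """
    rank = IR_EFFORT_ORDER.index
    if max_effort is not None and rank(ir_effort) > rank(max_effort):
        ir_effort = max_effort  # clamp to the highest level the shim allows
    if effort_map is None:
        return ir_effort  # identity mapping by default
    if ir_effort not in effort_map:
        logger.warning("dropping unmapped effort level %r", ir_effort)
        return None
    return effort_map[ir_effort]
```

For instance, with `max_effort="high"` an IR level of `max` is emitted as `high`, and with `effort_map={"minimal": "low"}` an IR level of `minimal` is emitted as `low`.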

Provider Shim Summary¶

Provider	`disabled`	`effort_field`	`thinking_type`	`max_effort`	`unsigned_reasoning_blocks`
openai	`omit`	`reasoning_effort`	—	`high`	`as_is`
openai_responses	`omit`	`reasoning.effort`	—	`high`	`as_is`
anthropic	`thinking_disabled`	`output_config.effort`	—	—	`as_is`
google	`thinking_budget_zero`	`none`	—	—	`as_is`
deepseek	`thinking_disabled`	`reasoning_effort`	—	—	`as_is`
volcengine (Chat)	`thinking_disabled`	`reasoning_effort`	`enabled`	`high`	`as_is`
volcengine (Responses)	`omit`	`reasoning.effort`	—	—	`as_is`
minimax (Anthropic)	`thinking_disabled`	`output_config.effort`	—	—	`as_is`
minimax (Chat)	`thinking_disabled`	`reasoning_effort`	`adaptive`	—	`as_is`
openrouter	`omit`	`reasoning_effort`	—	`xhigh`	`as_is`
argo (Anthropic)	`thinking_disabled`	`output_config.effort`	`enabled`	—	`preserve`
argo (OpenAI Chat)	`omit`	`reasoning_effort`	—	—	`as_is`

Probing Methodology¶

All behavior data on this page was obtained by sending real API requests to live provider endpoints and observing accept/reject responses. This approach catches undocumented constraints that provider documentation may not cover.
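
The probing loop itself is simple. The sketch below shows its shape with an injectable `send` callable that returns an HTTP status code, so it runs without network access; `probe_values`, `send`, and the result labels are all hypothetical and do not describe the actual probing scripts.

```python
def probe_values(send, field, values):
    """Send one request per candidate value and classify the response.

    `send` takes a dict of request fields and returns an HTTP status code.
    """
    results = {}
    for value in values:
        status = send({field: value})
        if 200 <= status < 300:
            results[value] = "accepted"
        elif status == 400:
            results[value] = "rejected"
        else:
            # Auth errors, rate limits, and 5xx say nothing about the value.
            results[value] = f"inconclusive ({status})"
    return results
```

A real probe would wrap an HTTP client in `send` and keep every other request field constant, so that a 400 can be attributed to the value under test.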

Sources¶

Argo parameter probing — #220: 24 non-embedding models probed across both OpenAI Chat and Anthropic endpoints. Covered sampling, thinking modes, effort values, tool schemas, and field constraints.
Official API effort probing — #185 comment: Effort value acceptance tested against OpenAI, Anthropic (via OpenRouter), DeepSeek, Volcengine, and MiniMax.
Unsigned reasoning block discovery — #268: Dev-test replay of Claude CLI session revealed Argo Anthropic rejecting thinking blocks with empty signatures.

Discrepancies Found¶

Provider documentation does not always match actual API behavior. Notable examples from Argo probing:

What docs say	What actually happens
o-series models reject `temperature` / `top_p`	Accepted and silently ignored
gpt5 / gpt5mini / gpt5nano accept `temperature` / `top_p`	Only accept `temperature=1.0`; any other value → 400
opus47 supports `thinking.type: "enabled"`	Only accepts `adaptive`

These discrepancies are why empirical probing is essential — and why shim configuration exists to encode the actual behavior rather than the documented behavior.