跳转至

IR 类型 API

llm_rosetta.types.ir

LLM-Rosetta - IR (Intermediate Representation) Types

统一的IR类型导出入口 Unified IR types export entry point

这个模块重新组织了IR类型定义: - parts.py: 内容部分类型(ContentPart及其子类型) - messages.py: 消息类型(独立角色的TypedDict) - tools.py: 工具相关类型(工具定义、选择、配置) - generation.py: 生成控制配置类型(温度、top_p等生成参数) - request.py: 请求参数类型(基于SDK body structures) - response.py: 响应类型(扩展项和响应统计) - helpers.py: 辅助函数(内容提取、消息创建等)

This module reorganizes IR type definitions: - parts.py: Content part types (ContentPart and its subtypes) - messages.py: Message types (independent role TypedDicts) - tools.py: Tool-related types (tool definition, choice, configuration) - generation.py: Generation control configuration types (temperature, top_p, etc.) - request.py: Request parameter types (based on SDK body structures) - response.py: Response types (extension items and response statistics) - helpers.py: Helper functions (content extraction, message creation, etc.)

AssistantMessage

Bases: TypedDict

助手消息类型,用于AI助手的响应。 Assistant message type for AI assistant responses.

role字段预填写为"assistant",使用时不需要手动指定角色。 The role field is pre-filled as "assistant", no need to manually specify the role when using.

内容限制:文本、工具调用、推理、引用,未来支持音频、图像。 Content restrictions: Text, tool calls, reasoning, citations, future support for audio and images.

AudioData

Bases: TypedDict

Base64编码的音频数据 Base64 encoded audio data

AudioPart

Bases: TypedDict

音频内容(如OpenAI的audio响应)。 Audio content (e.g. OpenAI's audio response).

用于音频输出模态数据。 Used for audio output modality data.

BaseMessage

Bases: TypedDict

基础消息类型,所有角色消息的共同基础。 Base message type, common foundation for all role messages.

BatchMarker

Bases: TypedDict

批次标记,用于标记一组相关的操作。 Batch marker, used to mark a group of related operations.

Examples:

  • 并行工具调用的开始/结束 Start/end of parallel tool calls
  • 部分结果的进度跟踪 Progress tracking of partial results

CacheConfig

Bases: TypedDict

缓存配置 Cache configuration (OpenAI)

用于提示缓存(Prompt Caching)功能。

ChoiceInfo

Bases: TypedDict

选择结果信息(对应OpenAI的Choice)。 Choice result information (corresponds to OpenAI's Choice).

用于存储单个选择的结果,包含消息、停止原因、logprobs等。 Used to store the result of a single choice, including message, stop reason, logprobs, etc.

CitationPart

Bases: TypedDict

引用/注释内容(如OpenAI的annotations、Anthropic的citations)。 Citation/annotation content (e.g. OpenAI's annotations, Anthropic's citations).

用于标注信息来源,如网络搜索结果、文档引用等。 Used to mark information sources, such as web search results, document citations, etc.

ContentBlockEndEvent

Bases: TypedDict

Emitted when a content block ends.

Signals that no more deltas will arrive for the given block_index.

ContentBlockStartEvent

Bases: TypedDict

Emitted when a new content block begins.

Content blocks group related deltas (e.g., a text block, a thinking block, or a tool_use block). The block_type indicates what kind of deltas to expect.

FileData

Bases: TypedDict

Base64编码的文件数据 Base64 encoded file data

FilePart

Bases: TypedDict

文件内容,支持多种文件类型。 File content, supports multiple file types.

Examples:

  • PDF文档 PDF document
  • 音频文件 Audio file
  • 视频文件 Video file

FinishEvent

Bases: TypedDict

Finish event.

Emitted when the model finishes generating for a choice.

FinishReason

Bases: TypedDict

停止原因信息。 Stop reason information.

来自各SDK的finish_reason/stop_reason字段,说明模型停止生成的原因。 From each SDK's finish_reason/stop_reason field, explaining why the model stopped generating.

GenerationConfig

Bases: TypedDict

生成控制参数 Generation control parameters

统一了各provider的生成控制参数,映射关系: Unified generation control parameters across providers, mapping:

  • temperature: 所有provider都支持 All providers support
  • top_p: 所有provider都支持 All providers support
  • top_k: Anthropic, Google支持 Anthropic, Google support
  • max_tokens:
    • OpenAI Chat: max_completion_tokens
    • OpenAI Responses: max_output_tokens
    • Anthropic: max_tokens (必需 required)
    • Google: config.max_output_tokens
  • frequency_penalty: OpenAI, Google支持 OpenAI, Google support
  • presence_penalty: OpenAI, Google支持 OpenAI, Google support
  • seed: OpenAI, Google支持 OpenAI, Google support
  • logprobs: 各provider实现不同 Different implementations across providers

IRRequest

Bases: TypedDict

统一的IR请求类型 Unified IR request type

这个类型整合了所有provider的核心请求参数,提供统一的接口。

必需字段 Required fields: - model: 模型ID - messages: 消息列表

可选字段按功能分组: - 系统指令: system_instruction - 工具相关: tools, tool_choice, tool_config - 生成控制: generation, response_format - 流式输出: stream - 推理配置: reasoning - 缓存配置: cache

IRResponse

Bases: TypedDict

统一的IR响应类型。 Unified IR response type.

包含响应的所有信息:ID、时间戳、模型、选择列表、使用统计等。 Contains all information of the response: ID, timestamp, model, choices list, usage statistics, etc.

ImageData

Bases: TypedDict

Base64编码的图像数据 Base64 encoded image data

ImagePart

Bases: TypedDict

图像内容,支持URL或base64 Image content, supports URL or base64

LegacyMessage

Bases: TypedDict

传统的消息类型定义,为了向后兼容。 Legacy message type definition for backward compatibility.

推荐使用具体的角色消息类型(SystemMessage, UserMessage等)。 Recommend using specific role message types (SystemMessage, UserMessage, etc.).

MessageMetadata

Bases: TypedDict

消息的元数据,用于存储额外信息 Metadata of the message, used to store extra information

ReasoningConfig

Bases: TypedDict

Reasoning/thinking configuration.

Controls whether and how the model performs explicit reasoning.

Provider mappings for mode: - "auto": Model decides when/how much to think. Anthropic: thinking.type="adaptive", Google: thinking_budget=-1 - "enabled": Explicit thinking with budget control. Anthropic: thinking.type="enabled" + budget_tokens, OpenAI Responses: reasoning.type="enabled" - "disabled": No thinking. Anthropic: thinking.type="disabled", Google: thinking_budget=0, OpenAI Responses: reasoning.type="disabled"

Provider mappings for effort: - Anthropic: output_config.effort - OpenAI Chat: reasoning_effort - OpenAI Responses: reasoning.effort - Google: thinking_config.thinking_level

Provider mappings for budget_tokens: - Anthropic: thinking.budget_tokens - Google: thinking_config.thinking_budget

ReasoningDeltaEvent

Bases: TypedDict

Reasoning/thinking content delta event.

Emitted when a new reasoning/thinking text fragment is received from the model.

ReasoningPart

Bases: TypedDict

推理过程内容(如OpenAI的reasoning或Anthropic的thinking)。 Reasoning process content (e.g. OpenAI's reasoning or Anthropic's thinking).

用于存储模型的思考过程,通常不显示给用户。 Used to store the model's thought process, usually not shown to the user.

有些provider只返回signature而不返回完整的reasoning内容。 Some providers only return signature without full reasoning content.

RefusalPart

Bases: TypedDict

拒绝响应内容(如OpenAI的refusal)。 Refusal response content (e.g. OpenAI's refusal).

当模型拒绝回答用户请求时使用,常见于安全过滤。 Used when the model refuses to answer the user's request, common in safety filtering.

ResponseFormatConfig

Bases: TypedDict

响应格式配置 Response format configuration

用于控制响应内容的格式: - OpenAI: response_format - Google: response_mime_type + response_schema

SessionControl

Bases: TypedDict

会话控制指令,用于控制工具调用的执行。 Session control instructions, used to control the execution of tool calls.

Examples:

  • 取消工具调用 Cancel tool call
  • 修改工具调用参数 Modify tool call parameters
  • 暂停/恢复工具执行 Pause/resume tool execution

StreamConfig

Bases: TypedDict

流式输出配置 Streaming configuration

StreamEndEvent

Bases: TypedDict

Emitted at the end of a stream.

Signals that no more events will follow. Converters emit this after processing the final provider chunk.

StreamStartEvent

Bases: TypedDict

Emitted at the beginning of a stream. Carries session-level metadata.

This event is synthesized by the converter when the first provider chunk arrives, providing a unified place for response-level information such as the response ID and model name.

StreamingMetadata

Bases: TypedDict

流式传输的元数据 Metadata for streaming transmission

SystemEvent

Bases: TypedDict

系统级事件,用于记录会话状态变化。 System-level events, used to record session state changes.

Examples:

  • 会话开始/结束 Session start/end
  • 会话暂停/恢复 Session pause/resume
  • 超时警告 Timeout warning
  • 错误事件 Error event

SystemMessage

Bases: TypedDict

系统消息类型,用于系统指令。 System message type for system instructions.

role字段预填写为"system",使用时不需要手动指定角色。 The role field is pre-filled as "system", no need to manually specify the role when using.

内容限制:只允许文本内容。 Content restrictions: Only text content is allowed.

TextDeltaEvent

Bases: TypedDict

Text content delta event.

Emitted when a new text fragment is received from the model.

TextPart

Bases: TypedDict

纯文本内容 Plain text content

ToolCallConfig

Bases: TypedDict

工具调用配置(少见参数) Tool call configuration (less common parameters)

这些参数不是所有provider都支持,放在这里但主要通过provider_extensions使用: - disable_parallel: 禁用并行工具调用 - max_calls: 最大工具调用数

ToolCallDeltaEvent

Bases: TypedDict

Tool call arguments delta event.

Emitted when a new fragment of tool call arguments JSON string is received.

ToolCallPart

Bases: TypedDict

工具调用内容。 Tool call content.

使用两层类型系统: - type: 固定为 "tool_call" - tool_type: 区分不同的工具类型(function, mcp, web_search等) Uses a two-layer type system: - type: fixed as "tool_call" - tool_type: distinguishes different tool types (function, mcp, web_search, etc.)

这样设计避免了类型爆炸,同时保持扩展性。 This design avoids type explosion while maintaining extensibility.

provider_metadata字段用于存储provider特定的元数据,例如: - Google的thought_signature(Gemini 3必需,Gemini 2.5推荐) - 其他provider的特殊字段 The provider_metadata field is used to store provider-specific metadata, e.g.: - Google's thought_signature (required for Gemini 3, recommended for Gemini 2.5) - Other provider's special fields

ToolCallStartEvent

Bases: TypedDict

Tool call start event.

Emitted when a new tool call begins, providing the tool call ID and name.

ToolChainNode

Bases: TypedDict

工具链节点,用于表示工具调用的依赖关系。 Tool chain node, used to represent dependencies between tool calls.

支持DAG结构,一个工具的输出可以作为另一个工具的输入。 Supports DAG structure, the output of one tool can be used as the input of another.

Examples:

  • 搜索 → 总结 Search → Summarize
  • 数据获取 → 分析 → 可视化 Data acquisition → Analysis → Visualization

ToolChoice

Bases: TypedDict

工具选择配置 Tool choice configuration

统一了各provider的工具选择策略: - none: 不使用工具 - auto: 自动决定是否使用工具 - any: 必须使用某个工具(Anthropic的"any") - tool: 使用指定的工具(需要tool_name)

ToolDefinition

Bases: TypedDict

工具定义 Tool definition

统一了各provider的工具定义格式: - OpenAI Chat: {"type": "function", "function": {...}} - OpenAI Responses: {"type": "function", "name": "...", ...} - Anthropic: {"name": "...", "description": "...", "input_schema": {...}} - Google: {"function_declarations": [{"name": "...", ...}]}

Source converter contract: Provider tool types that fall outside the type Literal below (e.g. OpenAI Responses' "custom" hosted tools, or unnamed hosted tools like "web_search") MUST be coerced to "function" at the provider→IR boundary so that runtime IR validation (validate_ir_request) accepts the result. Provider-specific information may be retained in metadata (e.g. {"provider_type": "custom"}) or in the _passthrough extension for round-tripping. Target converters' downgrade fallbacks (e.g. openai_chat/tool_ops.py) are defensive only — IR is guaranteed valid by validation.

ToolMessage

Bases: TypedDict

工具消息类型,用于工具调用的结果。 Tool message type for tool call results.

这是新增的独立角色,替代了之前将工具结果放在user消息中的做法。 This is a new independent role, replacing the previous practice of putting tool results in user messages.

role字段预填写为"tool",使用时不需要手动指定角色。 The role field is pre-filled as "tool", no need to manually specify the role when using.

内容限制:只允许工具结果内容。 Content restrictions: Only tool result content is allowed.

ToolResultPart

Bases: TypedDict

工具调用的结果。 Tool call result.

对应一个ToolCallPart,通过tool_call_id关联。 Corresponds to a ToolCallPart, linked by tool_call_id.

UsageEvent

Bases: TypedDict

Usage statistics event.

Emitted when token usage statistics are available (typically at the end of stream).

UsageInfo

Bases: TypedDict

Token使用统计信息。 Token usage statistics.

来自各SDK的usage/usage_metadata字段,用于计费和监控。 From each SDK's usage/usage_metadata field, used for billing and monitoring.

UserMessage

Bases: TypedDict

用户消息类型,用于用户输入。 User message type for user input.

role字段预填写为"user",使用时不需要手动指定角色。 The role field is pre-filled as "user", no need to manually specify the role when using.

内容限制:文本、图像,未来支持文件、音频。 Content restrictions: Text, images, future support for files and audio.