docs: add API reference and technical analysis documents

2026-04-19 01:43:02 +08:00
parent 2b1c5e96c3
commit b92974716f
14 changed files with 32227 additions and 0 deletions
--- a/docs/analysis_reference/analysis_litellm.md
+++ b/docs/analysis_reference/analysis_litellm.md
@@ -0,0 +1,575 @@
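+# LiteLLM LLM API Proxy Translation Layer: In-Depth Analysis Report
+
+## 1. Project Overview
+
+LiteLLM is a unified LLM API gateway and SDK. Its core capability is managing 100+ model APIs from different sources behind a single interface that uses the **OpenAI format** as the common standard.
+
+### Overall Request Flow
+
+```
+OpenAI SDK (client)     ──▶  LiteLLM AI Gateway (proxy/)  ──▶  LiteLLM SDK (litellm/)  ──▶  LLM API
+Anthropic SDK (client)  ──▶  LiteLLM AI Gateway (proxy/)  ──▶  LiteLLM SDK (litellm/)  ──▶  LLM API
+Any HTTP client         ──▶  LiteLLM AI Gateway (proxy/)  ──▶  LiteLLM SDK (litellm/)  ──▶  LLM API
+```
+
+- **AI Gateway (proxy layer)**: adds authentication, rate limiting, budget management, and routing on top of the SDK
+- **SDK (core layer)**: handles the actual LLM provider calls, request/response transformation, and streaming
+
+### Key Differences from Similar Projects
+
+Among comparable projects, LiteLLM has the **broadest provider coverage**. It uses the **OpenAI format as its only internal canonical format**, so every transformation is an OpenAI ↔ provider format conversion. It also supports an **Anthropic Messages API pass-through mode** (the `/v1/messages` endpoint), which lets it act as a proxy gateway for the Anthropic protocol as well. The SDK can be used on its own as a Python package or deployed as a service together with the Gateway.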
+
+---
+
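+## 2. Translation Layer Architecture
+
+### 2.1 Architectural Design Patterns
+
+The design combines the **Strategy, Factory, and Template Method** patterns:
+
+- **Strategy**: each LLM provider has its own `Config` class that inherits from the shared `BaseConfig` abstract base class and implements its own transformation methods
+- **Factory**: `ProviderConfigManager` acts as the central factory, resolving the correct Config instance from the model name and provider type
+- **Template Method**: `BaseConfig` defines the uniform transformation interface (abstract methods), and subclasses implement the provider-specific differences
+
+### 2.2 Core Abstract Base Class
+
+**File**: `litellm/llms/base_llm/chat/transformation.py` (466 lines)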
+
+```python
+class BaseConfig(ABC):
+    @abstractmethod
+    def get_supported_openai_params(self, model: str) -> list:
+        """Return the list of OpenAI-format parameters this provider supports."""
+
+    @abstractmethod
+    def map_openai_params(self, non_default_params, optional_params, model, drop_params) -> dict:
+        """Map OpenAI-format parameters to provider-specific parameters."""
+
+    @abstractmethod
+    def validate_environment(self, headers, model, messages, optional_params, ...) -> dict:
+        """Validate the environment and build the request headers."""
+
+    @abstractmethod
+    def transform_request(self, model, messages, optional_params, litellm_params, headers) -> dict:
+        """Transform an OpenAI-format request into the provider format."""
+
+    @abstractmethod
+    def transform_response(self, model, raw_response, model_response, ...) -> ModelResponse:
+        """Transform the provider response into the unified OpenAI ModelResponse format."""
+
+    @abstractmethod
+    def get_error_class(self, error_message, status_code, headers) -> BaseLLMException:
+        """Return the provider-specific error class."""
+
+    # Important methods that ship with a default implementation:
+    def get_complete_url(self, api_base, api_key, model, ...) -> str: ...    # URL construction
+    def sign_request(self, headers, optional_params, request_data, ...) -> ...: ...  # request signing
+    def get_model_response_iterator(self, ...) -> BaseModelResponseIterator: ...  # streaming iterator
+    def should_fake_stream(self, ...) -> bool: ...     # whether streaming must be simulated
+    def is_thinking_enabled(self, ...) -> bool: ...    # thinking-mode detection
+    def calculate_additional_costs(self, ...) -> float: ...  # custom cost calculation
+    def post_stream_processing(self, ...) -> ...: ...  # post-stream processing hook
+```
+
+The **6 abstract methods** must be implemented; the remaining methods have default implementations.
+
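+### 2.3 Central HTTP Orchestrator
+
+**File**: `litellm/llms/custom_httpx/llm_http_handler.py` (12,161 lines)
+
+`BaseLLMHTTPHandler` is the HTTP orchestrator shared by all providers. It supports **multiple modalities**, not only chat:
+
+```
+completion() orchestration flow:
+1. provider_config.validate_environment()  → validate the environment, build request headers
+2. provider_config.get_complete_url()       → build the API URL
+3. provider_config.transform_request()      → transform the request body
+4. provider_config.sign_request()           → sign the request (e.g. AWS Bedrock)
+5. send the HTTP request to the provider API
+6. provider_config.transform_response()     → transform the response into a ModelResponse
+```
+
+**Additional supported modalities**: embedding, rerank, audio_transcription, OCR, vector_store, files, batches, image_generation, video_generation, responses API, and others.
+
+**Key design point**: the orchestrator itself does not need to change; all provider differences are encapsulated in each provider's Config class. The orchestrator also supports an **HTTP error retry** mode: `should_retry_llm_api_inside_llm_translation_on_http_error()` lets it repair the request and retry automatically on specific errors.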
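
The orchestration order described in this section can be illustrated with a small, self-contained sketch. It is not LiteLLM code: `ToyConfig`, `EchoConfig`, `toy_completion`, and `fake_send` are hypothetical names, the method signatures are trimmed down, the signing step is omitted, and the HTTP call is replaced by an in-memory function so the example runs offline.

```python
from abc import ABC, abstractmethod


class ToyConfig(ABC):
    """Hypothetical, trimmed-down stand-in for a provider Config class."""

    @abstractmethod
    def validate_environment(self, headers: dict, api_key: str) -> dict: ...

    @abstractmethod
    def get_complete_url(self, api_base: str, model: str) -> str: ...

    @abstractmethod
    def transform_request(self, model: str, messages: list, optional_params: dict) -> dict: ...

    @abstractmethod
    def transform_response(self, raw: dict) -> dict: ...


class EchoConfig(ToyConfig):
    """Made-up provider that takes a flat prompt and returns {'output': ...}."""

    def validate_environment(self, headers, api_key):
        return {**headers, "Authorization": f"Bearer {api_key}"}

    def get_complete_url(self, api_base, model):
        return f"{api_base.rstrip('/')}/v1/echo/{model}"

    def transform_request(self, model, messages, optional_params):
        prompt = "\n".join(m["content"] for m in messages)
        return {"prompt": prompt, **optional_params}

    def transform_response(self, raw):
        message = {"role": "assistant", "content": raw["output"]}
        return {"choices": [{"message": message, "finish_reason": "stop"}]}


def toy_completion(config, send, *, model, messages, api_base, api_key, optional_params=None):
    """Provider-agnostic flow: headers, URL, request body, send, response."""
    headers = config.validate_environment({}, api_key)
    url = config.get_complete_url(api_base, model)
    body = config.transform_request(model, messages, optional_params or {})
    raw = send(url, headers, body)  # stands in for the real HTTP call
    return config.transform_response(raw)


calls = []


def fake_send(url, headers, body):
    """In-memory transport that records the call and upper-cases the prompt."""
    calls.append((url, headers, body))
    return {"output": body["prompt"].upper()}


result = toy_completion(
    EchoConfig(),
    fake_send,
    model="m1",
    messages=[{"role": "user", "content": "hi"}],
    api_base="https://example.invalid/",
    api_key="k",
)
print(result["choices"][0]["message"]["content"])
```

Supporting another provider in this sketch means writing another `ToyConfig` subclass; `toy_completion` stays unchanged, which is the property the section attributes to the real orchestrator.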
+
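+### 2.4 Request Entry Point and Provider Resolution
+
+**Entry file**: the `completion()` function in `litellm/main.py` (the file is about 7,807 lines; `completion()` starts at line 1052)
+
+**Provider resolution file**: the `get_llm_provider()` function in `litellm/litellm_core_utils/get_llm_provider_logic.py` (958 lines)
+
+Resolution priority:
+
+1. **LiteLLM Proxy default**: check whether litellm_proxy should be used
+2. **litellm_params**: taken from the router parameters
+3. **Azure special case**: non-OpenAI models on Azure
+4. **Cohere/Anthropic text model redirection**
+5. **OpenRouter prefix stripping**
+6. **JSON-configured providers**: OpenAI-compatible providers registered dynamically through `JSONProviderRegistry`
+7. **Prefix provider**: `model.split("/")[0]` is matched against the provider list
+8. **API base matching**: matching against ~40 known API endpoint domains
+9. **Known model name lookup**: `"gpt-4"` → `"openai"`, `"claude-3"` → `"anthropic"`
+10. **String prefix fallback**: `model.startswith("bytez/")` → `"bytez"`
+11. **Error**: raise `BadRequestError` if resolution fails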
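
A few of the steps above (prefix match, API base match, known-model lookup, error) can be sketched as a simple priority chain. This is only an illustration: `resolve_provider` is a hypothetical helper, the three lookup tables are tiny made-up subsets, and a plain `ValueError` stands in for `BadRequestError`.

```python
# Illustrative subsets only; the real lists are far larger.
KNOWN_PROVIDERS = {"openai", "anthropic", "bedrock", "ollama"}
API_BASE_HINTS = {"api.groq.com": "groq", "api.deepseek.com": "deepseek"}
KNOWN_MODELS = {"gpt-4": "openai", "claude-3": "anthropic"}


def resolve_provider(model, api_base=None):
    """Return (model_without_prefix, provider) using a fixed priority order."""
    # Prefix provider: "anthropic/claude-3" -> provider "anthropic"
    prefix, sep, rest = model.partition("/")
    if sep and prefix in KNOWN_PROVIDERS:
        return rest, prefix

    # API base matching against known endpoint domains
    if api_base:
        for host, provider in API_BASE_HINTS.items():
            if host in api_base:
                return model, provider

    # Known model name lookup
    if model in KNOWN_MODELS:
        return model, KNOWN_MODELS[model]

    # Nothing matched: fail loudly instead of guessing
    raise ValueError(f"cannot resolve a provider for model {model!r}")


print(resolve_provider("anthropic/claude-3"))
print(resolve_provider("my-model", api_base="https://api.groq.com/openai/v1"))
print(resolve_provider("gpt-4"))
```

Ordering the checks from most explicit (a prefix the caller typed) to least explicit (a name lookup) keeps the caller's stated intent from being overridden by a heuristic.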
+
+**Config resolution**: `ProviderConfigManager` in `litellm/utils.py` (line 8013) performs O(1) lookups through the lazily loaded `_PROVIDER_CONFIG_MAP` dictionary, which holds about 90 provider mappings.
+
+**Dispatch**: `main.py` still keeps a large `if/elif` chain for provider dispatch. Azure and OpenAI still use dedicated handling paths, while about 30 providers go through `base_llm_http_handler.completion()`.
+
+### 2.5 Multi-Modality Factories
+
+`ProviderConfigManager` exposes several factory methods:
+
+| Factory method | Purpose | Number of providers |
+|---------|------|-----------|
+| `get_provider_chat_config()` | Chat Completions | ~90 |
+| `get_provider_embedding_config()` | Embeddings | ~25 |
+| `get_provider_rerank_config()` | Rerank | ~13 |
+| `get_provider_anthropic_messages_config()` | Anthropic Messages pass-through | 4 |
+
+---
+
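+## 3. Supported Protocols and Providers
+
+### 3.1 Scale of Support
+
+| Dimension | Count |
+|------|------|
+| LLM provider enum values (`LlmProviders`) | 133 |
+| Search provider enum values (`SearchProviders`) | 12 |
+| Implementation directories (under `litellm/llms/`) | ~118 |
+| `base_llm/` per-modality subdirectories | 29 (chat, embedding, rerank, responses, audio, OCR, image, video, etc.) |
+
+### 3.2 Main Provider Categories
+
+| Category | Providers | Transformation approach |
+|------|--------|----------|
+| **OpenAI and compatible** | OpenAI, Azure OpenAI, DeepSeek, Groq, Together AI, Fireworks, Perplexity, OpenRouter, etc. | Near pass-through, parameter filtering |
+| **Anthropic family** | Anthropic native, Bedrock Claude, Vertex AI Claude | Deep transformation (message format, tool calls, thinking mode) |
+| **Google family** | Gemini (AI Studio), Vertex AI Gemini | Deep transformation (contents/parts format, role mapping) |
+| **AWS family** | Bedrock Converse, Bedrock Invoke | Unified Converse format, or per-provider dispatch |
+| **Open source / local** | Ollama, vLLM, LM Studio, llamafile | Parameter remapping plus message format adaptation |
+| **Other cloud vendors** | Cohere, Mistral, Databricks, WatsonX, OCI, Snowflake, etc. | Independent transformation per provider |
+| **Pass-through mode** | All providers | Request body unchanged; only the URL and auth headers are rewritten |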
+
+---
+
+## 4. Core Transformation Files in Detail
+
+### 4.1 Transformation File Overview
+
+| Inbound API protocol | Target provider | Transformation file |
+|---------------|-----------|----------|
+| `/v1/chat/completions` (OpenAI) | Anthropic | `litellm/llms/anthropic/chat/transformation.py` |
+| `/v1/chat/completions` (OpenAI) | Bedrock Converse | `litellm/llms/bedrock/chat/converse_transformation.py` |
+| `/v1/chat/completions` (OpenAI) | Bedrock Invoke | `litellm/llms/bedrock/chat/invoke_transformations/*.py` |
+| `/v1/chat/completions` (OpenAI) | Gemini | `litellm/llms/gemini/chat/transformation.py` (thin layer) → `litellm/llms/vertex_ai/gemini/transformation.py` (core) |
+| `/v1/chat/completions` (OpenAI) | Vertex AI | `litellm/llms/vertex_ai/vertex_ai_partner_models/main.py` |
+| `/v1/chat/completions` (OpenAI) | OpenAI | `litellm/llms/openai/chat/gpt_transformation.py` |
+| `/v1/chat/completions` (OpenAI) | Ollama | `litellm/llms/ollama/chat/transformation.py` |
+| `/v1/chat/completions` (OpenAI) | Cohere | `litellm/llms/cohere/chat/transformation.py` |
+| `/v1/chat/completions` (OpenAI) | Mistral | `litellm/llms/mistral/chat/transformation.py` |
+| `/v1/messages` (Anthropic) | Anthropic/Bedrock/Vertex | `litellm/llms/base_llm/anthropic_messages/transformation.py` |
+| Pass-through endpoints | All | `litellm/proxy/pass_through_endpoints/` |
+
+---
+
+### 4.2 Anthropic Translation Layer
+
+**File**: `litellm/llms/anthropic/chat/transformation.py` (2,096 lines)
+**Class**: `AnthropicConfig(AnthropicModelInfo, BaseConfig)`
+
+#### Request Transformation (OpenAI → Anthropic)
+
+Core logic of the `transform_request()` method:
+
+1. **System message extraction**: messages with `role: "system"` are removed from the messages list and placed in Anthropic's top-level `system` field (the Anthropic API does not use system-role messages)
+2. **Message format conversion**: `anthropic_messages_pt()` converts OpenAI messages to the Anthropic Messages format
+3. **Tool call conversion**: OpenAI `function.parameters` → Anthropic `input_schema`; tool choice `"required"` → `"any"`; `parallel_tool_calls: false` → `disable_parallel_tool_use: true`
+4. **Beta header management**: the `anthropic-beta` request header is added automatically based on the features in use (web search, memory, structured output, computer use, code execution, MCP server, etc.)
+5. **Thinking parameter handling**: manages the Anthropic Extended Thinking parameters
+6. **JSON mode**: older models implement JSON mode through a synthetic tool; newer models (Sonnet 4.5+) support structured output natively
+
+**Key parameter mappings**:
+
+| OpenAI parameter | Anthropic parameter | Notes |
+|-------------|---------------|------|
+| `max_tokens` / `max_completion_tokens` | `max_tokens` | Direct mapping; the default can be set through an environment variable |
+| `tools[].function.parameters` | `tools[].input_schema` | JSON Schema format conversion |
+| `tool_choice: "required"` | `tool_choice: {"type": "any"}` | Semantic mapping |
+| `parallel_tool_calls: false` | `disable_parallel_tool_use: true` | Inverted boolean |
+| `stop` | `stop_sequences` | String → list |
+| `response_format` (JSON) | Synthetic tool / `output_format` | Chosen by model capability |
+| `reasoning_effort` | `thinking` + `output_config` | Effort level mapped to a thinking budget |
+| `web_search_options` | `tools` (web_search hosted tool) | Converted to Anthropic's native search tool |
+| `user` | `metadata.user_id` | Nested under metadata |
+
+#### Response Transformation (Anthropic → OpenAI)
+
+1. **Content block parsing**: text → text, tool_use → ToolCall, thinking/redacted_thinking → `reasoning_content` + `thinking_blocks`
+2. **JSON mode reverse conversion**: when JSON mode is enabled, the tool call arguments are extracted and returned as text content
+3. **Stop reason mapping**: `end_turn`/`stop_sequence` → `stop`, `max_tokens` → `length`, `tool_use` → `tool_calls`
+4. **Usage calculation**: input/output tokens, cache tokens, reasoning tokens, etc.
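
A minimal sketch of several mappings from the tables above, written against plain dictionaries. It is not the `AnthropicConfig` implementation: `to_anthropic_request` and `map_stop_reason` are hypothetical helpers, the `max_tokens` default of 1024 is an arbitrary placeholder, and `disable_parallel_tool_use` is assumed to sit inside `tool_choice`. Content blocks, thinking, JSON mode, and beta headers are left out.

```python
def to_anthropic_request(openai_request: dict) -> dict:
    """Map a small subset of an OpenAI chat request to Anthropic Messages fields."""
    system_parts, messages = [], []
    for m in openai_request["messages"]:
        if m["role"] == "system":
            system_parts.append(m["content"])  # lifted to the top-level field
        else:
            messages.append({"role": m["role"], "content": m["content"]})

    # 1024 is a placeholder default chosen for this sketch only.
    out = {"messages": messages, "max_tokens": openai_request.get("max_tokens", 1024)}
    if system_parts:
        out["system"] = "\n".join(system_parts)

    stop = openai_request.get("stop")
    if stop is not None:
        out["stop_sequences"] = [stop] if isinstance(stop, str) else list(stop)

    tools = openai_request.get("tools")
    if tools:
        out["tools"] = [
            {
                "name": t["function"]["name"],
                "description": t["function"].get("description", ""),
                "input_schema": t["function"]["parameters"],
            }
            for t in tools
        ]

    if openai_request.get("tool_choice") == "required":
        out["tool_choice"] = {"type": "any"}
    if openai_request.get("parallel_tool_calls") is False:
        # Assumption: the flag is carried inside tool_choice.
        out.setdefault("tool_choice", {"type": "auto"})["disable_parallel_tool_use"] = True

    if "user" in openai_request:
        out["metadata"] = {"user_id": openai_request["user"]}
    return out


STOP_REASON_MAP = {
    "end_turn": "stop",
    "stop_sequence": "stop",
    "max_tokens": "length",
    "tool_use": "tool_calls",
}


def map_stop_reason(anthropic_reason: str) -> str:
    """Translate an Anthropic stop reason into an OpenAI finish_reason."""
    return STOP_REASON_MAP.get(anthropic_reason, "stop")


example = to_anthropic_request({
    "messages": [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Hi"},
    ],
    "stop": "END",
    "tool_choice": "required",
    "parallel_tool_calls": False,
    "user": "u-1",
    "tools": [{
        "type": "function",
        "function": {"name": "get_time", "parameters": {"type": "object", "properties": {}}},
    }],
})
print(example)
```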
+
+---
+
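+### 4.3 OpenAI Translation Layer (Baseline Implementation)
+
+**File**: `litellm/llms/openai/chat/gpt_transformation.py` (820 lines)
+**Class**: `OpenAIGPTConfig(BaseLLMModelInfo, BaseConfig)`, marked with `_is_base_class = True`
+
+OpenAI is the provider of the "native" format, so its translation layer is the simplest:
+
+- **`transform_request()`**: only light normalization (image_url formatting, removal of `cache_control`, PDF URL to base64)
+- **`transform_response()`**: parses the JSON and builds a `ModelResponse` directly
+- **`map_openai_params()`**: simple parameter filtering (keeps only the supported parameters)
+
+**Reused as a base class by**:
+- `OpenAIOSeriesConfig`: O1/O3 reasoning models (`system` role converted to `user`, temperature restricted)
+- `OpenAIGPTAudioConfig`: GPT-4o audio models
+- `OpenAIGPT5Config`: the GPT-5 series (reasoning effort levels normalized)
+- Many OpenAI-compatible providers (DeepSeek, Groq, Together AI, etc.) also reuse this class
+- **Mistral** also inherits from `OpenAIGPTConfig` (the API is mostly compatible, with extra handling for schema cleanup and thinking)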
+
+---
+
+### 4.4 Google Gemini Translation Layer
+
+**Files**:
+- `litellm/llms/gemini/chat/transformation.py` (154 lines): thin layer that inherits from Vertex Gemini
+- `litellm/llms/vertex_ai/gemini/transformation.py` (941 lines): core transformation logic
+
+**Class inheritance chain**:
+```
+BaseConfig → VertexAIBaseConfig → VertexGeminiConfig → GoogleAIStudioGeminiConfig
+```
+
+#### Request Transformation (OpenAI → Gemini)
+
+This is one of the most complex transformations:
+
+| Aspect | OpenAI format | Gemini format |
+|------|-----------|------------|
+| Message container | `messages: [{role, content}]` | `contents: [{role, parts}]` |
+| Role types | `system`, `user`, `assistant`, `tool` | Only `user` and `model` |
+| Content format | String or list of content objects | List of `PartType` objects |
+| Tool results | Separate `role: "tool"` messages | Merged into the parts of a `role: "user"` entry |
+| Consecutive messages | Any order allowed | Must strictly alternate user/model |
+
+Core steps of the transformation algorithm:
+1. **System message extraction**: placed in the `system_instruction` field
+2. **Role mapping**: `user`/`system` → `user`, `assistant` → `model`, `tool`/`function` → `user`
+3. **Role alternation enforcement**: consecutive messages with the same role are merged
+4. **Media handling**: image_url → `inline_data` (base64) or `file_data` (GCS URI)
+5. **Tool call conversion**: OpenAI `tool_calls` → Gemini `functionCall` parts
+6. **Search tool conflict handling**: when search tools and function declarations are mutually exclusive, function declarations are kept
+
+**Key parameter mappings**:
+
+| OpenAI parameter | Gemini parameter | Notes |
+|-------------|-----------|------|
+| `max_tokens` | `max_output_tokens` | Renamed |
+| `stop` | `stop_sequences` | String → list |
+| `n` | `candidate_count` | Renamed |
+| `response_format` | `response_mime_type` + `response_schema` | Complex schema conversion |
+| `reasoning_effort` | `thinkingConfig` | Effort level → thinking budget |
+| `tool_choice: "none"` | `NONE` | Upper-case enum |
+| `tool_choice: "required"` | `ANY` | Semantic mapping |
+| `modalities: ["image"]` | `responseModalities: ["IMAGE"]` | Upper-case enum |
+| `web_search_options` | `tools` (googleSearch) | Converted to the Gemini search tool |
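
The first three steps of the algorithm (system extraction, role mapping, role alternation) can be sketched on text-only messages. `to_gemini_contents` is a hypothetical helper, not the `VertexGeminiConfig` code, and it makes one loud simplification: tool results are treated as plain text parts, whereas the real conversion also handles media and function-call parts.

```python
ROLE_MAP = {"user": "user", "assistant": "model", "tool": "user", "function": "user"}


def to_gemini_contents(messages):
    """Return (system_instruction, contents) for text-only OpenAI messages."""
    system_texts, contents = [], []
    for m in messages:
        if m["role"] == "system":
            system_texts.append(m["content"])  # goes to system_instruction
            continue
        role = ROLE_MAP[m["role"]]
        part = {"text": m["content"]}
        if contents and contents[-1]["role"] == role:
            # Same role twice in a row: merge so roles strictly alternate.
            contents[-1]["parts"].append(part)
        else:
            contents.append({"role": role, "parts": [part]})

    system_instruction = None
    if system_texts:
        system_instruction = {"parts": [{"text": t} for t in system_texts]}
    return system_instruction, contents


system_instruction, contents = to_gemini_contents([
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "What is 6 x 7?"},
    {"role": "assistant", "content": "Let me check."},
    {"role": "tool", "content": "42"},
    {"role": "user", "content": "Thanks"},
])
print([c["role"] for c in contents])
```

In the example the `tool` message and the following `user` message both map to `user`, so they end up as two parts of a single entry.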
+
+---
+
+### 4.5 AWS Bedrock Translation Layer
+
+Bedrock has the most complex architecture of any translation layer, with **four independent translation paths**:
+
+#### Path 1: Converse API (recommended)
+
+**File**: `litellm/llms/bedrock/chat/converse_transformation.py` (2,129 lines)
+**Class**: `AmazonConverseConfig(BaseConfig)`
+
+Uses Bedrock's unified Converse API, in which all providers share the same request/response structure:
+
+```
+Request format: {
+  "messages": [...],           # Bedrock MessageBlock format
+  "system": [...],             # SystemContentBlock format
+  "inferenceConfig": {...},    # maxTokens, temperature, topP, topK, etc.
+  "additionalModelRequestFields": {...},  # provider-specific fields
+  "toolConfig": {...},         # tool configuration
+  "guardrailConfig": {...},    # guardrail configuration
+}
+```
+
+This path supports the richest feature set: tool calls, thinking mode, structured output, guardrails, web search, Computer Use, and more.
+
+#### Path 2: Invoke API (legacy)
+
+**Files**: `litellm/llms/bedrock/chat/invoke_transformations/`, with 15 provider-specific files
+**Base class**: `AmazonInvokeConfig(BaseConfig, BaseAWSLLM)`
+
+Uses a **Dispatcher Pattern**: `get_bedrock_chat_config()` resolves the provider from the model name and then delegates to the corresponding transformation class:
+
+```
+AmazonInvokeConfig.transform_request()
+  → get_bedrock_invoke_provider(model)  # resolve the provider from the model name
+  → delegate by provider:
+    - "anthropic" → AmazonAnthropicClaudeConfig (reuses AnthropicConfig)
+    - "cohere"    → build a Cohere-format request
+    - "meta"      → {"prompt": ...} text format
+    - "nova"      → AmazonInvokeNovaConfig
+    - "deepseek_r1", "qwen3", "qwen2", "moonshot", "openai", etc. → their own transformers
+```
+
+**Special design for Anthropic Claude on Invoke**: `AmazonAnthropicClaudeConfig` uses multiple inheritance (`AmazonInvokeConfig` + `AnthropicConfig`) to reuse the native Anthropic transformation logic, while removing fields that Bedrock does not accept and setting `anthropic_version = "bedrock-2023-05-31"`.
+
+#### Path 3: Converse-like, a lightweight wrapper around Converse
+
+#### Path 4: Invoke Agent, a dedicated path for Bedrock Agent orchestration calls
+
+**Routing decision** (`get_bedrock_chat_config()`): the path is chosen first by `bedrock_route` (converse/openai/agent/agentcore), and the sub-provider is then chosen by `bedrock_invoke_provider`.
+
+---
+
+### 4.6 Vertex AI Translation Layer
+
+**Files**: `litellm/llms/vertex_ai/`, a multi-provider facade
+
+Vertex AI hosts a range of third-party models on Google Cloud:
+
+| Model type | Config class / handler | Transformation approach |
+|----------|----------|----------|
+| Gemini | `VertexGeminiConfig` | Deep transformation (same as the Gemini format) |
+| Anthropic Claude | `VertexAIAnthropicConfig` | Reuses AnthropicConfig + Vertex authentication + `anthropic_version: "vertex-2023-10-16"` |
+| Meta Llama | `OpenAILikeChatHandler` / `base_llm_http_handler` | OpenAI-compatible endpoint |
+| AI21 Jamba | `VertexAIAi21Config` | Inherits from OpenAIGPTConfig |
+| DeepSeek / Qwen / Minimax / Moonshot | `base_llm_http_handler` | OpenAI-compatible endpoint |
+| Mistral/Codestral | `MistralConfig` / `CodestralTextCompletionConfig` | OpenAI-compatible |
+
+**Dispatcher**: the `VertexAIPartnerModels` class routes to the correct handler through the `PartnerModelPrefixes` enum (META, DEEPSEEK, MISTRAL, CODERESTAL, JAMBA, CLAUDE, QWEN, GPT_OSS, etc.).
+
+---
+
+### 4.7 Other Notable Provider Transformations
+
+#### Ollama (Local Models)
+
+**File**: `litellm/llms/ollama/chat/transformation.py` (580 lines)
+
+| OpenAI parameter | Ollama parameter | Notes |
+|-------------|-----------|------|
+| `max_tokens` | `num_predict` | Renamed |
+| `frequency_penalty` | `repeat_penalty` | Semantics differ |
+| `response_format(json_object)` | `format: "json"` | Simplified mapping |
+| `response_format(json_schema)` | `format: <schema>` | Schema passed through directly |
+| `reasoning_effort` | `think` | Boolean/string |
+
+Ollama-specific parameters are supported (Mirostat sampling, num_ctx, num_gpu, num_thread, etc.), and reasoning content can be extracted from `<think>...</think>` tags.
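
The parameter table and the reasoning-tag extraction can be sketched as two small helpers. Both `map_ollama_params` and `split_reasoning` are hypothetical and cover only the rows shown above; the sketch assumes the schema lives under `response_format["json_schema"]["schema"]`, skips `reasoning_effort`, silently drops unmapped parameters, and only handles a single, complete `<think>` block.

```python
import re

PARAM_RENAMES = {"max_tokens": "num_predict", "frequency_penalty": "repeat_penalty"}


def map_ollama_params(openai_params: dict) -> dict:
    """Rename or reshape the handful of parameters listed in the table."""
    out = {}
    for key, value in openai_params.items():
        if key == "response_format":
            if value.get("type") == "json_object":
                out["format"] = "json"
            elif value.get("type") == "json_schema":
                out["format"] = value["json_schema"]["schema"]
        elif key in PARAM_RENAMES:
            out[PARAM_RENAMES[key]] = value
        # Anything else is dropped in this sketch.
    return out


THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str):
    """Return (reasoning, answer); reasoning is None when no tag is present."""
    match = THINK_BLOCK.search(text)
    if match is None:
        return None, text
    answer = (text[:match.start()] + text[match.end():]).strip()
    return match.group(1).strip(), answer


print(map_ollama_params({"max_tokens": 64, "response_format": {"type": "json_object"}}))
print(split_reasoning("<think>2 + 2 is 4</think>The answer is 4."))
```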
+
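+#### Cohere
+
+- **v1 API**: `CohereChatConfig(BaseConfig)`, a fully independent transformation (`chat_history` is separated from the latest message)
+- **v2 API**: `CohereV2ChatConfig`; v2 is OpenAI-compatible and reuses the OpenAI transformation
+- `_get_cohere_config(model)` selects the version based on the model
+
+#### Mistral
+
+**Class**: `MistralConfig(OpenAIGPTConfig)`, which **inherits from the OpenAI Config** rather than directly from BaseConfig
+
+Special handling: tool schema cleanup (removing fields Mistral does not support, such as `$id`, `$schema`, and `additionalProperties`), and reasoning for Magistral models, which is enabled by injecting a dedicated system prompt.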
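
The schema cleanup mentioned above amounts to a recursive filter over the JSON Schema. The sketch below is an assumption about how such a cleanup could look, not the `MistralConfig` code: `clean_schema` is a hypothetical helper and the key set contains only the three fields named in the text.

```python
UNSUPPORTED_KEYS = {"$id", "$schema", "additionalProperties"}


def clean_schema(node):
    """Return a copy of a JSON Schema with the unsupported keys removed at every level."""
    if isinstance(node, dict):
        return {k: clean_schema(v) for k, v in node.items() if k not in UNSUPPORTED_KEYS}
    if isinstance(node, list):
        return [clean_schema(item) for item in node]
    return node  # strings, numbers, booleans, None


tool_parameters = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "city": {"$id": "#city", "type": "string"},
        "tags": {"type": "array", "items": {"type": "string", "additionalProperties": False}},
    },
    "required": ["city"],
}
cleaned = clean_schema(tool_parameters)
print(cleaned)
```

One known limitation of this simple version: a user-defined property that happens to be named like one of the filtered keys would also be removed, so a production implementation would need to distinguish schema keywords from property names.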
+
+---
+
+### 4.8 Anthropic Messages API Pass-Through Layer
+
+**File**: `litellm/llms/base_llm/anthropic_messages/transformation.py` (165 lines)
+**Base class**: `BaseAnthropicMessagesConfig(ABC)`, a **separate** abstract base class that sits alongside `BaseConfig`
+
+This transformation path runs in parallel with the standard OpenAI-format `BaseConfig` path. It accepts requests in the Anthropic Messages API format (`/v1/messages`) directly and forwards them to several cloud providers:
+
+| Provider | Implementation class | Authentication |
+|--------|--------|----------|
+| Anthropic native | `AnthropicMessagesConfig` | API Key |
+| AWS Bedrock | `BedrockModelInfo.get_bedrock_provider_config_for_messages_api()` | AWS SigV4 |
+| Vertex AI Claude | `VertexAIPartnerModelsAnthropicMessagesConfig` | Google OAuth |
+| Azure AI Claude | `AzureAnthropicMessagesConfig` | Azure Token |
+
+**Key abstract methods** (5):
+1. `validate_anthropic_messages_environment()`: returns (headers, api_base)
+2. `get_complete_url()`: URL construction
+3. `get_supported_anthropic_messages_params()`: list of supported parameters
+4. `transform_anthropic_messages_request()`: request transformation
+5. `transform_anthropic_messages_response()`: response transformation
+
+**Self-repair**: when a thinking signature is invalid, the request is retried automatically (`should_retry_anthropic_messages_on_http_error()`, at most 2 times).
+
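+### 4.9 Proxy Pass-Through Mode
+
+**Directory**: `litellm/proxy/pass_through_endpoints/`
+
+Pass-through mode lets users call a provider's native API directly. **The request body is not transformed at all**; only the following are rewritten:
+- **URL**: the proxy route is mapped to the provider's actual API endpoint
+- **Authentication**: the LiteLLM virtual key is replaced with the provider's API key (credentials are managed by `PassthroughEndpointRouter`)
+- **Metadata**: logging and billing information is injected
+
+Main files: `llm_passthrough_endpoints.py` (2,371 lines), `pass_through_endpoints.py`, `passthrough_guardrails.py`, and others.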
+
+---
+
+## 5. Streaming
+
+### 5.1 Unified Streaming Wrapper
+
+**File**: `litellm/litellm_core_utils/streaming_handler.py` (2,418 lines)
+**Class**: `CustomStreamWrapper`
+
+This is the normalization layer for streaming responses shared by all providers. It implements `__iter__`/`__next__` (sync) and `__aiter__`/`__anext__` (async).
+
+**Core design**: a common `GenericStreamingChunk` interface:
+```python
+GenericStreamingChunk = {
+    "text": str,
+    "is_finished": bool,
+    "finish_reason": str,
+    "usage": Usage,
+    "tool_use": [...],
+}
+```
+
+Every provider that implements `BaseConfig.get_model_response_iterator()` returns chunks in this shape, so `CustomStreamWrapper` can process them uniformly.
+
+**Key features**:
+- Special token handling (`<|im_start|>`, `<|im_end|>`, etc.)
+- Stream options support (`include_usage`)
+- Enforcement of a maximum stream duration
+- `merge_reasoning_content_in_choices` support
+- `aclose()` cleans up correctly via `anyio.CancelScope`
+
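+### 5.2 Provider-Specific Stream Parsers
+
+Each provider has its own `ModelResponseIterator` implementation:
+
+| Provider | Parser location | Special handling |
+|--------|-----------|---------|
+| Anthropic | `llms/anthropic/chat/handler.py` | Dispatch on SSE event type (message_start/content_block_delta/message_delta) |
+| OpenAI | `llms/openai/chat/gpt_transformation.py` | Near pass-through |
+| Gemini | Vertex AI stream handling | `contents[0].parts[0].text` format |
+| Bedrock Converse | `llms/bedrock/chat/converse_handler.py` | AWS event stream format |
+| Ollama | Built into transformation.py | JSON lines format, `<think>...</think>` tag parsing |
+
+### 5.3 Fake Stream Mechanism
+
+Some providers (for example, certain Bedrock Invoke models) do not support true streaming. LiteLLM detects this through `should_fake_stream()`, sends a non-streaming request to obtain the complete response, and then uses `MockResponseIterator` to emit it chunk by chunk as a simulated stream.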
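
The idea behind the fake stream can be shown with a generator that slices an already complete response into chunks. This is a sketch of the technique only: `fake_stream` and the reduced `Chunk` type are hypothetical, the fixed character-based chunk size is an arbitrary choice, and none of this mirrors the internals of `MockResponseIterator`.

```python
from typing import Iterator, TypedDict


class Chunk(TypedDict):
    """Reduced chunk shape with three of the GenericStreamingChunk fields."""
    text: str
    is_finished: bool
    finish_reason: str


def fake_stream(full_text: str, chunk_size: int = 8) -> Iterator[Chunk]:
    """Yield a complete response piece by piece, as if it had been streamed."""
    pieces = [full_text[i:i + chunk_size] for i in range(0, len(full_text), chunk_size)]
    if not pieces:
        pieces = [""]  # still emit one final chunk for an empty response
    for index, piece in enumerate(pieces):
        last = index == len(pieces) - 1
        yield {
            "text": piece,
            "is_finished": last,
            "finish_reason": "stop" if last else "",
        }


chunks = list(fake_stream("hello world", chunk_size=4))
for chunk in chunks:
    print(chunk)
```

Because the consumer sees the same chunk shape either way, code written against a real stream keeps working; the trade-off is that the first chunk only arrives after the whole response has been generated.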
+
+---
+
+## 6. Summary of Translation Layer Design Patterns
+
+### 6.1 Three-Layer Architecture
+
+```
+┌─────────────────────────────────────────┐
+│          Proxy Layer (proxy/)           │  authentication, rate limiting, routing, billing
+├─────────────────────────────────────────┤
+│          SDK Layer (litellm/)           │  unified entry point, parameter mapping
+│  ┌─────────────────────────────────┐    │
+│  │    BaseLLMHTTPHandler           │    │  HTTP orchestrator (fixed)
+│  │  ┌───────────────────────────┐  │    │
+│  │  │ProviderConfig (BaseConfig)│  │    │  transformation strategy (varies)
+│  │  │  • transform_request()    │  │    │
+│  │  │  • transform_response()   │  │    │
+│  │  │  • map_openai_params()    │  │    │
+│  │  └───────────────────────────┘  │    │
+│  └─────────────────────────────────┘    │
+├─────────────────────────────────────────┤
+│         HTTP Client (httpx)             │  low-level HTTP communication
+└─────────────────────────────────────────┘
+```
+
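+### 6.2 Key Design Patterns
+
+| Pattern | Where it is applied | Notes |
+|------|---------|------|
+| **Strategy** | `BaseConfig` + per-provider Configs | Each provider's transformation logic is encapsulated independently |
+| **Factory** | `ProviderConfigManager` | Creates Config instances dynamically from the model/provider (includes sub-factories) |
+| **Dispatcher** | `AmazonInvokeConfig`, `VertexAIPartnerModels` | Detects the sub-provider from the model name and delegates the transformation |
+| **Template Method** | `BaseConfig` abstract methods | Defines the uniform transformation interface; subclasses implement the details |
+| **Facade** | `VertexAIPartnerModels` | Single entry point for the various Vertex AI partner models |
+| **Adapter** | `AmazonAnthropicClaudeConfig` | Adapts between APIs (Bedrock + Anthropic) through multiple inheritance |
+| **Decorator** | `CustomStreamWrapper` | Wraps each provider's streaming response uniformly |
+
+### 6.3 Transformation Type Classification
+
+| Transformation type | Complexity | Typical providers | Notes |
+|----------|--------|-----------|------|
+| **Near pass-through** | Low | OpenAI, Azure, DeepSeek, Groq | Formats are largely compatible; only a few parameters are filtered or renamed |
+| **Parameter remapping** | Medium | Ollama, Cohere v1 | Parameter names change; message format is largely compatible |
+| **Deep transformation** | High | Anthropic, Gemini, Bedrock Converse | Message structure, role system, and tool call format all differ |
+| **Dispatch and delegate** | High | Bedrock Invoke, Vertex AI | Delegates to different transformers depending on the sub-provider |
+| **Pass-through proxy** | Very low | Proxy pass-through mode | Only the URL and authentication are rewritten; the request body is unchanged |
+
+### 6.4 Standard Process for Adding a New Provider
+
+1. Create the Config class in `litellm/llms/{provider}/chat/transformation.py`
+2. Implement the 6 abstract methods of `BaseConfig`
+3. Register it in `ProviderConfigManager._build_provider_config_map()`
+4. Add a provider resolution rule in `get_llm_provider()`
+5. Add unit tests in `tests/llm_translation/test_{provider}.py`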
+
+---
+
+## 7. Key Directory Structure
+
+```
+litellm/
+├── main.py                          # unified entry point: completion(), acompletion() (7,807 lines)
+├── utils.py                         # ProviderConfigManager, factory methods (9,568 lines)
+├── types/
+│   ├── llms/openai.py               # OpenAI format type definitions
+│   └── utils.py                     # LlmProviders enum (133 values), SearchProviders enum (12 values)
+├── llms/
+│   ├── base_llm/                    # abstract base classes (29 subdirectories covering multiple modalities)
+│   │   ├── chat/transformation.py   # ★ BaseConfig abstract base class (466 lines)
+│   │   ├── anthropic_messages/      # Anthropic Messages pass-through base class
+│   │   ├── embedding/               # Embedding base class
+│   │   ├── rerank/                  # Rerank base class
+│   │   └── ... (other modalities)
+│   ├── custom_httpx/
+│   │   ├── llm_http_handler.py      # ★ BaseLLMHTTPHandler orchestrator (12,161 lines)
+│   │   └── http_handler.py          # HTTP client wrapper
+│   ├── anthropic/chat/transformation.py      # Anthropic transformation (2,096 lines)
+│   ├── openai/chat/gpt_transformation.py     # OpenAI transformation (820 lines)
+│   ├── gemini/chat/transformation.py         # Gemini transformation (thin layer, 154 lines)
+│   ├── vertex_ai/gemini/transformation.py    # Gemini core transformation (941 lines)
+│   ├── vertex_ai/vertex_ai_partner_models/   # Vertex AI partner model facade
+│   ├── bedrock/chat/
+│   │   ├── converse_transformation.py        # Bedrock Converse transformation (2,129 lines)
+│   │   └── invoke_transformations/           # Bedrock Invoke (15 sub-providers)
+│   ├── ollama/chat/transformation.py         # Ollama transformation (580 lines)
+│   ├── cohere/chat/transformation.py         # Cohere transformation (373 lines)
+│   ├── mistral/chat/transformation.py        # Mistral transformation (686 lines, inherits from OpenAI)
+│   └── ... (~118 provider directories)
+├── litellm_core_utils/
+│   ├── streaming_handler.py          # CustomStreamWrapper unified stream handling (2,418 lines)
+│   └── get_llm_provider_logic.py     # provider resolution logic (958 lines)
+├── proxy/
+│   └── pass_through_endpoints/       # pass-through proxy mode (includes credential management)
+└── router.py                         # load balancing, failover
+```
+```
+
+---
+
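+## 8. Conclusion
+
+LiteLLM's translation layer is a well-designed **unified API gateway translation engine**. Its main strengths are:
+
+1. **Highly modular**: each provider's transformation logic is fully independent, so changing one provider does not affect the others
+2. **Unified abstraction**: all providers are managed through the `BaseConfig` interface, and the orchestrator does not need to know about provider differences
+3. **Broad coverage**: supports 133 LLM providers and 12 search providers, spanning the major cloud vendors, open-source models, and local deployments
+4. **Multi-protocol support**: supports OpenAI Chat Completions, Anthropic Messages (pass-through), Gemini GenerateContent, and other protocols side by side
+5. **Multi-modality support**: 29 `base_llm` subdirectories cover Chat, Embedding, Rerank, Audio, OCR, Image, Video, the Responses API, and other modalities
+6. **Extensibility**: adding a new provider only requires implementing one Config class and registering it
+7. **Streaming compatibility**: `GenericStreamingChunk` + `CustomStreamWrapper` (2,418 lines) handle streaming responses from all providers uniformly
+8. **Self-repair**: supports automatic retry and repair on HTTP errors (thinking signature, request body reconstruction, etc.)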