refactor(scripts): split scripts into init/ and detect/ subdirectories, optimize init-llm.sh
350
scripts/detect/README.md
Normal file
@@ -0,0 +1,350 @@
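# Compatibility Detection Scripts

## Overview

This directory contains a set of test scripts that check how well an LLM API gateway conforms to the **OpenAI** and **Anthropic** protocols. The scripts send a series of structured requests to the target service and verify that response formats, field types, and error handling match the protocol specifications.
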
## Script Structure

```
scripts/detect/
├── core.py               # shared infrastructure
├── detect_openai.py      # OpenAI-compatible protocol tests
└── detect_anthropic.py   # Anthropic-compatible protocol tests
```

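### core.py: shared module

Provides the basic functionality shared by all detection scripts:

| Function / class | Description |
|---------|------|
| `TestCase` | Test case dataclass (URL, method, request headers, request body, validator) |
| `TestResult` | Test result dataclass (status code, elapsed time, error type, response body) |
| `http_request()` | Plain HTTP request (supports retries and automatic JSON serialization) |
| `http_stream_request()` | Streaming HTTP request (SSE, supports retries) |
| `parse_sse_events()` | Extracts the list of `data:` events from SSE response text |
| `create_ssl_context()` | Creates an SSL context that does not verify certificates (for test environments) |
| `run_test()` | Runs a single case and prints structured output |
| `run_test_suite()` | Runs a complete test suite and prints a summary |
| `check_required_fields()` | Checks required fields (generic validation helper) |
| `check_field_type()` | Checks a field's type (generic validation helper) |
| `check_enum_value()` | Checks an enum value (generic validation helper) |
| `check_array_items_type()` | Checks the type of array elements (generic validation helper) |
| `validate_response_structure()` | Generic validator that combines the helpers above |

**Note**: `core.py` contains only protocol-agnostic functionality. Response validators that are specific to one protocol belong in that protocol's detection script (for example, `validate_openai_chat_completion_response` lives in `detect_openai.py`).
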
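As a rough illustration of what the SSE helper does, the sketch below mirrors the logic of `parse_sse_events()` from `core.py` in a standalone function. The name `extract_sse_data` and the sample payload are illustrative only and are not part of the scripts.

```python
from typing import List


def extract_sse_data(response_text: str) -> List[str]:
    """Collect the payload of every `data:` line, skipping the [DONE] sentinel."""
    events = []
    for line in response_text.split("\n"):
        line = line.strip()
        if line.startswith("data:"):
            data = line[len("data:"):].strip()
            # Empty payloads and the terminating [DONE] marker are dropped
            if data and data != "[DONE]":
                events.append(data)
    return events


# Hypothetical SSE text: `event:` lines and blank lines are ignored
sample = 'event: message\ndata: {"delta": "Hel"}\n\ndata: {"delta": "lo"}\ndata: [DONE]\n'
print(extract_sse_data(sample))
```

Each returned item is still a raw string; a protocol-specific validator would typically pass it through `json.loads` before inspecting fields.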
### detect_openai.py: OpenAI compatibility tests

Checks how well the target service conforms to the OpenAI Chat Completions API.

**API endpoints covered:**
- `GET /models`: model list
- `GET /models/{model}`: model details
- `POST /chat/completions`: chat completion

**Test categories:**
- **Positive cases**: basic conversation, system/developer roles, multi-turn conversation, parameter combinations (temperature, top_p, seed, penalty, stop, n, max_tokens, max_completion_tokens, logit_bias, reasoning_effort, service_tier, verbosity, response_format)
- **Extended features**: `--vision` (image input), `--stream` (streaming responses), `--tools` (tool calls), `--logprobs` (log probabilities), `--json_schema` (structured output)
- **Negative cases**: missing parameters, empty messages, invalid authentication, nonexistent model, malformed JSON, negative or zero max_tokens, out-of-range temperature

**Response validation:**
- Models List: checks `object: "list"` and, for every model in the `data` array, `id`, `object`, `created`, `owned_by`
- Model Retrieve: checks `id`, `object: "model"`, `created`, `owned_by`
- Chat Completion: checks `id`, `object: "chat.completion"`, `created`, `model`, the structure of the `choices` array, and the `usage` object

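The Chat Completion checks listed above might look roughly like the following. This is a minimal sketch, not the validator shipped in `detect_openai.py` (its diff is not shown here); the function name `sketch_validate_chat_completion` is hypothetical, and treating "choices array structure" as "each choice is an object with a `message`" is an assumption.

```python
import json
from typing import List, Tuple


def sketch_validate_chat_completion(response_text: str) -> Tuple[bool, List[str]]:
    """Check the top-level shape of a non-streaming chat completion (sketch)."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError as e:
        return False, [f"response is not valid JSON: {e}"]
    if not isinstance(data, dict):
        return False, ["top-level JSON value is not an object"]

    errors: List[str] = []
    for field in ("id", "object", "created", "model", "choices", "usage"):
        if field not in data:
            errors.append(f"missing required field: {field}")
    if "object" in data and data["object"] != "chat.completion":
        errors.append(f"unexpected object value: {data['object']!r}")
    if "choices" in data:
        if not isinstance(data["choices"], list):
            errors.append("choices is not an array")
        else:
            for i, choice in enumerate(data["choices"]):
                # Assumption: a well-formed choice is an object carrying `message`
                if not isinstance(choice, dict) or "message" not in choice:
                    errors.append(f"choices[{i}] has no message")
    if "usage" in data and not isinstance(data["usage"], dict):
        errors.append("usage is not an object")
    return not errors, errors


good = json.dumps({
    "id": "chatcmpl-xxx", "object": "chat.completion", "created": 1700000000,
    "model": "gpt-4o",
    "choices": [{"index": 0, "message": {"role": "assistant", "content": "Hi"},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 1, "completion_tokens": 1, "total_tokens": 2},
})
print(sketch_validate_chat_completion(good))
```

The return shape `(is_valid, errors)` matches what `run_test()` in `core.py` expects from a `validator`.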
### detect_anthropic.py: Anthropic compatibility tests

Checks how well the target service conforms to the Anthropic Messages API.

**API endpoints covered:**
- `GET /v1/models`: model list
- `GET /v1/models/{model}`: model details
- `POST /v1/messages`: messages
- `POST /v1/messages/count_tokens`: token counting

**Test categories:**
- **Positive cases**: basic conversation, system prompt (string and array formats), multi-turn conversation, assistant prefill, content in array format, parameter combinations (temperature, top_p, top_k, max_tokens, stop_sequences, metadata)
- **Extended features**: `--vision` (image input), `--stream` (streaming responses), `--tools` (tool calls), `--thinking` (extended thinking)
- **Negative cases**: missing header, invalid authentication, missing parameters, empty messages, malformed JSON, illegal role, negative or zero max_tokens, out-of-range temperature

**Response validation:**
- Models List: checks `data`, `has_more`, and each model's `id`, `type: "model"`, `display_name`, `created_at`
- Model Retrieve: checks `id`, `type: "model"`, `display_name`, `created_at`
- Messages: checks `id`, `type: "message"`, `role: "assistant"`, the `content` array, `model`, `usage`
- Count Tokens: checks that `input_tokens` is a number

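The Messages and Count Tokens checks above could be sketched as follows. The function names are hypothetical and the real validators in `detect_anthropic.py` may differ. One detail worth showing: in Python `bool` is a subclass of `int`, so a naive `isinstance(value, int)` would accept `true` as a token count; the sketch excludes it explicitly.

```python
import json
from typing import Any, List, Tuple


def is_number(value: Any) -> bool:
    """True for int/float but not bool (bool is a subclass of int in Python)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def sketch_validate_message(response_text: str) -> Tuple[bool, List[str]]:
    """Check the top-level shape of a Messages response (sketch)."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError as e:
        return False, [f"response is not valid JSON: {e}"]
    if not isinstance(data, dict):
        return False, ["top-level JSON value is not an object"]

    fields = ("id", "type", "role", "content", "model", "usage")
    errors = [f"missing required field: {f}" for f in fields if f not in data]
    if "type" in data and data["type"] != "message":
        errors.append(f"unexpected type: {data['type']!r}")
    if "role" in data and data["role"] != "assistant":
        errors.append(f"unexpected role: {data['role']!r}")
    if "content" in data and not isinstance(data["content"], list):
        errors.append("content is not an array")
    return not errors, errors


def sketch_validate_count_tokens(response_text: str) -> Tuple[bool, List[str]]:
    """Check that `input_tokens` is present and numeric (sketch)."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError as e:
        return False, [f"response is not valid JSON: {e}"]
    if not isinstance(data, dict) or not is_number(data.get("input_tokens")):
        return False, ["input_tokens must be a number"]
    return True, []


print(sketch_validate_count_tokens('{"input_tokens": 12}'))
print(sketch_validate_count_tokens('{"input_tokens": true}'))
```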
## Usage

### Basic usage

```bash
# OpenAI compatibility tests
python3 scripts/detect/detect_openai.py --base_url http://localhost:9826/v1

# Anthropic compatibility tests
python3 scripts/detect/detect_anthropic.py --base_url http://localhost:9826
```

### With authentication

```bash
python3 scripts/detect/detect_openai.py --base_url http://localhost:9826/v1 --api_key sk-xxx --model gpt-4o

python3 scripts/detect/detect_anthropic.py --base_url http://localhost:9826 --api_key sk-xxx --model claude-sonnet-4-5
```

### Extended tests

```bash
# Enable all extended tests
python3 scripts/detect/detect_openai.py --base_url http://localhost:9826/v1 --all

python3 scripts/detect/detect_anthropic.py --base_url http://localhost:9826 --all

# Enable individual extensions
python3 scripts/detect/detect_openai.py --base_url http://localhost:9826/v1 --stream --tools

python3 scripts/detect/detect_anthropic.py --base_url http://localhost:9826 --stream --tools --thinking
```

### Command-line arguments

| Argument | Description | Default |
|------|------|--------|
| `--base_url` | API base URL (required) | none |
| `--api_key` | API key | empty |
| `--model` | Model name used for the tests | `gpt-4o` / `claude-sonnet-4-5` |
| `--vision` | Run the vision tests | off |
| `--stream` | Run the streaming response tests | off |
| `--tools` | Run the tool call tests | off |
| `--logprobs` | Run the logprobs tests (OpenAI only) | off |
| `--json_schema` | Run the Structured Output tests (OpenAI only) | off |
| `--thinking` | Run the extended thinking tests (Anthropic only) | off |
| `--all` | Enable all extended tests | off |

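One way the flags in the table could be wired up with `argparse` is sketched below for the OpenAI script. How the real scripts expand `--all` is not visible in this diff, so `EXTENSION_FLAGS`, `build_parser`, and `enabled_extensions` are hypothetical names, and the "`--all` switches on every extension" rule is an assumption taken from the table's description.

```python
import argparse
from typing import List, Optional

# Extension flags from the table that apply to the OpenAI script
EXTENSION_FLAGS = ["vision", "stream", "tools", "logprobs", "json_schema"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="OpenAI compatibility checks (sketch)")
    parser.add_argument("--base_url", required=True, help="API base URL")
    parser.add_argument("--api_key", default="", help="API key")
    parser.add_argument("--model", default="gpt-4o", help="model name")
    for flag in EXTENSION_FLAGS:
        parser.add_argument(f"--{flag}", action="store_true")
    parser.add_argument("--all", action="store_true", help="enable every extension")
    return parser


def enabled_extensions(argv: Optional[List[str]] = None) -> List[str]:
    """Return the extension names that are active for the given argv."""
    args = build_parser().parse_args(argv)
    # Assumption: --all simply turns on each individual extension flag
    return [flag for flag in EXTENSION_FLAGS if args.all or getattr(args, flag)]


print(enabled_extensions(["--base_url", "http://localhost:9826/v1", "--stream", "--tools"]))
```

The resulting list is the kind of value that could be passed as `flags` to `run_test_suite()`, which prints it in the run header.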
## Sample Output

```
Anthropic Compatibility Test
Target: http://localhost:9826
Model: claude-sonnet-4-5
Time: 2026-04-21 10:30:00
Cases: 35 | Extensions: stream, tools

[1/35] List models (GET /v1/models)

URL: GET http://localhost:9826/v1/models

Headers:
  x-api-key: sk-xxx
  anthropic-version: 2023-06-01

Response (200, 0.12s):
{
  "data": [...],
  "has_more": false
}
✓ Response validation passed

[5/35] Basic conversation (user only)

URL: POST http://localhost:9826/v1/messages

Headers:
  x-api-key: sk-xxx
  Content-Type: application/json

Request body:
{
  "model": "claude-sonnet-4-5",
  "max_tokens": 5,
  "messages": [{"role": "user", "content": "Hi"}]
}

Response (200, 0.23s):
{
  "id": "msg_xxx",
  "type": "message",
  "role": "assistant",
  "content": [...],
  "model": "claude-sonnet-4-5",
  "usage": {"input_tokens": 10, "output_tokens": 5}
}
✓ Response validation passed

Test complete | Total: 35 | Succeeded: 33 | Client errors: 2 | Server errors: 0 | Network errors: 0
```

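## Test Design Principles

1. **Every positive case has a response validator enabled**: any deviation in response structure surfaces immediately instead of being masked
2. **Negative cases cover common error scenarios**: missing parameters, wrong types, out-of-range values, authentication failures
3. **Extended features are enabled on demand through flags**: this keeps unnecessary dependencies out of the basic tests
4. **Validators are written against the protocol specifications**: they strictly check required fields, types, and enum values
5. **Streaming and non-streaming coverage are consistent**: streaming is only a different transport, so functional coverage should correspond exactly (see the streaming coverage principles below)
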
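To make the negative scenarios in principle 2 concrete, the sketch below builds a handful of invalid request bodies from one valid baseline. The helper name `build_negative_bodies` and the specific values are illustrative assumptions; the real scripts define their own cases. The malformed-JSON entry is a `str` because `http_request()` in `core.py` sends string bodies verbatim rather than serializing them.

```python
import json
from typing import Any, List, Tuple


def build_negative_bodies(model: str) -> List[Tuple[str, Any]]:
    """Return (description, body) pairs that a conforming server should reject."""
    valid = {
        "model": model,
        "max_tokens": 5,
        "messages": [{"role": "user", "content": "Hi"}],
    }
    return [
        ("missing messages", {k: v for k, v in valid.items() if k != "messages"}),
        ("empty messages", {**valid, "messages": []}),
        ("negative max_tokens", {**valid, "max_tokens": -1}),
        ("zero max_tokens", {**valid, "max_tokens": 0}),
        # 3.5 is an arbitrary value chosen to be clearly out of range
        ("temperature out of range", {**valid, "temperature": 3.5}),
        # A str body is deliberately truncated so it cannot be parsed as JSON
        ("malformed JSON", '{"model": "' + model + '", "messages": ['),
    ]


for desc, body in build_negative_bodies("claude-sonnet-4-5"):
    print(desc, "->", body if isinstance(body, str) else json.dumps(body))
```

Each pair would become a `TestCase` without a `validator`, since these requests are expected to fail.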
## Development Workflow for New Detection Scripts

To develop a detection script for a new protocol (such as Google Gemini or Cohere), follow the steps below.

### 1. Define protocol-specific validators in the new script

Each protocol has its own response structure, so validators belong in the protocol's own script and not in `core.py`. For example:

```python
# In detect_gemini.py
import json
from typing import List, Tuple


def validate_gemini_generate_content_response(response_text: str) -> Tuple[bool, List[str]]:
    """Validate a Gemini GenerateContent response"""
    errors = []
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError as e:
        return False, [f"Response is not valid JSON: {e}"]

    # Check Gemini-specific fields
    required_fields = ["candidates", "usageMetadata"]
    for field in required_fields:
        if field not in data:
            errors.append(f"Missing required field: {field}")
    ...
    return len(errors) == 0, errors
```

### 2. Add only generic validation helpers to `core.py`

Move a function into `core.py` only when several protocols need the same validation logic. The generic functions available today:

| Function | Description |
|------|------|
| `check_required_fields()` | Checks that required fields are present |
| `check_field_type()` | Checks a field's type |
| `check_enum_value()` | Checks an enum value |
| `check_array_items_type()` | Checks the type of array elements |
| `validate_response_structure()` | Generic validator that combines the helpers above |
| `parse_sse_events()` | Extracts `data:` events from SSE response text |

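One behavior of these helpers is easy to miss: `check_field_type()` and `check_enum_value()` treat `None` as valid, on the reasoning that a null usually marks an optional field. The standalone sketch below copies the logic of two helpers from `core.py` to show the consequence; the sample values are illustrative.

```python
from typing import Any, List, Type, Union


def check_field_type(value: Any, expected_type: Union[Type, tuple]) -> bool:
    """Same logic as core.py: None is accepted, anything else must match."""
    if value is None:
        return True
    return isinstance(value, expected_type)


def check_array_items_type(arr: List[Any], expected_item_type: Union[Type, tuple]) -> bool:
    """Same logic as core.py: the value must be a list and every item must pass."""
    if not isinstance(arr, list):
        return False
    return all(check_field_type(item, expected_item_type) for item in arr)


print(check_field_type(None, str))               # null passes the type check
print(check_field_type("5", (int, float)))       # a numeric string does not
print(check_array_items_type([1, None, 2], int)) # null items pass as well
print(check_array_items_type("abc", str))        # a str is not a list
```

A protocol-specific validator that must reject `null` for a particular field therefore needs its own explicit `is None` check in addition to these helpers.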
### 3. Create the detection script from the template

```python
#!/usr/bin/env python3
"""Compatibility test script for a new protocol"""

import json
import argparse
from typing import Dict, List, Tuple, Any
from core import (
    create_ssl_context,
    TestCase,
    run_test_suite,
    validate_response_structure,
    parse_sse_events,
)


def build_headers(api_key: str) -> Dict[str, str]:
    """Build the request headers"""
    ...


def validate_xxx_response(response_text: str) -> Tuple[bool, List[str]]:
    """Validate the response structure (protocol-specific)"""
    ...


def validate_xxx_streaming_response(response_text: str) -> Tuple[bool, List[str]]:
    """Validate the streaming response structure (protocol-specific)"""
    # Split the SSE text with parse_sse_events(), then check each event
    ...


def main():
    parser = argparse.ArgumentParser(description="...")
    parser.add_argument("--base_url", required=True, help="...")
    parser.add_argument("--api_key", default="", help="...")
    parser.add_argument("--model", default="...", help="...")
    parser.add_argument("--vision", action="store_true", help="...")
    parser.add_argument("--stream", action="store_true", help="...")
    parser.add_argument("--tools", action="store_true", help="...")
    parser.add_argument("--all", action="store_true", help="...")
    args = parser.parse_args()

    ssl_ctx = create_ssl_context()
    headers = build_headers(args.api_key)
    cases: List[TestCase] = []

    # ---- Shared definitions (used by both streaming and non-streaming cases) ----
    # Put tool, image_url, and similar definitions before all feature blocks
    # so the streaming and non-streaming blocks do not define them twice
    tool_xxx = {...}
    image_url = "..."

    # ==== Non-streaming positive cases (all with a validator) ====
    cases.append(TestCase(
        desc="...", method="...", url=..., headers=..., body=...,
        validator=validate_xxx_response
    ))

    # ==== Non-streaming negative cases (no validator) ====
    cases.append(TestCase(desc="...", method="...", url=..., headers=..., body=...))

    # ==== --stream ====
    if args.stream:
        # Core streaming cases: every non-streaming positive case should have a streaming version.
        # Only the transport differs ("stream": True in the body, stream=True on the TestCase);
        # functional coverage (parameters, roles, multi-turn, ...) must match the non-streaming cases
        cases.append(TestCase(
            desc="Streaming ...", method="POST", url=..., headers=headers,
            body={"stream": True},  # plus the same fields as the non-streaming case
            stream=True,
            validator=validate_xxx_streaming_response
        ))

        # Streaming combined with other flags (kept inside the --stream block)
        if args.vision:
            cases.append(TestCase(
                desc="Streaming image input (--stream + --vision)",
                method="POST", url=..., headers=headers, body=...,
                stream=True,
                validator=validate_xxx_streaming_response
            ))
        if args.tools:
            cases.append(TestCase(
                desc="Streaming tool call (--stream + --tools)",
                method="POST", url=..., headers=headers, body=...,
                stream=True,
                validator=validate_xxx_streaming_response
            ))

    run_test_suite(cases=cases, ssl_ctx=ssl_ctx, title="...", base_url=..., model=..., flags=...)


if __name__ == "__main__":
    main()
```

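The template leaves the streaming validator empty. A minimal sketch of what one could look like for OpenAI-style streams is given below; it is not the implementation in `detect_openai.py`. The name `sketch_validate_openai_stream` is hypothetical, the SSE splitting is inlined so the block runs on its own (a real script would call `parse_sse_events()`), and treating a missing `[DONE]` sentinel as an error is an assumption of this sketch.

```python
import json
from typing import List, Tuple


def sketch_validate_openai_stream(response_text: str) -> Tuple[bool, List[str]]:
    """Check every SSE data payload of a Chat Completions stream (sketch)."""
    errors: List[str] = []
    payloads = []
    for line in response_text.split("\n"):
        line = line.strip()
        if line.startswith("data:"):
            payloads.append(line[len("data:"):].strip())

    chunks = [p for p in payloads if p and p != "[DONE]"]
    if not chunks:
        errors.append("stream contains no data events")
    for i, payload in enumerate(chunks):
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError as e:
            errors.append(f"event {i}: not valid JSON: {e}")
            continue
        if not isinstance(chunk, dict):
            errors.append(f"event {i}: payload is not a JSON object")
        elif chunk.get("object") != "chat.completion.chunk":
            errors.append(f"event {i}: unexpected object {chunk.get('object')!r}")
        elif not isinstance(chunk.get("choices"), list):
            errors.append(f"event {i}: choices is not an array")

    # Assumption: this sketch requires the stream to end with the [DONE] sentinel
    if not payloads or payloads[-1] != "[DONE]":
        errors.append("stream did not end with data: [DONE]")
    return not errors, errors


stream = 'data: {"object": "chat.completion.chunk", "choices": []}\n\ndata: [DONE]\n'
print(sketch_validate_openai_stream(stream))
```

Because the function takes the raw response text and returns `(is_valid, errors)`, it can be passed directly as the `validator` of a `TestCase` with `stream=True`.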
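### Key points

- **Keep protocol-specific validators in their own scripts**: do not pollute `core.py`
- **Move only multi-protocol validation logic into `core.py`**: follow DRY without over-abstracting
- **Every positive case must have a validator**: this ensures the response structure is correct
- **Negative cases have no validator**: they are expected to return an error response
- **Control extended features with flags**: this keeps the basic tests lightweight
- **Follow the existing naming and code style**: Chinese comments, type annotations, use of dataclasses
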
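### Streaming test coverage principles

Streaming (SSE) and non-streaming differ only in how data is transported; the server should process request parameters identically in both modes. Therefore:

1. **Every non-streaming positive case should have a streaming version**: this includes the different message role combinations, parameter combinations, tool calls, and so on
2. **Declare shared definitions up front**: definitions such as `tool`, `image_url`, and `json_schema` go before all feature blocks, and streaming and non-streaming cases share the same instance to avoid duplicate definitions
3. **Flag combinations live inside the `--stream` block**: combined cases such as streaming plus tools or streaming plus vision go in `if args.tools:` / `if args.vision:` sub-blocks inside `if args.stream:`, so no separate combination flag is needed
4. **Negative cases do not need streaming versions**: parameter validation happens before the request is processed, so it is independent of the transport
5. **Non-chat endpoints such as the Models API do not need streaming tests**: they do not support streaming in the first place

| Case category | Non-streaming | Streaming |
|----------|--------|------|
| Basic conversation / multi-turn conversation | ✓ | ✓ |
| Message role combinations (system, developer, etc.) | ✓ | ✓ |
| Parameter combinations (temperature, top_p, max_tokens, etc.) | ✓ | ✓ |
| Tool calls (each tool_choice mode) | ✓ | ✓ (check `--tools` inside the `--stream` block) |
| Vision (image input) | ✓ | ✓ (check `--vision` inside the `--stream` block) |
| Extended thinking / logprobs and similar features | ✓ | ✓ (check the corresponding flag inside the `--stream` block) |
| Advanced parameters (service_tier, reasoning_effort, etc.) | ✓ | ✓ |
| Negative cases (missing parameters, out-of-range values, authentication failure) | ✓ | ✗ (parameter validation is independent of the transport) |
| Models API (GET endpoints) | ✓ | ✗ (streaming not supported) |
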
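One possible way to keep the two columns of the table in sync is to derive each streaming case from its non-streaming counterpart instead of writing it by hand. The sketch below is an assumption about how that could be done, not how the scripts do it: `Case` is a local stand-in with the same fields as `TestCase` in `core.py`, and `to_stream_case` is a hypothetical helper.

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, Optional


@dataclass
class Case:
    """Local stand-in mirroring the fields of core.TestCase."""
    desc: str
    method: str
    url: str
    headers: Dict[str, str]
    body: Optional[Any] = None
    stream: bool = False
    validator: Optional[Any] = None


def to_stream_case(case: Case, stream_validator: Optional[Any]) -> Case:
    """Return a streaming copy of a positive case; the original is left untouched."""
    if not isinstance(case.body, dict):
        raise ValueError("only JSON-object bodies can be switched to streaming")
    return replace(
        case,
        desc=f"stream: {case.desc}",
        body={**case.body, "stream": True},  # new dict, so the original body is not mutated
        stream=True,
        validator=stream_validator,
    )


base = Case(desc="basic conversation", method="POST",
            url="http://localhost:9826/v1/messages", headers={},
            body={"model": "claude-sonnet-4-5", "max_tokens": 5,
                  "messages": [{"role": "user", "content": "Hi"}]})
streamed = to_stream_case(base, stream_validator=None)
print(streamed.desc, streamed.stream, streamed.body["stream"])
```

Negative cases and Models API cases would simply not be passed through the helper, matching the ✗ rows of the table.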
## License

MIT
494
scripts/detect/core.py
Normal file
@@ -0,0 +1,494 @@
#!/usr/bin/env python3
"""Core shared functions for the compatibility test scripts.

Provides HTTP requests, SSL context creation, JSON formatting, validation helpers, and other common utilities.
"""

import json
import time
import ssl
import urllib.request
import urllib.error
from dataclasses import dataclass
from typing import Optional, Dict, Any, Tuple, List, Union, Type
from enum import Enum

TIMEOUT = 30  # per-request timeout passed to urlopen (seconds)
MAX_RETRIES = 2  # maximum number of retries


class ErrorType(Enum):
    """Error type classification"""
    NETWORK = "network"  # network error
    CLIENT = "client"  # 4xx error
    SERVER = "server"  # 5xx error
    SUCCESS = "success"  # success


@dataclass
class TestCase:
    """Test case data structure"""
    desc: str  # test description
    method: str  # HTTP method
    url: str  # request URL
    headers: Dict[str, str]  # request headers
    body: Optional[Any] = None  # request body
    stream: bool = False  # whether this is a streaming request
    validator: Optional[Any] = None  # response validator (optional)


@dataclass
class TestResult:
    """Test result data structure"""
    status: Optional[int]  # HTTP status code
    elapsed: float  # elapsed time (seconds)
    error_type: ErrorType  # error type
    response: str  # response body


def create_ssl_context() -> ssl.SSLContext:
    """Create an SSL context that does not verify certificates (for test environments)"""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def classify_error(status: Optional[int]) -> ErrorType:
    """Classify the error type from the status code"""
    if status is None:
        return ErrorType.NETWORK
    if 200 <= status < 300:
        return ErrorType.SUCCESS
    if 400 <= status < 500:
        return ErrorType.CLIENT
    if status >= 500:
        return ErrorType.SERVER
    # Any other status (1xx, 3xx) falls through to NETWORK
    return ErrorType.NETWORK


def http_request(
    url: str,
    method: str = "GET",
    headers: Optional[Dict[str, str]] = None,
    body: Optional[Any] = None,
    ssl_ctx: Optional[ssl.SSLContext] = None,
    retries: int = MAX_RETRIES
) -> TestResult:
    """Perform a plain HTTP request (with retries)

    Args:
        url: request URL
        method: HTTP method (GET/POST/PUT/DELETE)
        headers: dictionary of request headers
        body: request body (dict or str)
        ssl_ctx: SSL context
        retries: number of retries

    Returns:
        A TestResult object
    """
    req = urllib.request.Request(url, method=method)
    if headers:
        for k, v in headers.items():
            req.add_header(k, v)
    if body is not None:
        if isinstance(body, str):
            # A str body is sent verbatim (used for malformed-JSON cases)
            req.data = body.encode("utf-8")
        else:
            req.data = json.dumps(body).encode("utf-8")

    # elapsed is measured from here, so it includes retries and back-off delays
    start = time.time()
    last_error = None

    for attempt in range(retries + 1):
        try:
            resp = urllib.request.urlopen(req, timeout=TIMEOUT, context=ssl_ctx)
            elapsed = time.time() - start
            status = resp.getcode()
            return TestResult(
                status=status,
                elapsed=elapsed,
                error_type=classify_error(status),
                response=resp.read().decode("utf-8")
            )
        except urllib.error.HTTPError as e:
            # HTTP error responses (4xx/5xx) are returned immediately, not retried
            elapsed = time.time() - start
            return TestResult(
                status=e.code,
                elapsed=elapsed,
                error_type=classify_error(e.code),
                response=e.read().decode("utf-8")
            )
        except Exception as e:
            last_error = str(e)
            if attempt < retries:
                time.sleep(0.5 * (attempt + 1))  # increasing delay
                continue

    elapsed = time.time() - start
    return TestResult(
        status=None,
        elapsed=elapsed,
        error_type=ErrorType.NETWORK,
        response=last_error or "Unknown error"
    )


def http_stream_request(
    url: str,
    headers: Optional[Dict[str, str]] = None,
    body: Optional[Any] = None,
    ssl_ctx: Optional[ssl.SSLContext] = None,
    retries: int = MAX_RETRIES,
    method: str = "POST"
) -> TestResult:
    """Perform a streaming HTTP request (SSE, with retries)

    Args:
        url: request URL
        headers: dictionary of request headers
        body: request body (dict)
        ssl_ctx: SSL context
        retries: number of retries
        method: HTTP method (defaults to POST)

    Returns:
        A TestResult object
    """
    req = urllib.request.Request(url, method=method)
    if headers:
        for k, v in headers.items():
            req.add_header(k, v)
    if body is not None:
        req.data = json.dumps(body).encode("utf-8")

    start = time.time()
    last_error = None

    for attempt in range(retries + 1):
        try:
            resp = urllib.request.urlopen(req, timeout=TIMEOUT, context=ssl_ctx)
            status = resp.getcode()
            # Read the stream line by line, dropping empty separator lines
            lines = []
            for raw_line in resp:
                line = raw_line.decode("utf-8").rstrip("\n\r")
                if line:
                    lines.append(line)
            elapsed = time.time() - start
            return TestResult(
                status=status,
                elapsed=elapsed,
                error_type=classify_error(status),
                response="\n".join(lines)
            )
        except urllib.error.HTTPError as e:
            elapsed = time.time() - start
            return TestResult(
                status=e.code,
                elapsed=elapsed,
                error_type=classify_error(e.code),
                response=e.read().decode("utf-8")
            )
        except Exception as e:
            last_error = str(e)
            if attempt < retries:
                time.sleep(0.5 * (attempt + 1))  # increasing delay
                continue

    elapsed = time.time() - start
    return TestResult(
        status=None,
        elapsed=elapsed,
        error_type=ErrorType.NETWORK,
        response=last_error or "Unknown error"
    )


def parse_sse_events(response_text: str) -> List[str]:
    """Parse the data of all data events from SSE response text.

    Args:
        response_text: raw text of the SSE response

    Returns:
        List of data field contents ([DONE] is skipped)
    """
    events = []
    for line in response_text.split("\n"):
        line = line.strip()
        if line.startswith("data:"):
            data = line[len("data:"):].strip()
            if data and data != "[DONE]":
                events.append(data)
    return events


def format_json(text: str) -> str:
    """Format JSON text (for pretty-printed output)

    Args:
        text: a JSON string or arbitrary text

    Returns:
        The formatted JSON string, or the original text if it is not valid JSON
    """
    try:
        parsed = json.loads(text)
        return json.dumps(parsed, ensure_ascii=False, indent=2)
    except (json.JSONDecodeError, TypeError):
        return text


def run_test(
    index: int,
    total: int,
    test_case: TestCase,
    ssl_ctx: ssl.SSLContext
) -> TestResult:
    """Run a single test case and print the result

    Args:
        index: test sequence number
        total: total number of tests
        test_case: test case object
        ssl_ctx: SSL context

    Returns:
        A TestResult object
    """
    print(f"\n[{index}/{total}] {test_case.desc}")
    print(f"\nURL: {test_case.method} {test_case.url}")

    if test_case.headers:
        print("\nHeaders:")
        for k, v in test_case.headers.items():
            print(f"  {k}: {v}")

    if test_case.body is not None:
        print("\nRequest body:")
        if isinstance(test_case.body, str):
            print(test_case.body)
        else:
            print(format_json(json.dumps(test_case.body, ensure_ascii=False)))

    if test_case.stream:
        result = http_stream_request(
            test_case.url,
            test_case.headers,
            test_case.body,
            ssl_ctx,
            method=test_case.method  # send the method that is printed above
        )
    else:
        result = http_request(
            test_case.url,
            test_case.method,
            test_case.headers,
            test_case.body,
            ssl_ctx
        )

    if result.status is not None:
        print(f"\nResponse ({result.status}, {result.elapsed:.2f}s):")
    else:
        print(f"\nRequest failed ({result.elapsed:.2f}s):")

    if test_case.stream and result.status and result.status < 300:
        # Successful streams are printed as raw SSE lines
        for line in result.response.split("\n"):
            print(line)
    else:
        print(format_json(result.response))

    # Validators only run on 2xx responses
    if test_case.validator and result.status and 200 <= result.status < 300:
        is_valid, errors = test_case.validator(result.response)
        if is_valid:
            print("✓ Response validation passed")
        else:
            print("✗ Response validation failed:")
            for error in errors:
                print(f"  - {error}")

    return result


def run_test_suite(
    cases: List[TestCase],
    ssl_ctx: ssl.SSLContext,
    title: str,
    base_url: str,
    model: str,
    flags: Optional[List[str]] = None
) -> Tuple[int, int, int, int]:
    """Run a test suite and print a summary

    Args:
        cases: list of test cases
        ssl_ctx: SSL context
        title: test title
        base_url: API base URL
        model: model name
        flags: list of enabled extension flags

    Returns:
        (total, successes, client errors, server errors)
    """
    total = len(cases)
    count_success = 0
    count_client_error = 0
    count_server_error = 0
    count_network_error = 0

    print(f"\n{title}")
    print(f"Target: {base_url}")
    print(f"Model: {model}")
    print(f"Time: {time.strftime('%Y-%m-%d %H:%M:%S')}")
    if flags:
        print(f"Cases: {total} | Extensions: {', '.join(flags)}")
    else:
        print(f"Cases: {total}")
    print()

    for i, test_case in enumerate(cases, 1):
        result = run_test(i, total, test_case, ssl_ctx)

        if result.error_type == ErrorType.SUCCESS:
            count_success += 1
        elif result.error_type == ErrorType.CLIENT:
            count_client_error += 1
        elif result.error_type == ErrorType.SERVER:
            count_server_error += 1
        else:
            count_network_error += 1

    print()
    print(f"Test complete | Total: {total} | Succeeded: {count_success} | "
          f"Client errors: {count_client_error} | Server errors: {count_server_error} | "
          f"Network errors: {count_network_error}")

    # The network error count is printed above but is not part of the return value
    return total, count_success, count_client_error, count_server_error


# ==================== Generic validation helpers ====================

def check_required_fields(data: Dict[str, Any], required_fields: List[str]) -> Tuple[bool, List[str]]:
    """Check that required fields are present

    Args:
        data: the dictionary to check
        required_fields: list of required fields

    Returns:
        (whether all are present, list of missing fields)
    """
    missing = []
    for field in required_fields:
        if field not in data:
            missing.append(field)
    return len(missing) == 0, missing


def check_field_type(value: Any, expected_type: Union[Type, tuple]) -> bool:
    """Check that a field has the expected type

    Args:
        value: the value to check
        expected_type: the expected type (may be a tuple of types)

    Returns:
        Whether the type matches
    """
    if value is None:
        return True  # None usually marks an optional field, so it is allowed
    return isinstance(value, expected_type)


def check_enum_value(value: Any, allowed_values: List[Any]) -> bool:
    """Check that a value is in the list of allowed enum values

    Args:
        value: the value to check
        allowed_values: list of allowed values

    Returns:
        Whether the value is allowed
    """
    if value is None:
        return True  # None usually marks an optional field, so it is allowed
    return value in allowed_values


def check_array_items_type(arr: List[Any], expected_item_type: Union[Type, tuple]) -> bool:
    """Check the type of every element in an array

    Args:
        arr: the array to check
        expected_item_type: the expected element type

    Returns:
        Whether every element's type matches
    """
    if not isinstance(arr, list):
        return False
    return all(check_field_type(item, expected_item_type) for item in arr)


def format_validation_errors(errors: List[str]) -> str:
    """Format validation error messages

    Args:
        errors: list of error messages

    Returns:
        The formatted error string
    """
    if not errors:
        return "Validation passed"
    return "Validation failed:\n  - " + "\n  - ".join(errors)


def validate_response_structure(
    response_text: str,
    required_fields: List[str],
    field_types: Optional[Dict[str, Union[Type, tuple]]] = None,
    enum_values: Optional[Dict[str, List[Any]]] = None
) -> Tuple[bool, List[str]]:
    """Validate the response structure (generic validator)

    Args:
        response_text: response text
        required_fields: list of required fields
        field_types: field type mapping {field name: expected type}
        enum_values: enum value mapping {field name: list of allowed values}

    Returns:
        (whether validation passed, list of error messages)
    """
    errors = []

    # Try to parse the JSON
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError as e:
        errors.append(f"Response is not valid JSON: {e}")
        return False, errors

    # The field checks below index into data, so it must be a JSON object
    if not isinstance(data, dict):
        errors.append(f"Response is not a JSON object: {type(data).__name__}")
        return False, errors

    # Check required fields
    has_required, missing = check_required_fields(data, required_fields)
    if not has_required:
        errors.append(f"Missing required fields: {', '.join(missing)}")

    # Check field types
    if field_types:
        for field, expected_type in field_types.items():
            if field in data and not check_field_type(data[field], expected_type):
                actual_type = type(data[field]).__name__
                expected_name = expected_type.__name__ if isinstance(expected_type, type) else str(expected_type)
                errors.append(f"Field '{field}' has the wrong type: expected {expected_name}, got {actual_type}")

    # Check enum values
    if enum_values:
        for field, allowed in enum_values.items():
            if field in data and not check_enum_value(data[field], allowed):
                errors.append(f"Field '{field}' has an illegal value: {data[field]}, allowed values: {allowed}")

    return len(errors) == 0, errors
1159
scripts/detect/detect_anthropic.py
Normal file
File diff suppressed because it is too large
1199
scripts/detect/detect_openai.py
Executable file
File diff suppressed because it is too large