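fix: Refine configuration, fix tests, and add Chinese font support to temp_pdf

- Refine config.py: pin a version for every dependency and add a Darwin-x86_64 configuration for every file type
- Update run_tests.py: add platform-specific TEST_FIXTURE_DEPENDENCIES and simplify the cli and all test logic
- Fix Chinese font support for temp_pdf in tests/conftest.py by using macOS system fonts
- Update tests/test_core/test_advice_generator.py to match the new default configuration, which requires Python 3.12
- Update the related openspec specification documents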
@@ -13,6 +13,7 @@ context: |
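 - Code: module files should be 150-300 lines; errors need custom exceptions, clear messages, and location context
 - Project stage: not yet released, no users; breaking changes need no migration notes
 - Git commits: Chinese only; format is "type: short description", where type is one of feat (new feature) / fix (bug fix) / refactor / docs / style (formatting) / test / chore (build/tooling); for multi-line messages, add the details after a blank line
+- Questions: when asking the user a question, prefer the question tool over text options
 # Project overview
 - Goal: a unified document parsing tool that converts DOCX/XLSX/PPTX/PDF/HTML/URL to Markdown, intended for use as an AI skill
 # Project directory structure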
@@ -22,12 +22,42 @@
 - Python 3.12 MUST be used
 - `docling-parse` 5.x has no x86_64 wheel, so 4.0.0 MUST be used
 - Provide a complete `uv run --python 3.12 --with "docling==2.40.0" --with "docling-parse==4.0.0" --with "numpy<2" ...` command example
+- unstructured is unavailable on the Darwin-x86_64 platform and has been removed from its configuration

 #### Scenario: Run commands for each platform
 - **WHEN** the user reads SKILL.md
 - **THEN** the system MUST provide clear `uv run --with` command examples for every platform (Windows/macOS Intel/macOS ARM/Linux) and every document format
 - **AND** the commands MUST include all required dependency packages

+### Requirement: Dependency configuration structure
+The DEPENDENCIES configuration in config.py uses a dictionary structure, kept simple and direct so that it can be fine-tuned per platform.
+
+#### Scenario: Configuration data format is unchanged
+- **WHEN** code accesses config.DEPENDENCIES["pdf"]["default"]
+- **THEN** the returned data structure is unchanged
+- **AND** it contains the "python" and "dependencies" fields
+
+#### Scenario: Every file type has a Darwin-x86_64 configuration
+- **WHEN** inspecting config.DEPENDENCIES
+- **THEN** pdf/docx/xlsx/pptx/xls/ppt each have a "Darwin-x86_64" platform configuration
+- **AND** the Darwin-x86_64 configurations contain no unstructured-related dependencies
+
+### Requirement: Dependency version management
+All dependencies MUST specify a version: the default platform uses the latest versions as of 2026-03-17, and the Darwin-x86_64 platform uses versions verified to work.
+
+#### Scenario: The default platform uses the latest versions
+- **WHEN** inspecting the dependencies of the default configurations in config.DEPENDENCIES
+- **THEN** every dependency has an explicit version
+- **AND** docling uses 2.80.0
+- **AND** docling-parse uses 5.5.0
+- **AND** markitdown uses 0.1.5
+
+#### Scenario: The Darwin-x86_64 platform uses verified versions
+- **WHEN** inspecting the dependencies of the Darwin-x86_64 configurations in config.DEPENDENCIES
+- **THEN** docling uses 2.40.0
+- **AND** docling-parse uses 4.0.0
+- **AND** numpy uses <2
+
 ### Requirement: Platform detection documentation
 The system MUST provide a platform detection method and platform-specific installation guides in `SKILL.md`.

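
The lookup rule implied by the scenarios above (use the platform-specific entry when one exists, otherwise fall back to "default") can be sketched as follows. This is a minimal illustration rather than the project's code: `SAMPLE_DEPENDENCIES`, `current_platform_id`, and `resolve` are hypothetical names, and building the identifier from `platform.system()` and `platform.machine()` is an assumption suggested by the "Darwin-x86_64" key.

```python
import platform

# Trimmed, illustrative copy of the structure described in the spec.
SAMPLE_DEPENDENCIES = {
    "pdf": {
        "default": {"python": "3.12", "dependencies": ["docling==2.80.0"]},
        "Darwin-x86_64": {
            "python": "3.12",
            "dependencies": ["docling==2.40.0", "docling-parse==4.0.0", "numpy<2"],
        },
    },
}


def current_platform_id() -> str:
    """Assumed identifier format, e.g. 'Darwin-x86_64' or 'Linux-x86_64'."""
    return f"{platform.system()}-{platform.machine()}"


def resolve(file_type: str, platform_id: str, table=SAMPLE_DEPENDENCIES):
    """Return (python_version, dependencies), falling back to 'default'."""
    type_config = table.get(file_type, {})
    cfg = type_config.get(platform_id, type_config.get("default"))
    if cfg is None:
        return None, []
    return cfg.get("python"), list(cfg.get("dependencies", []))
```

With this shape, an unknown platform such as "Linux-x86_64" resolves to the default entry, while "Darwin-x86_64" gets the older, verified pins.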
@@ -6,6 +6,23 @@

 ## Requirements

+### Requirement: The test runner includes fixture dependencies
+run_tests.py MUST define a TEST_FIXTURE_DEPENDENCIES constant containing every dependency needed to create temporary test files.
+
+#### Scenario: TEST_FIXTURE_DEPENDENCIES is defined
+- **WHEN** inspecting run_tests.py
+- **THEN** a TEST_FIXTURE_DEPENDENCIES constant exists
+- **AND** it includes python-docx (for creating temporary DOCX files)
+- **AND** it includes reportlab (for creating temporary PDF files)
+- **AND** it includes pandas (for creating temporary XLSX files)
+- **AND** it includes openpyxl (required by pandas to write XLSX)
+- **AND** it includes python-pptx (for creating temporary PPTX files)
+
+#### Scenario: Fixture dependencies are merged with file-type dependencies
+- **WHEN** running any type of test
+- **THEN** the dependencies in TEST_FIXTURE_DEPENDENCIES are automatically merged into the uv run --with arguments
+- **AND** duplicates are removed so that nothing is added twice
+
 ### Requirement: Temporary files are cleaned up automatically
 Temporary files used by tests MUST be cleaned up automatically after the tests finish, using pytest's tmp_path fixture.

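
The merge-and-deduplicate scenario above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the runner itself: `build_with_args` is a hypothetical helper, and the sample lists are shortened.

```python
def build_with_args(type_deps, fixture_deps):
    """Merge two dependency lists, drop exact duplicates, and emit --with pairs."""
    merged = sorted(set(type_deps) | set(fixture_deps))
    args = []
    for dep in merged:
        args.extend(["--with", dep])
    return args


args = build_with_args(
    ["python-docx==1.2.0", "markdownify==0.13.1"],
    ["python-docx==1.2.0", "reportlab==4.2.2"],
)
print(args)
```

Deduplication by set membership only removes identical strings. Two different specifiers for the same package, for example `pandas==3.0.1` and `pandas<3.0.0`, would both survive and be left for the resolver to reject.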
@@ -12,21 +12,26 @@
 #### Scenario: Run PDF tests
 - **WHEN** the user runs `python run_tests.py pdf`
 - **THEN** the dependencies in config.DEPENDENCIES["pdf"] are loaded automatically
+- **AND** the dependencies required by the test fixtures are loaded automatically
 - **AND** the tests under tests/test_readers/test_pdf/ are run

 #### Scenario: Run DOCX tests
 - **WHEN** the user runs `python run_tests.py docx`
 - **THEN** the dependencies in config.DEPENDENCIES["docx"] are loaded automatically
+- **AND** the dependencies required by the test fixtures are loaded automatically
 - **AND** the tests under tests/test_readers/test_docx/ are run

 #### Scenario: Run CLI tests (no special dependencies)
 - **WHEN** the user runs `python run_tests.py cli`
-- **THEN** only the pytest dependency is loaded
+- **THEN** the pytest dependency is loaded
+- **AND** the dependencies required by the test fixtures are loaded automatically
+- **AND** the dependencies of all types in config.DEPENDENCIES are loaded (deduplicated)
 - **AND** the tests under tests/test_cli/ are run

 #### Scenario: Run all tests
 - **WHEN** the user runs `python run_tests.py all`
 - **THEN** the dependencies of all types in config.DEPENDENCIES are loaded (deduplicated)
+- **AND** the dependencies required by the test fixtures are loaded automatically
 - **AND** all tests under tests/ are run

 ### Requirement: The test runner supports passing arguments through to pytest
run_tests.py
@@ -23,6 +23,24 @@ os.environ["HF_HUB_DISABLE_PROGRESS_BARS"] = "1"
 os.environ["HF_HUB_DISABLE_TELEMETRY"] = "1"
 os.environ["TQDM_DISABLE"] = "1"

+# Dependencies needed by the test fixtures (used to create temporary test files)
+TEST_FIXTURE_DEPENDENCIES = {
+    "default": [
+        "python-docx==1.2.0",  # creates temporary DOCX files
+        "reportlab==4.2.2",  # creates temporary PDF files
+        "pandas==3.0.1",  # creates temporary XLSX files
+        "openpyxl==3.1.5",  # required by pandas to write XLSX
+        "python-pptx==1.0.2",  # creates temporary PPTX files
+    ],
+    "Darwin-x86_64": [
+        "python-docx==1.2.0",  # creates temporary DOCX files
+        "reportlab==4.2.2",  # creates temporary PDF files
+        "pandas<3.0.0",  # creates temporary XLSX files (compatible with Darwin-x86_64)
+        "openpyxl==3.1.5",  # required by pandas to write XLSX
+        "python-pptx==1.0.2",  # creates temporary PPTX files
+    ],
+}
+
 # Test type mapping
 _TEST_TYPES = {
     # File-type tests (with dependency configuration)
@@ -34,8 +52,8 @@ _TEST_TYPES = {
     "xls": {"key": "xls", "path": "tests/test_readers/test_xls/"},
     "doc": {"key": "doc", "path": "tests/test_readers/test_doc/"},
     "ppt": {"key": "ppt", "path": "tests/test_readers/test_ppt/"},
-    # Core tests (no special dependencies)
-    "cli": {"key": None, "path": "tests/test_cli/"},
+    # Core tests (cli tests need all dependencies because they cover multiple formats)
+    "cli": {"key": "all", "path": "tests/test_cli/"},
     "core": {"key": None, "path": "tests/test_core/"},
     "utils": {"key": None, "path": "tests/test_utils/"},
     # All tests (all dependencies merged)
@@ -43,9 +61,40 @@ _TEST_TYPES = {
 }


+def _collect_all_dependencies(platform_id: str):
+    """
+    Collect the dependencies of all file types and deduplicate them (internal helper).
+
+    Args:
+        platform_id: platform identifier
+
+    Returns:
+        (python_version, dependencies) tuple
+    """
+    from config import DEPENDENCIES
+
+    python_version = None
+    all_deps = set()
+    for type_key, type_config in DEPENDENCIES.items():
+        # Try the platform-specific configuration first
+        if platform_id in type_config:
+            cfg = type_config[platform_id]
+        elif "default" in type_config:
+            cfg = type_config["default"]
+        else:
+            continue
+        # Record the Python version (the first configuration that specifies one wins)
+        if cfg.get("python") and not python_version:
+            python_version = cfg["python"]
+        # Collect dependencies
+        for dep in cfg.get("dependencies", []):
+            all_deps.add(dep)
+    return python_version, list(all_deps)
+
+
 def get_dependencies_for_type(test_type: str, platform_id: str):
     """
-    Get the dependency configuration for the given test type.
+    Get the dependency configuration for the given test type (taken entirely from config.py).

     Args:
         test_type: test type (pdf/docx/.../all)
@@ -63,30 +112,14 @@ def get_dependencies_for_type(test_type: str, platform_id: str):
     key = config["key"]

     if key is None:
-        # Test types with no special dependencies (cli/core/utils)
+        # core/utils tests need no special dependencies
         return None, []

     if key == "all":
-        # Collect and deduplicate the dependencies of all types
-        python_version = None
-        all_deps = set()
-        for type_key, type_config in DEPENDENCIES.items():
-            # Try the platform-specific configuration first
-            if platform_id in type_config:
-                cfg = type_config[platform_id]
-            elif "default" in type_config:
-                cfg = type_config["default"]
-            else:
-                continue
-            # Record the Python version (prefer configurations with a specific requirement)
-            if cfg.get("python"):
-                python_version = cfg["python"]
-            # Collect dependencies
-            for dep in cfg.get("dependencies", []):
-                all_deps.add(dep)
-        return python_version, list(all_deps)
+        # Both cli and all use the collect-all-dependencies logic
+        return _collect_all_dependencies(platform_id)

-    # Dependencies for a single type
+    # Dependencies for a single type, taken entirely from config.py
     if key not in DEPENDENCIES:
         return None, []

@@ -101,11 +134,30 @@ def get_dependencies_for_type(test_type: str, platform_id: str):
     return cfg.get("python"), cfg.get("dependencies", [])


+def get_fixture_dependencies(platform_id: str):
+    """
+    Get the fixture dependencies for the given platform.
+
+    Args:
+        platform_id: platform identifier
+
+    Returns:
+        list: list of fixture dependencies
+    """
+    if platform_id in TEST_FIXTURE_DEPENDENCIES:
+        return TEST_FIXTURE_DEPENDENCIES[platform_id]
+    elif "default" in TEST_FIXTURE_DEPENDENCIES:
+        return TEST_FIXTURE_DEPENDENCIES["default"]
+    else:
+        return []
+
+
 def generate_uv_args(
     dependencies: list,
     test_path: str,
     pytest_args: list,
     python_version: str = None,
+    platform_id: str = None,
 ):
     """
     Generate the uv run command argument list (for subprocess.run).
@@ -115,6 +167,7 @@ def generate_uv_args(
         test_path: test path
         pytest_args: arguments passed through to pytest
         python_version: required Python version; None means unspecified
+        platform_id: platform identifier, used to select the fixture dependencies

     Returns:
         uv run command argument list
@@ -127,8 +180,18 @@ def generate_uv_args(
     # Add pytest
     args.extend(["--with", "pytest"])

-    # Add the other dependencies
+    # Get the fixture dependencies for the current platform
+    fixture_deps = get_fixture_dependencies(platform_id) if platform_id else []
+
+    # Merge the file-type dependencies with the fixture dependencies, deduplicated
+    all_deps = set()
     for dep in dependencies:
+        all_deps.add(dep)
+    for dep in fixture_deps:
+        all_deps.add(dep)
+
+    # Add all dependencies
+    for dep in sorted(all_deps):
         args.extend(["--with", dep])

     # Add the pytest command
@@ -205,6 +268,7 @@ def main():
         test_path=test_path,
         pytest_args=pytest_args,
         python_version=python_version,
+        platform_id=platform_id,
     )

     # Set environment variables
@@ -24,13 +24,13 @@ class Config:
 DEPENDENCIES = {
     "pdf": {
         "default": {
-            "python": None,
+            "python": "3.12",
             "dependencies": [
-                "docling",
+                "docling==2.80.0",
                 "unstructured[pdf]",
-                "markitdown[pdf]",
-                "pypdf",
-                "markdownify"
+                "markitdown[pdf]==0.1.5",
+                "pypdf==6.9.0",
+                "markdownify==0.13.1"
             ]
         },
         "Darwin-x86_64": {
@@ -39,94 +39,22 @@ DEPENDENCIES = {
                 "docling==2.40.0",
                 "docling-parse==4.0.0",
                 "numpy<2",
-                "markitdown[pdf]",
-                "pypdf",
-                "markdownify"
+                "markitdown[pdf]==0.1.5",
+                "pypdf==6.9.0",
+                "markdownify==0.13.1"
             ]
         }
     },
     "docx": {
         "default": {
-            "python": None,
+            "python": "3.12",
             "dependencies": [
-                "docling",
+                "docling==2.80.0",
                 "unstructured[docx]",
-                "markitdown[docx]",
-                "pypandoc-binary",
-                "python-docx",
-                "markdownify"
-            ]
-        }
-    },
-    "xlsx": {
-        "default": {
-            "python": None,
-            "dependencies": [
-                "docling",
-                "unstructured[xlsx]",
-                "markitdown[xlsx]",
-                "pandas",
-                "tabulate"
-            ]
-        }
-    },
-    "pptx": {
-        "default": {
-            "python": None,
-            "dependencies": [
-                "docling",
-                "unstructured[pptx]",
-                "markitdown[pptx]",
-                "python-pptx",
-                "markdownify"
-            ]
-        }
-    },
-    "html": {
-        "default": {
-            "python": None,
-            "dependencies": [
-                "trafilatura",
-                "domscribe",
-                "markitdown",
-                "html2text",
-                "beautifulsoup4",
-                "httpx",
-                "chardet",
-                "pyppeteer",
-                "selenium"
-            ]
-        }
-    },
-    "xls": {
-        "default": {
-            "python": None,
-            "dependencies": [
-                "unstructured[xlsx]",
-                "markitdown[xls]",
-                "pandas",
-                "tabulate",
-                "xlrd",
-                "olefile"
-            ]
-        }
-    },
-    "doc": {
-        "default": {
-            "python": None,
-            "dependencies": []
-        }
-    },
-    "ppt": {
-        "default": {
-            "python": None,
-            "dependencies": [
-                "docling",
-                "unstructured[pptx]",
-                "markitdown[pptx]",
-                "python-pptx",
-                "markdownify",
-                "olefile"
+                "markitdown[docx]==0.1.5",
+                "pypandoc-binary==1.13",
+                "python-docx==1.2.0",
+                "markdownify==0.13.1"
             ]
         },
         "Darwin-x86_64": {
@@ -135,10 +63,129 @@ DEPENDENCIES = {
                 "docling==2.40.0",
                 "docling-parse==4.0.0",
                 "numpy<2",
-                "markitdown[pptx]",
-                "python-pptx",
-                "markdownify",
-                "olefile"
+                "markitdown[docx]==0.1.5",
+                "pypandoc-binary==1.13",
+                "python-docx==1.2.0",
+                "markdownify==0.13.1"
+            ]
+        }
+    },
+    "xlsx": {
+        "default": {
+            "python": "3.12",
+            "dependencies": [
+                "docling==2.80.0",
+                "unstructured[xlsx]",
+                "markitdown[xlsx]==0.1.5",
+                "pandas==3.0.1",
+                "tabulate==0.9.0",
+                "openpyxl==3.1.5"
+            ]
+        },
+        "Darwin-x86_64": {
+            "python": "3.12",
+            "dependencies": [
+                "docling==2.40.0",
+                "docling-parse==4.0.0",
+                "numpy<2",
+                "markitdown[xlsx]==0.1.5",
+                "pandas<3.0.0",
+                "tabulate==0.9.0",
+                "openpyxl==3.1.5"
+            ]
+        }
+    },
+    "pptx": {
+        "default": {
+            "python": "3.12",
+            "dependencies": [
+                "docling==2.80.0",
+                "unstructured[pptx]",
+                "markitdown[pptx]==0.1.5",
+                "python-pptx==1.0.2",
+                "markdownify==0.13.1"
+            ]
+        },
+        "Darwin-x86_64": {
+            "python": "3.12",
+            "dependencies": [
+                "docling==2.40.0",
+                "docling-parse==4.0.0",
+                "numpy<2",
+                "markitdown[pptx]==0.1.5",
+                "python-pptx==1.0.2",
+                "markdownify==0.13.1"
+            ]
+        }
+    },
+    "html": {
+        "default": {
+            "python": "3.12",
+            "dependencies": [
+                "trafilatura==1.12.2",
+                "domscribe",
+                "markitdown==0.1.5",
+                "html2text==2024.2.26",
+                "beautifulsoup4==4.14.3",
+                "httpx==0.28.1",
+                "chardet==5.2.0",
+                "pyppeteer==2.0.0",
+                "selenium==4.25.0"
+            ]
+        }
+    },
+    "xls": {
+        "default": {
+            "python": "3.12",
+            "dependencies": [
+                "unstructured[xlsx]",
+                "markitdown[xls]==0.1.5",
+                "pandas==3.0.1",
+                "tabulate==0.9.0",
+                "xlrd==2.0.1",
+                "olefile==0.47"
+            ]
+        },
+        "Darwin-x86_64": {
+            "python": "3.12",
+            "dependencies": [
+                "markitdown[xls]==0.1.5",
+                "pandas<3.0.0",
+                "tabulate==0.9.0",
+                "xlrd==2.0.1",
+                "olefile==0.47",
+                "openpyxl==3.1.5"
+            ]
+        }
+    },
+    "doc": {
+        "default": {
+            "python": "3.12",
+            "dependencies": []
+        }
+    },
+    "ppt": {
+        "default": {
+            "python": "3.12",
+            "dependencies": [
+                "docling==2.80.0",
+                "unstructured[pptx]",
+                "markitdown[pptx]==0.1.5",
+                "python-pptx==1.0.2",
+                "markdownify==0.13.1",
+                "olefile==0.47"
+            ]
+        },
+        "Darwin-x86_64": {
+            "python": "3.12",
+            "dependencies": [
+                "docling==2.40.0",
+                "docling-parse==4.0.0",
+                "numpy<2",
+                "markitdown[pptx]==0.1.5",
+                "python-pptx==1.0.2",
+                "markdownify==0.13.1",
+                "olefile==0.47"
             ]
         }
     }
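
The spec updated in this commit requires every dependency to carry a version, yet a few entries in the configuration above, such as `unstructured[pdf]` and `domscribe`, have none. A small check along the following lines could surface them. It is only a sketch: `find_unpinned` is a hypothetical helper, the sample table is a shortened excerpt, and "has any version specifier" is treated as good enough, which is an assumption about what the spec intends.

```python
import re

# Matches any version specifier operator (==, ~=, !=, <=, >=, <, >).
_HAS_SPECIFIER = re.compile(r"(==|~=|!=|<=|>=|<|>)")


def find_unpinned(dependencies_table):
    """Return sorted (file_type, platform, requirement) triples that lack a version specifier."""
    unpinned = []
    for file_type, platforms in dependencies_table.items():
        for platform_id, cfg in platforms.items():
            for dep in cfg.get("dependencies", []):
                if not _HAS_SPECIFIER.search(dep):
                    unpinned.append((file_type, platform_id, dep))
    return sorted(unpinned)


sample = {
    "html": {"default": {"python": "3.12", "dependencies": ["trafilatura==1.12.2", "domscribe"]}},
    "pdf": {"default": {"python": "3.12", "dependencies": ["docling==2.80.0", "unstructured[pdf]"]}},
}
print(find_unpinned(sample))
```

Run against the real table, a check like this could back the "every dependency has an explicit version" scenario with a test instead of relying on review.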
@@ -105,11 +105,29 @@ def temp_pdf(tmp_path):
     c = canvas.Canvas(str(file_path), pagesize=letter)

     # Try to register a Chinese font (if available)
+    font_loaded = False
     try:
-        # Use a system font
-        pdfmetrics.registerFont(TTFont('SimSun', 'simsun.ttc'))
-        c.setFont('SimSun', 12)
+        # Try the macOS Chinese fonts
+        for font_name, font_path, font_index in [
+            ('PingFangSC', '/System/Library/Fonts/PingFang.ttc', 0),
+            ('STHeiti', '/System/Library/Fonts/STHeiti Light.ttc', 0),
+            ('STHeitiMedium', '/System/Library/Fonts/STHeiti Medium.ttc', 0),
+        ]:
+            try:
+                from reportlab.pdfbase.ttfonts import TTFont
+                import os
+                if os.path.exists(font_path):
+                    # For TTC files, we need to specify the font index
+                    pdfmetrics.registerFont(TTFont(font_name, font_path, subfontIndex=font_index))
+                    c.setFont(font_name, 12)
+                    font_loaded = True
+                    break
+            except Exception as e:
+                continue
     except Exception:
+        pass
+
+    if not font_loaded:
         # Fall back to the default font
         c.setFont('Helvetica', 12)

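
The font probing in the fixture above could be factored into a small helper that is testable without reportlab or a Mac. The sketch below is one possible shape, not the committed code: `MACOS_CJK_FONTS` and `first_available_font` are hypothetical names, the candidate paths are copied from the diff, and the injectable `exists` parameter is added purely for testing.

```python
import os

# Candidate (name, path, subfont index) triples, copied from the diff above.
MACOS_CJK_FONTS = [
    ("PingFangSC", "/System/Library/Fonts/PingFang.ttc", 0),
    ("STHeiti", "/System/Library/Fonts/STHeiti Light.ttc", 0),
    ("STHeitiMedium", "/System/Library/Fonts/STHeiti Medium.ttc", 0),
]


def first_available_font(candidates=MACOS_CJK_FONTS, exists=os.path.exists):
    """Return the first candidate whose font file exists, or None if none do."""
    for name, path, index in candidates:
        if exists(path):
            return name, path, index
    return None
```

A fixture could then register the returned triple with `TTFont(name, path, subfontIndex=index)` as the diff does, and fall back to Helvetica when the result is `None`. Note that this helper stops at the first file that exists, whereas the committed loop moves on to the next candidate when registration fails. On platforms where none of these paths exist the fallback is always taken, and Helvetica has no CJK glyphs.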
@@ -68,21 +68,24 @@ class TestGetDependencies:
     def test_get_default_dependencies(self):
         """Test getting the default dependency configuration."""
         python_ver, deps = get_dependencies(DocxReader, "Unknown-Platform")
-        assert python_ver is None
+        assert python_ver == "3.12"
         assert len(deps) > 0
-        assert "docling" in deps
+        # Check for a docling-related dependency (it may carry a version)
+        assert any(dep.startswith("docling") for dep in deps)

     def test_get_pdf_dependencies(self):
         """Test getting the PDF dependencies."""
         python_ver, deps = get_dependencies(PdfReader, "Darwin-arm64")
-        assert python_ver is None
-        assert "docling" in deps
+        assert python_ver == "3.12"
+        # Check for a docling-related dependency (it may carry a version)
+        assert any(dep.startswith("docling") for dep in deps)

     def test_get_html_dependencies(self):
         """Test getting the HTML dependencies."""
         python_ver, deps = get_dependencies(HtmlReader, "Linux-x86_64")
-        assert python_ver is None
-        assert "trafilatura" in deps
+        assert python_ver == "3.12"
+        # Check for a trafilatura-related dependency (it may carry a version)
+        assert any(dep.startswith("trafilatura") for dep in deps)


 class TestGenerateUvCommand:
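
The updated assertions use `dep.startswith("docling")`, which also matches `docling-parse==4.0.0`. If a test ever needs to assert on the exact package, comparing parsed package names is stricter. The following is a minimal sketch: `package_name` is a hypothetical helper, and its regular expression only handles the simple `name[extras]<op>version` strings that appear in this configuration.

```python
import re

# Leading run of name characters; stops at extras "[", a specifier, or whitespace.
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def package_name(requirement: str) -> str:
    """Extract the bare, lower-cased package name from a requirement string."""
    match = _NAME.match(requirement.strip())
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    return match.group(0).lower()


deps = ["docling-parse==4.0.0", "markitdown[pdf]==0.1.5", "numpy<2"]
names = {package_name(dep) for dep in deps}

# The prefix check passes even though "docling" itself is absent.
prefix_match = any(dep.startswith("docling") for dep in deps)
exact_match = "docling" in names
```

For full requirement syntax, `packaging.requirements.Requirement(dep).name` is the more robust choice, at the cost of adding the `packaging` dependency to the test environment.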