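feat: add multi-platform dependency support

Provide platform-specific dependency extras to resolve dependency compatibility problems on macOS x86_64.

- Add platform-specific PDF parsing extras: pdf-win, pdf-macos-intel, pdf-macos-arm, pdf-linux
- Add platform-specific Office document extras: office-win, office-macos-intel, office-macos-arm, office-linux
- Pin versions on macOS x86_64: docling==2.40.0, docling-parse==4.0.0
- Remove the generic pdf and office extras so that users must choose a platform
- Update SKILL.md with a detailed multi-platform dependency installation guide
- Update README.md with platform-specific installation instructions
- Add uv.lock to .gitignore
- Delete the existing uv.lock file
- Add the multi-platform-dependencies spec document
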
SKILL.md
@@ -5,7 +5,7 @@ license: MIT
metadata:
  version: "1.0"
  author: lyxy
-compatibility: Requires Python 3.11+. Prefer running via the lyxy-runner-python skill (manages dependencies automatically). When falling back to the host Python, install dependencies manually: DOCX(docling unstructured markitdown pypandoc-binary python-docx markdownify chardet) / XLSX(docling unstructured markitdown pandas tabulate chardet) / PPTX(docling unstructured markitdown python-pptx markdownify chardet) / PDF(docling unstructured unstructured-paddleocr markitdown pypdf markdownify chardet) / HTML(trafilatura domscribe markitdown html2text beautifulsoup4 httpx chardet) / HTTP enhancements(pyppeteer selenium)
+compatibility: Requires Python 3.11+. Prefer running via the lyxy-runner-python skill (manages dependencies automatically). When falling back to the host Python, install dependencies manually for your platform: Windows(pdf-win/office-win) / macOS Intel(pdf-macos-intel/office-macos-intel, requires Python 3.12) / macOS ARM(pdf-macos-arm/office-macos-arm) / Linux(pdf-linux/office-linux). See the "Multi-Platform Dependency Installation Guide" section for details.
---

# Unified Document Parsing Skill

@@ -117,6 +117,87 @@ python scripts/lyxy_document_reader.py document.docx -s "\d{4}-\d{2}-\d{2}"
python scripts/lyxy_document_reader.py document.docx -s "keyword" -n 5
```

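### Multi-Platform Dependency Installation Guide

**Important**: This project provides a separate dependency configuration for each platform. Choose the extra that matches your platform.

#### Platform Detection

Before installing, check which platform you are on:
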
```bash
# macOS / Linux
uname -m   # architecture: x86_64 or arm64 (Linux on ARM reports aarch64)
uname -s   # operating system: Darwin or Linux

# Windows PowerShell
$env:OS                       # prints Windows_NT on Windows
$env:PROCESSOR_ARCHITECTURE   # architecture, e.g. AMD64 or ARM64

# Cross-platform detection with Python
python -c "import platform; print(f'{platform.system()}-{platform.machine()}')"
```

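The detection step can be turned into an extra name programmatically. The following is a minimal sketch, not part of the project: `extra_suffix` and `extra_name` are hypothetical helpers, and the mapping assumes that every Linux architecture uses the single `linux` extra and that Windows on ARM is unsupported, since this guide only lists Windows x86_64.

```python
import platform

# Hypothetical mapping from (system, machine) to the extra suffixes named in
# this guide. Linux is handled separately because the guide lists one Linux
# extra without an architecture.
_SUFFIXES = {
    ("Windows", "x86_64"): "win",
    ("Darwin", "x86_64"): "macos-intel",
    ("Darwin", "arm64"): "macos-arm",
}


def extra_suffix(system=None, machine=None):
    """Return the platform suffix (e.g. 'macos-intel') for the given platform.

    With no arguments, the current interpreter's platform is used.
    """
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()
    # Windows reports x86_64 machines as AMD64.
    if machine == "amd64":
        machine = "x86_64"
    if system == "Linux":
        # Assumption: one extra covers all Linux architectures.
        return "linux"
    try:
        return _SUFFIXES[(system, machine)]
    except KeyError:
        raise ValueError(f"unsupported platform: {system}-{machine}") from None


def extra_name(kind, system=None, machine=None):
    """Return the full extra name, e.g. 'pdf-macos-intel' or 'office-win'."""
    if kind not in ("pdf", "office"):
        raise ValueError(f"unknown extra kind: {kind}")
    return f"{kind}-{extra_suffix(system, machine)}"
```

Calling `extra_name("pdf")` with no platform arguments resolves the extra for the machine the interpreter is running on; passing `system` and `machine` explicitly is useful for checking the mapping.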
#### PDF Parsing Dependencies

Choose the install command for your platform:

**Windows x86_64**

```bash
uv run --with "lyxy-document[pdf-win]" scripts/lyxy_document_reader.py file.pdf
```

- Dependencies: docling, unstructured, PaddleOCR
- Python: >=3.11
- Notes: none

**macOS x86_64 (Intel)**

⚠️ **Special platform**: requires pinned versions

```bash
uv run --python 3.12 --with "lyxy-document[pdf-macos-intel]" scripts/lyxy_document_reader.py file.pdf
```

- Dependencies: docling==2.40.0, docling-parse==4.0.0, numpy<2
- Python: **must be 3.12**
- Notes:
  - `docling-parse` 5.x has no x86_64 wheel, so 4.0.0 is required
  - `easyocr` (docling's OCR backend) is incompatible with NumPy 2.x

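On macOS x86_64 it can help to confirm that an environment actually matches the pins listed above. The following is a minimal sketch under stated assumptions: `find_pin_problems` and `installed_versions` are hypothetical helpers, not part of the project, and the pin values are copied from this guide.

```python
import sys
from importlib import metadata

# Exact pins listed in this guide for macOS x86_64.
EXACT_PINS = {"docling": "2.40.0", "docling-parse": "4.0.0"}


def installed_versions(names):
    """Look up installed distribution versions; missing packages map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_pin_problems(python_version, installed):
    """Return a list of mismatches; an empty list means the pins are satisfied.

    python_version: a (major, minor, ...) tuple such as sys.version_info.
    installed: mapping of distribution name to version string or None.
    """
    problems = []
    if tuple(python_version[:2]) != (3, 12):
        problems.append("Python must be 3.12")
    for name, wanted in EXACT_PINS.items():
        found = installed.get(name)
        if found != wanted:
            problems.append(f"{name}: expected {wanted}, found {found}")
    numpy_version = installed.get("numpy")
    if numpy_version is None:
        problems.append("numpy is not installed")
    elif int(numpy_version.split(".")[0]) >= 2:
        # The guide requires numpy<2 because of easyocr.
        problems.append(f"numpy: expected <2, found {numpy_version}")
    return problems
```

A possible use inside the target environment is `find_pin_problems(sys.version_info, installed_versions(["docling", "docling-parse", "numpy"]))`, which returns an empty list when everything matches.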
**macOS arm64 (Apple Silicon)**

```bash
uv run --with "lyxy-document[pdf-macos-arm]" scripts/lyxy_document_reader.py file.pdf
```

- Dependencies: docling, unstructured
- Python: >=3.11
- Notes: none

**Linux**

```bash
uv run --with "lyxy-document[pdf-linux]" scripts/lyxy_document_reader.py file.pdf
```

- Dependencies: docling, unstructured
- Python: >=3.11
- Notes: none

#### Office Document Dependencies

**Windows x86_64**

```bash
uv run --with "lyxy-document[office-win]" scripts/lyxy_document_reader.py file.docx
```

**macOS x86_64 (Intel)**

```bash
uv run --python 3.12 --with "lyxy-document[office-macos-intel]" scripts/lyxy_document_reader.py file.docx
```

**macOS arm64 (Apple Silicon)**

```bash
uv run --with "lyxy-document[office-macos-arm]" scripts/lyxy_document_reader.py file.docx
```

**Linux**

```bash
uv run --with "lyxy-document[office-linux]" scripts/lyxy_document_reader.py file.docx
```

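The PDF and Office commands above differ only in the extra name and in the `--python 3.12` pin for macOS Intel. The following sketch assembles the argument list from those two inputs; `uv_command` is a hypothetical helper written for illustration and is not part of the project.

```python
# Platform suffixes used by the extras in this guide.
PLATFORMS = ("win", "macos-intel", "macos-arm", "linux")

# Only macOS Intel needs a specific interpreter according to the guide.
PYTHON_PIN = {"macos-intel": "3.12"}


def uv_command(kind, platform_suffix, document):
    """Build the `uv run` argument list for one document.

    kind: 'pdf' or 'office'.
    platform_suffix: one of PLATFORMS.
    document: path of the file to parse.
    """
    if kind not in ("pdf", "office"):
        raise ValueError(f"unknown extra kind: {kind}")
    if platform_suffix not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform_suffix}")
    cmd = ["uv", "run"]
    pinned = PYTHON_PIN.get(platform_suffix)
    if pinned:
        cmd += ["--python", pinned]
    cmd += [
        "--with",
        f"lyxy-document[{kind}-{platform_suffix}]",
        "scripts/lyxy_document_reader.py",
        document,
    ]
    return cmd
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting of the square brackets in the extra name.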
### Installing Dependencies in the Host Python Environment

When lyxy-runner-python is unavailable, install dependencies manually according to the document type:
