Optimize paths
@@ -11,23 +11,23 @@
### Using uv (recommended)

```bash
# DOCX - recommended dependencies
uv run --with "markitdown[docx]" skills/lyxy-reader-office/scripts/parser.py /path/to/file.docx
# DOCX - all dependencies
uv run --with docling --with "unstructured[docx]" --with markdownify --with pypandoc-binary --with "markitdown[docx]" --with python-docx scripts/parser.py /path/to/file.docx

# PPTX - recommended dependencies
uv run --with "markitdown[pptx]" skills/lyxy-reader-office/scripts/parser.py /path/to/file.pptx
# PPTX - all dependencies
uv run --with docling --with "unstructured[pptx]" --with markdownify --with "markitdown[pptx]" --with python-pptx scripts/parser.py /path/to/file.pptx

# XLSX - recommended dependencies
uv run --with "markitdown[xlsx]" skills/lyxy-reader-office/scripts/parser.py /path/to/file.xlsx
# XLSX - all dependencies
uv run --with docling --with "unstructured[xlsx]" --with markdownify --with "markitdown[xlsx]" --with pandas --with tabulate scripts/parser.py /path/to/file.xlsx

# PDF - recommended dependencies
uv run --with "markitdown[pdf]" --with pypdf skills/lyxy-reader-office/scripts/parser.py /path/to/file.pdf
# PDF - all dependencies (basic text extraction)
uv run --with docling --with "unstructured[pdf]" --with markdownify --with "markitdown[pdf]" --with pypdf scripts/parser.py /path/to/file.pdf

# PDF OCR high-accuracy mode
uv run --with docling --with pypdf skills/lyxy-reader-office/scripts/parser.py /path/to/file.pdf --high-res
# PDF OCR high-accuracy mode (all dependencies)
uv run --with docling --with "unstructured[pdf]" --with unstructured-paddleocr --with "paddlepaddle==2.6.2" --with ml-dtypes --with markdownify --with "markitdown[pdf]" --with pypdf scripts/parser.py /path/to/file.pdf --high-res
```
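
> **Note**: The commands above use the minimal recommended dependencies. For additional parser dependencies and the full installation commands, see the installation section of `scripts/README.md`.
> **Info**: The commands above install all dependencies, including every parser, for the best compatibility. For detailed parser priorities and a comparison, see `scripts/README.md`.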
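
The per-format commands differ mainly in the `markitdown` extra they request, so the choice can be derived from the file extension. The following is only a minimal sketch of that idea, not part of the repository: the function name `parser_cmd` and the `PARSER_SCRIPT` variable are hypothetical, the script path defaults to the relative `scripts/parser.py` form, and only the minimal recommended dependency sets shown in this diff are used. It prints the command rather than running it, so it can be inspected first.

```bash
# Hypothetical helper (not part of the repository): build the minimal
# "recommended dependencies" uv command for a file, based on its extension.
# PARSER_SCRIPT is an assumed override for the script location.
parser_cmd() {
  local file="$1"
  local script="${PARSER_SCRIPT:-scripts/parser.py}"
  # Take the text after the last dot and lowercase it.
  local ext="${file##*.}"
  ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"

  case "$ext" in
    docx|pptx|xlsx)
      printf 'uv run --with "markitdown[%s]" %s %s\n' "$ext" "$script" "$file"
      ;;
    pdf)
      # The PDF command in this diff also pulls in pypdf.
      printf 'uv run --with "markitdown[pdf]" --with pypdf %s %s\n' "$script" "$file"
      ;;
    *)
      echo "unsupported extension: $ext" >&2
      return 1
      ;;
  esac
}
```

Calling `parser_cmd /path/to/file.docx` prints the DOCX command with the `markitdown[docx]` extra; the full-dependency and `--high-res` variants are deliberately left out of this sketch.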

## Output characteristics by format