Examples

Extract the full document content

# DOCX
uv run --with "markitdown[docx]" scripts/parser.py /path/to/report.docx

# PPTX
uv run --with "markitdown[pptx]" scripts/parser.py /path/to/slides.pptx

# XLSX
uv run --with "markitdown[xlsx]" scripts/parser.py /path/to/data.xlsx

# PDF
uv run --with "markitdown[pdf]" --with pypdf scripts/parser.py /path/to/doc.pdf
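The four commands above differ only in the markitdown extra, which matches the file extension, plus an additional `--with pypdf` for PDF. A minimal sketch of a wrapper that picks the extra automatically could look like the following. The function names `markitdown_extra` and `parse_doc` are hypothetical helpers introduced for illustration; they are not part of scripts/parser.py.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of scripts/parser.py):
# map a file's extension to the markitdown extra used in the commands above.
markitdown_extra() {
    name=${1##*/}                      # strip any leading directories
    case "$name" in
        *.*) ext=${name##*.} ;;        # text after the last dot
        *)   ext= ;;                   # no extension at all
    esac
    # Lowercase the extension so REPORT.DOCX and report.docx are treated alike.
    ext=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        docx|pptx|xlsx|pdf) printf 'markitdown[%s]\n' "$ext" ;;
        *) echo "unsupported file type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: run the parser with the extra chosen for the file.
# PDF also gets --with pypdf, mirroring the PDF command above.
parse_doc() {
    file=$1
    extra=$(markitdown_extra "$file") || return 1
    case "$extra" in
        'markitdown[pdf]')
            uv run --with "$extra" --with pypdf scripts/parser.py "$file" ;;
        *)
            uv run --with "$extra" scripts/parser.py "$file" ;;
    esac
}
```

With these functions sourced into the current shell, `parse_doc /path/to/report.docx` would run the same command as the DOCX example, and an unrecognized extension fails with a message instead of invoking uv.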

Get the document word count

uv run --with "markitdown[docx]" scripts/parser.py -c /path/to/report.docx

Extract all headings

uv run --with "markitdown[docx]" scripts/parser.py -t /path/to/report.docx

Extract a specific section

uv run --with "markitdown[docx]" scripts/parser.py -tc "Chapter 1" /path/to/report.docx

Search for a keyword

uv run --with "markitdown[docx]" scripts/parser.py -s "keyword" -n 3 /path/to/report.docx

High-accuracy PDF parsing with OCR

uv run --with docling --with pypdf scripts/parser.py /path/to/scanned.pdf --high-res

Fall back to direct Python execution

Use this only when the lyxy-runner-python skill is not available:

python3 scripts/parser.py /path/to/file.docx
