Skill/skills/lyxy-reader-html/references/examples.md

# Examples

## URL input - extract the full document content

```bash
# Using uv (recommended)
uv run --with trafilatura --with domscribe --with markitdown --with html2text --with httpx --with beautifulsoup4 scripts/parser.py https://example.com

# Using Python directly
python scripts/parser.py https://example.com
```

## HTML file input - extract the full document content

```bash
# Using uv (recommended)
uv run --with trafilatura --with domscribe --with markitdown --with html2text --with beautifulsoup4 scripts/parser.py page.html

# Using Python directly
python scripts/parser.py page.html
```
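
When several local HTML files need the same treatment, the single-file command above can be wrapped in a loop. The following is only a sketch: the `convert_dir` helper and the `PARSER` variable are hypothetical names, and it assumes the parser prints the extracted content to stdout, which the original examples do not confirm.

```bash
# Hypothetical batch helper: run the parser on every .html file in a directory
# and write each result next to its source as <name>.md.
# Assumption: scripts/parser.py prints the extracted content to stdout.
convert_dir() {
  local dir="${1:-.}"
  # PARSER is a hypothetical override; the default mirrors the direct Python call.
  local parser="${PARSER:-python scripts/parser.py}"
  local f
  for f in "$dir"/*.html; do
    [ -e "$f" ] || continue          # skip when the glob matches nothing
    # Word splitting of $parser is intentional so that "python scripts/parser.py" works.
    $parser "$f" > "${f%.html}.md"
  done
}
```

For example, `convert_dir ./pages` would process every `.html` file under `./pages`.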

## Get the document word count

```bash
uv run --with trafilatura --with html2text --with beautifulsoup4 scripts/parser.py -c https://example.com
```

## Get the document line count

```bash
uv run --with trafilatura --with html2text --with beautifulsoup4 scripts/parser.py -l https://example.com
```

## Extract all headings

```bash
uv run --with trafilatura --with html2text --with beautifulsoup4 scripts/parser.py -t https://example.com
```

## Extract a specific section

```bash
uv run --with trafilatura --with html2text --with beautifulsoup4 scripts/parser.py -tc "关于我们" https://example.com
```

## Search for a keyword

```bash
uv run --with trafilatura --with html2text --with beautifulsoup4 scripts/parser.py -s "关键词" -n 3 https://example.com
```

## Fall back to direct Python execution

Use this only when the lyxy-runner-python skill is not available:

```bash
python3 scripts/parser.py https://example.com
```
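
The fallback can also be expressed as a small wrapper that chooses the runner at call time. This is a minimal sketch, not part of the skill: `choose_runner` and `run_parser` are hypothetical names, and checking whether `uv` is on `PATH` is an assumed stand-in for detecting whether the lyxy-runner-python skill is available.

```bash
# Hypothetical wrapper: prefer uv when it is installed, otherwise use python3.
# Assumption: the presence of the uv executable is used as a proxy for the
# availability of the lyxy-runner-python skill.
choose_runner() {
  if command -v uv >/dev/null 2>&1; then
    echo "uv run --with trafilatura --with html2text --with beautifulsoup4"
  else
    echo "python3"
  fi
}

run_parser() {
  # The unquoted substitution is intentional: the runner string must be
  # split into separate words before being executed.
  $(choose_runner) scripts/parser.py "$@"
}
```

With the functions loaded, `run_parser -t https://example.com` would list headings with whichever runner is available.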