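feat: add --advice option for quickly retrieving execution advice

- Add the advice generator module scripts/core/advice_generator.py
- Add the DEPENDENCIES configuration to config.py
- Add the -a/--advice option to lyxy_document_reader.py
- Reuse the supports method of Reader instances to detect the file type
- Support platform detection; return a special command for PDF on macOS x86_64
- Add unit tests and integration tests
- Update SKILL.md to direct readers to the --advice option first
- Update README.md with a project structure description
- Add the spec document openspec/specs/cli-advice/spec.md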
2026-03-09 18:13:00 +08:00
parent 9daff73589
commit aaa1171e60
9 changed files with 757 additions and 103 deletions
--- a/README.md
+++ b/README.md
@@ -6,18 +6,23 @@

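 - Run scripts and tests with uv; do not use the host Python
 - Dependency management: load dependencies on demand with `uv run --with`
-- Dependencies: see the "Dependency installation guide" section of SKILL.md
+- Quick advice: use the `-a/--advice` option to see the command to run, without looking up dependencies by hand

 ## Project structure

 ```
-scripts/          # Core code
-├── core/         # Core modules (parse dispatch, exceptions, Markdown utilities)
-├── readers/      # Format readers
-└── utils/        # Utility functions
-tests/            # Tests
-openspec/         # Spec documents
-skill/            # SKILL documentation
+scripts/                    # Core code
+├── core/                   # Core modules
+│   ├── advice_generator.py # Execution advice generator (new)
+│   ├── parser.py           # Parse dispatch
+│   ├── exceptions.py       # Exception definitions
+│   └── markdown.py         # Markdown utilities
+├── readers/                # Format readers
+├── utils/                  # Utility functions
+└── config.py               # Configuration (including the DEPENDENCIES config)
+tests/                      # Tests
+openspec/                   # Spec documents
+skill/                      # SKILL documentation
 ```

 ## Development workflow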
@@ -160,7 +165,7 @@ uv run \
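  - Encoding tests (GBK, UTF-8 BOM, etc.)
  - Consistency tests (verify that different Readers produce consistent parse results)

-Before running tests, install the dependencies for the test type with `uv run --with`. See the "Development workflow" section above and the "Dependency installation guide" in SKILL.md.
+Before running tests, install the dependencies for the test type with `uv run --with`. See the "Development workflow" section above.


 ## Code conventions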
@@ -196,8 +201,7 @@ skill/SKILL.md 面向 AI 用户，必须遵循 Claude Skill 构建指南的最

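 - Load dependencies on demand with `uv run --with`
 - Always use the exact pip package names
-- Group the instructions by document type
-- See the "Dependency installation guide" section of SKILL.md
+- Use the `-a/--advice` option to quickly get the command for a specific file

 ## Parser architecture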

--- a/SKILL.md
+++ b/SKILL.md
@@ -5,7 +5,7 @@ license: MIT
 metadata:
  version: "1.0"
  author: lyxy
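-compatibility: Requires Python 3.11+. Dependencies are loaded on demand with uv run --with; see the "Dependency installation guide" section.
+compatibility: Requires Python 3.11+. Dependencies are loaded on demand with uv run --with.
 ---

 # Unified document parsing Skill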
@@ -16,7 +16,7 @@ compatibility: Requires Python 3.11+. 使用 uv run --with 方式按需加载依

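 **Single entry point**: use `scripts/lyxy_document_reader.py` as the single command-line entry point; it detects the file type automatically and runs the parse.

-**Dependency management**: load parser dependencies on demand with `uv run --with`. On each run, specify the packages for the document type.
+**Quick advice (always use first)**: use the `-a/--advice` option to get accurate execution advice, including the `uv run --with ...` command and the `python` command, without reading the rest of this document.

 **Supported document types**:
 - **DOCX**: Word documents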
@@ -46,6 +46,7 @@ compatibility: Requires Python 3.11+. 使用 uv run --with 方式按需加载依

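 | Option | Description |
 |------|------|
+| `-a` / `--advice` | Show execution advice only, without parsing the file (always use first) |
 | (no option) | Output the full Markdown content |
 | `-c` / `--count` | Character count |
 | `-l` / `--lines` | Line count |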
@@ -56,41 +57,30 @@ compatibility: Requires Python 3.11+. 使用 uv run --with 方式按需加载依

 ## Workflow

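+0. **Get execution advice (always run first)**:
+   ```bash
+   python scripts/lyxy_document_reader.py --advice <file path or URL>
+   ```
+   - Run the command given in the advice; there is no need to read further
+
 1. **Detect the execution environment**:
   - First check whether the **lyxy-runner-python skill** is available
-   - Available → run in an isolated uv environment
+   - Available → run with the lyxy-runner-python skill
   - Not available → fall back to the host Python environment

 2. **Identify the file type**:
   - The parser is selected automatically from the file extension
   - URLs are automatically treated as the HTML/web page type

-3. **Run the parse**:
-   - Try multiple parsers in priority order until one succeeds
-   - DOCX: docling → unstructured → pypandoc → MarkItDown → python-docx → XML
-   - XLSX: docling → unstructured → MarkItDown → pandas → XML
-   - PPTX: docling → unstructured → MarkItDown → python-pptx → XML
-   - PDF: docling OCR → unstructured OCR → docling → unstructured → MarkItDown → pypdf
-   - HTML: trafilatura → domscribe → MarkItDown → html2text
-
-4. **Output the result**:
+3. **Output the result**:
   - Return the Markdown content or statistics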

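-### Basic syntax
-
-Load dependency packages on demand with `uv run --with`:
-
-```bash
-# Choose the dependency packages for the document type
-uv run --with <dep1> --with <dep2> ... \
-     scripts/lyxy_document_reader.py <file path or URL>
-```
-
-See the "Dependency installation guide" below for the specific package lists.
-
 ### Usage examples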

 ```bash
+# Get execution advice (always run this command first)
+python scripts/lyxy_document_reader.py --advice document.docx
+
 # Read a Word document
 python scripts/lyxy_document_reader.py document.docx

@@ -116,74 +106,6 @@ python scripts/lyxy_document_reader.py document.docx -s "\d{4}-\d{2}-\d{2}"
 python scripts/lyxy_document_reader.py document.docx -s "关键词" -n 5
 ```

-### Dependency installation guide
-
-Load parser dependencies on demand with `uv run --with`. The commands below work on most platforms (macOS ARM, Linux, Windows).
-
-#### Platform detection
-
-If you run into problems, you can check your platform:
-
-```bash
-# macOS / Linux
-uname -m  # Architecture: x86_64 or arm64
-uname -s  # System: Darwin or Linux
-
-# Windows PowerShell
-$env:OS  # Or inspect the environment variables
-
-# Cross-platform detection with Python
-python -c "import platform; print(f'{platform.system()}-{platform.machine()}')"
-```
-
-#### PDF parsing
-
-**Default command** (for macOS ARM, Linux, Windows):
-
-```bash
-uv run --with docling --with "unstructured[pdf]" --with "markitdown[pdf]" --with pypdf --with markdownify --with chardet scripts/lyxy_document_reader.py file.pdf
-```
-
-**Note for macOS x86_64 (Intel)**:
-
-This platform requires Python 3.12 and pinned dependency versions:
-
-```bash
-uv run --python 3.12 --with "docling==2.40.0" --with "docling-parse==4.0.0" --with "numpy<2" --with "markitdown[pdf]" --with pypdf --with markdownify --with chardet scripts/lyxy_document_reader.py file.pdf
-```
-
-Reason: `docling-parse` 5.x has no x86_64 wheel, so 4.0.0 must be used; `easyocr` (docling's OCR backend) is incompatible with NumPy 2.x.
-
-#### DOCX parsing
-
-```bash
-uv run --with docling --with "unstructured[docx]" --with "markitdown[docx]" --with pypandoc-binary --with python-docx --with markdownify --with chardet scripts/lyxy_document_reader.py file.docx
-```
-
-#### XLSX parsing
-
-```bash
-uv run --with docling --with "unstructured[xlsx]" --with "markitdown[xlsx]" --with pandas --with tabulate --with chardet scripts/lyxy_document_reader.py file.xlsx
-```
-
-#### PPTX parsing
-
-```bash
-uv run --with docling --with "unstructured[pptx]" --with "markitdown[pptx]" --with python-pptx --with markdownify --with chardet scripts/lyxy_document_reader.py file.pptx
-```
-
-#### HTML/URL parsing
-
-```bash
-uv run --with trafilatura --with domscribe --with markitdown --with html2text --with beautifulsoup4 --with httpx --with chardet scripts/lyxy_document_reader.py https://example.com
-```
-
-**For web pages that need JavaScript rendering**, also add:
-
-```bash
---with pyppeteer --with selenium
-```
-
 ## Error handling

 | Error message | Cause | Resolution |
--- a/openspec/specs/cli-advice/spec.md
+++ b/openspec/specs/cli-advice/spec.md
@@ -0,0 +1,140 @@
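+## Purpose
+
+CLI execution advice generation: returns the uv and python commands for a given file type, so that an AI can quickly get accurate execution advice without searching the documentation.
+
+## Requirements
+
+### Requirement: Dependency configuration structure
+The dependency configuration must contain both the python version requirement and the dependency list, organized by file type and platform.
+
+#### Scenario: Configuration contains python and dependencies
+- **WHEN** `config.DEPENDENCIES` is accessed
+- **THEN** each file type entry contains one or more platform entries
+- **AND** each platform entry contains a `python` field (may be None) and a `dependencies` list field
+
+#### Scenario: default platform configuration
+- **WHEN** the platform has no specific configuration
+- **THEN** the `default` configuration is used
+- **AND** a `python` value of `None` means the `--python` option is not needed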
+
+---
+
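+### Requirement: CLI supports the --advice option
+The command-line tool must support the `-a/--advice` option; when it is given, the tool does not parse anything and only prints execution advice.
+
+#### Scenario: User passes --advice
+- **WHEN** the user runs `scripts/lyxy_document_reader.py --advice <input_path>`
+- **THEN** the tool prints execution advice and does not parse the file content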
+
+---
+
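+### Requirement: Lightweight file type detection
+The `--advice` option must identify the file type by reusing the supports method of Reader instances, without opening the file.
+
+#### Scenario: Reuse Reader instances
+- **WHEN** the file type is being detected
+- **THEN** the list of already instantiated readers is used
+- **AND** the supports() method of each reader is called
+- **AND** the file type is identified from the class of the first reader that supports the input
+
+#### Scenario: Detect a PDF file
+- **WHEN** the input path ends with `.pdf` (case-insensitive)
+- **THEN** PdfReader.supports() returns True
+- **AND** the input is identified as PDF
+
+#### Scenario: Detect a DOCX file
+- **WHEN** the input path ends with `.docx` (case-insensitive)
+- **THEN** DocxReader.supports() returns True
+- **AND** the input is identified as DOCX
+
+#### Scenario: Detect an XLSX file
+- **WHEN** the input path ends with `.xlsx` (case-insensitive)
+- **THEN** XlsxReader.supports() returns True
+- **AND** the input is identified as XLSX
+
+#### Scenario: Detect a PPTX file
+- **WHEN** the input path ends with `.pptx` (case-insensitive)
+- **THEN** PptxReader.supports() returns True
+- **AND** the input is identified as PPTX
+
+#### Scenario: Detect an HTML file
+- **WHEN** the input path ends with `.html` or `.htm` (case-insensitive)
+- **THEN** HtmlReader.supports() returns True
+- **AND** the input is identified as HTML
+
+#### Scenario: Detect a URL
+- **WHEN** the input path starts with `http://` or `https://`
+- **THEN** HtmlReader.supports() returns True
+- **AND** the input is identified as HTML
+
+#### Scenario: File existence is not checked
+- **WHEN** the input path points to a file that does not exist
+- **THEN** advice is still returned based on reader.supports(), with no error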
+
+---
+
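+### Requirement: Platform detection
+The tool must detect the current platform and return a command adapted to it.
+
+#### Scenario: Platform identifier format
+- **WHEN** the tool runs
+- **THEN** the platform identifier has the form `{system}-{machine}`, e.g. `Darwin-arm64`, `Linux-x86_64`, `Windows-AMD64`
+
+#### Scenario: Special command for PDF on macOS x86_64
+- **WHEN** the platform is `Darwin-x86_64` and the file type is PDF
+- **THEN** the returned command includes `--python 3.12` and pinned dependency versions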
+
+---
+
+### Requirement: Output the uv command
+The tool must output a command in the `uv run --with ...` form.
+
+#### Scenario: Generate the uv command
+- **WHEN** the file type has been detected
+- **THEN** the output has the form: `uv run [--python X.Y] --with <dep1> --with <dep2> ... scripts/lyxy_document_reader.py <input_path>`
+
+---
+
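+### Requirement: Output the python command
+The tool must output the command that invokes python directly, together with the pip install command.
+
+#### Scenario: Generate the python command
+- **WHEN** the file type has been detected
+- **THEN** the python command is output: `python scripts/lyxy_document_reader.py <input_path>`
+- **AND** the pip install command is output: `pip install <dep1> <dep2> ...`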
+
+---
+
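+### Requirement: Output format
+The output must contain the file type, the input path, the platform (when applicable), the uv command, the python command, and the pip install command.
+
+#### Scenario: Output format on ordinary platforms
+- **WHEN** the platform has no specific configuration
+- **THEN** the output has the form: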
+  ```
+  文件类型: <type>
+  输入路径: <input>
+
+  [uv 命令]
+  <uv_command>
+
+  [python 命令]
+  python scripts/lyxy_document_reader.py <input>
+  pip install <deps>
+  ```
+
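+#### Scenario: Output format on platforms with a specific configuration
+- **WHEN** the platform has a specific configuration
+- **THEN** the output has the form: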
+  ```
+  文件类型: <type>
+  输入路径: <input>
+  平台: <system-machine>
+
+  [uv 命令]
+  <uv_command>
+
+  [python 命令]
+  python scripts/lyxy_document_reader.py <input>
+  pip install <deps>
+  ```
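
The lookup order the spec describes (a platform-specific entry first, then `default`) can be sketched as below. This is only an illustration under stated assumptions, not the project's code: `SAMPLE_DEPENDENCIES` and `resolve_platform_config` are hypothetical names, and the sample table is trimmed to a single file type.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical, trimmed-down table in the shape the spec describes.
SAMPLE_DEPENDENCIES: Dict[str, Dict[str, dict]] = {
    "pdf": {
        "default": {"python": None, "dependencies": ["docling", "pypdf"]},
        "Darwin-x86_64": {
            "python": "3.12",
            "dependencies": ["docling==2.40.0", "numpy<2"],
        },
    },
}


def resolve_platform_config(
    file_type: str, platform_id: str
) -> Tuple[Optional[str], List[str]]:
    """Return (python_version, dependencies), preferring a platform-specific entry."""
    type_config = SAMPLE_DEPENDENCIES.get(file_type, {})
    # Fall back to "default" when the platform has no entry of its own.
    entry = type_config.get(platform_id) or type_config.get("default")
    if entry is None:
        return None, []
    return entry.get("python"), list(entry.get("dependencies", []))


print(resolve_platform_config("pdf", "Darwin-x86_64"))
print(resolve_platform_config("pdf", "Linux-x86_64"))
```

An unknown file type yields `(None, [])` in this sketch, so callers can treat "no advice available" separately from "no dependencies needed".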
--- a/scripts/config.py
+++ b/scripts/config.py
@@ -17,3 +17,90 @@ class Config:
    # Logging
    # Log level; only ERROR is emitted by default so logs do not interfere with the Markdown output
    LOG_LEVEL = "ERROR"
+
+
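+# Dependency configuration, organized by file type and platform
+# Each platform entry holds the python version requirement (None means use the default) and the dependency list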
+DEPENDENCIES = {
+    "pdf": {
+        "default": {
+            "python": None,
+            "dependencies": [
+                "docling",
+                "unstructured[pdf]",
+                "markitdown[pdf]",
+                "pypdf",
+                "markdownify",
+                "chardet"
+            ]
+        },
+        "Darwin-x86_64": {
+            "python": "3.12",
+            "dependencies": [
+                "docling==2.40.0",
+                "docling-parse==4.0.0",
+                "numpy<2",
+                "markitdown[pdf]",
+                "pypdf",
+                "markdownify",
+                "chardet"
+            ]
+        }
+    },
+    "docx": {
+        "default": {
+            "python": None,
+            "dependencies": [
+                "docling",
+                "unstructured[docx]",
+                "markitdown[docx]",
+                "pypandoc-binary",
+                "python-docx",
+                "markdownify",
+                "chardet"
+            ]
+        }
+    },
+    "xlsx": {
+        "default": {
+            "python": None,
+            "dependencies": [
+                "docling",
+                "unstructured[xlsx]",
+                "markitdown[xlsx]",
+                "pandas",
+                "tabulate",
+                "chardet"
+            ]
+        }
+    },
+    "pptx": {
+        "default": {
+            "python": None,
+            "dependencies": [
+                "docling",
+                "unstructured[pptx]",
+                "markitdown[pptx]",
+                "python-pptx",
+                "markdownify",
+                "chardet"
+            ]
+        }
+    },
+    "html": {
+        "default": {
+            "python": None,
+            "dependencies": [
+                "trafilatura",
+                "domscribe",
+                "markitdown",
+                "html2text",
+                "beautifulsoup4",
+                "httpx",
+                "chardet",
+                "pyppeteer",
+                "selenium"
+            ]
+        }
+    }
+}
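
Because the table above is hand-maintained, a small structural check may help catch a missing `default` entry or a misspelled field. The following is a minimal sketch, assuming the shape shown in the diff; `validate_dependencies` is a hypothetical helper that is not part of this commit.

```python
from typing import Dict, List


def validate_dependencies(table: Dict[str, Dict[str, dict]]) -> List[str]:
    """Return human-readable problems; an empty list means the table is well formed."""
    problems: List[str] = []
    for file_type, platforms in table.items():
        if "default" not in platforms:
            problems.append(f"{file_type}: missing 'default' entry")
        for platform_id, entry in platforms.items():
            if "python" not in entry:
                problems.append(f"{file_type}/{platform_id}: missing 'python' field")
            deps = entry.get("dependencies")
            if not isinstance(deps, list) or not deps:
                problems.append(
                    f"{file_type}/{platform_id}: 'dependencies' must be a non-empty list"
                )
    return problems


# Illustrative inputs only; the real table lives in config.DEPENDENCIES.
good = {"docx": {"default": {"python": None, "dependencies": ["python-docx"]}}}
bad = {"pdf": {"Darwin-x86_64": {"dependencies": []}}}
print(validate_dependencies(good))
print(validate_dependencies(bad))
```

Run against the real table, a check like this would fit naturally in the unit tests.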
--- a/scripts/core/__init__.py
+++ b/scripts/core/__init__.py
@@ -16,6 +16,7 @@ from .markdown import (
    search_markdown,
 )
 from .parser import parse_input, process_content, output_result
+from .advice_generator import generate_advice

 __all__ = [
    "LyxyDocumentError",
@@ -32,4 +33,5 @@ __all__ = [
    "parse_input",
    "process_content",
    "output_result",
+    "generate_advice",
 ]
--- a/scripts/core/advice_generator.py
+++ b/scripts/core/advice_generator.py
@@ -0,0 +1,251 @@
+"""Advice generator module: returns execution advice based on file type and platform."""
+
+import platform
+from pathlib import Path
+from typing import Dict, Optional, Tuple, List, Type
+
+from config import DEPENDENCIES
+from readers import BaseReader
+from readers import (
+    PdfReader,
+    DocxReader,
+    XlsxReader,
+    PptxReader,
+    HtmlReader,
+)
+
+
+# Mapping from Reader class to configuration key
+_READER_KEY_MAP: Dict[Type[BaseReader], str] = {
+    PdfReader: "pdf",
+    DocxReader: "docx",
+    XlsxReader: "xlsx",
+    PptxReader: "pptx",
+    HtmlReader: "html",
+}
+
+
+def detect_file_type_light(input_path: str, readers: List[BaseReader]) -> Optional[Type[BaseReader]]:
+    """
+    Lightweight file type detection that reuses the readers' supports method.
+
+    Args:
+        input_path: file path or URL
+        readers: list of instantiated readers
+
+    Returns:
+        The Reader class that supports the input, or None if it is not recognized
+    """
+    for reader in readers:
+        if reader.supports(input_path):
+            return reader.__class__
+    return None
+
+
+def get_platform() -> str:
+    """
+    Return the current platform identifier in the form {system}-{machine}.
+
+    Returns:
+        Platform identifier, e.g. "Darwin-arm64", "Linux-x86_64", "Windows-AMD64"
+    """
+    system = platform.system()
+    machine = platform.machine()
+    return f"{system}-{machine}"
+
+
+def get_dependencies(reader_cls: Type[BaseReader], platform_id: str) -> Tuple[Optional[str], list]:
+    """
+    Return the dependency configuration for the given Reader class and platform.
+
+    Args:
+        reader_cls: Reader class
+        platform_id: platform identifier (e.g. "Darwin-arm64")
+
+    Returns:
+        (python_version, dependencies) tuple
+        - python_version: required python version, None means use the default
+        - dependencies: list of dependency packages
+    """
+    key = _READER_KEY_MAP.get(reader_cls)
+    if not key or key not in DEPENDENCIES:
+        return None, []
+
+    type_config = DEPENDENCIES[key]
+
+    # Try a platform-specific entry first
+    if platform_id in type_config:
+        config = type_config[platform_id]
+        return config.get("python"), config.get("dependencies", [])
+
+    # Fall back to the default configuration
+    if "default" in type_config:
+        config = type_config["default"]
+        return config.get("python"), config.get("dependencies", [])
+
+    return None, []
+
+
+def generate_uv_command(
+    dependencies: list,
+    input_path: str,
+    python_version: Optional[str] = None,
+    script_path: str = "scripts/lyxy_document_reader.py"
+) -> str:
+    """
+    Generate the uv run command.
+
+    Args:
+        dependencies: list of dependency packages
+        input_path: input file path or URL
+        python_version: required python version, None means do not specify one
+        script_path: script path
+
+    Returns:
+        The uv run command string
+    """
+    parts = ["uv run"]
+
+    if python_version:
+        parts.append(f"--python {python_version}")
+
+    for dep in dependencies:
+        # Quote dependencies containing shell-special characters (e.g. unstructured[pdf], numpy<2)
+        if any(ch in dep for ch in "[]<> "):
+            parts.append(f'--with "{dep}"')
+        else:
+            parts.append(f"--with {dep}")
+
+    parts.append(f"{script_path} {input_path}")
+
+    return " ".join(parts)
+
+
+def generate_python_command(
+    dependencies: list,
+    input_path: str,
+    script_path: str = "scripts/lyxy_document_reader.py"
+) -> Tuple[str, str]:
+    """
+    Generate the python command and the pip install command.
+
+    Args:
+        dependencies: list of dependency packages
+        input_path: input file path or URL
+        script_path: script path
+
+    Returns:
+        (python_command, pip_command) tuple
+    """
+    python_cmd = f"python {script_path} {input_path}"
+
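+    # Build the pip install command, quoting dependencies that contain shell-special characters
+    pip_parts = ["pip install"]
+    for dep in dependencies:
+        pip_parts.append(f'"{dep}"' if any(ch in dep for ch in "[]<> ") else dep)
+    pip_cmd = " ".join(pip_parts)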
+
+    return python_cmd, pip_cmd
+
+
+def format_advice(
+    file_type: str,
+    input_path: str,
+    platform_id: str,
+    uv_command: str,
+    python_command: str,
+    pip_command: str,
+    has_platform_specific: bool = False
+) -> str:
+    """
+    Format the advice output.
+
+    Args:
+        file_type: file type
+        input_path: input path
+        platform_id: platform identifier
+        uv_command: uv command
+        python_command: python command
+        pip_command: pip install command
+        has_platform_specific: whether a platform-specific configuration was used
+
+    Returns:
+        The formatted advice text
+    """
+    lines = []
+
+    # File type and input path
+    lines.append(f"文件类型: {file_type.upper()}")
+    lines.append(f"输入路径: {input_path}")
+
+    # Platform info (shown only when a platform-specific configuration was used)
+    if has_platform_specific:
+        lines.append(f"平台: {platform_id}")
+
+    lines.append("")
+
+    # uv command
+    lines.append("[uv 命令]")
+    lines.append(uv_command)
+    lines.append("")
+
+    # python command
+    lines.append("[python 命令]")
+    lines.append(python_command)
+    lines.append(pip_command)
+
+    return "\n".join(lines)
+
+
+def generate_advice(
+    input_path: str,
+    readers: List[BaseReader],
+    script_path: str = "scripts/lyxy_document_reader.py"
+) -> Optional[str]:
+    """
+    Generate the complete execution advice.
+
+    Args:
+        input_path: input file path or URL
+        readers: list of instantiated readers
+        script_path: script path
+
+    Returns:
+        The formatted advice text, or None if the file type is not recognized
+    """
+    # Detect the file type and get the Reader class
+    reader_cls = detect_file_type_light(input_path, readers)
+    if not reader_cls:
+        return None
+
+    # Get the configuration key, which doubles as the display name
+    key = _READER_KEY_MAP.get(reader_cls, "unknown")
+    file_type = key
+
+    # Get the platform
+    platform_id = get_platform()
+
+    # Get the dependency configuration
+    python_version, dependencies = get_dependencies(reader_cls, platform_id)
+
+    # Determine whether a platform-specific configuration was used
+    has_platform_specific = False
+    if key in DEPENDENCIES:
+        type_config = DEPENDENCIES[key]
+        if platform_id in type_config and "default" in type_config:
+            has_platform_specific = True
+
+    # Generate the commands
+    uv_command = generate_uv_command(dependencies, input_path, python_version, script_path)
+    python_command, pip_command = generate_python_command(dependencies, input_path, script_path)
+
+    # Format the output
+    return format_advice(
+        file_type,
+        input_path,
+        platform_id,
+        uv_command,
+        python_command,
+        pip_command,
+        has_platform_specific
+    )
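
The command builders above quote arguments by hand. One alternative, shown here only as a sketch and not as what this commit does, is to build an argument list and let the standard library's `shlex.join` do the quoting. `build_uv_command` is a hypothetical name. Note that `shlex` targets POSIX shells and emits single quotes, which `cmd.exe` on Windows does not treat as quoting.

```python
import shlex
from typing import List, Optional


def build_uv_command(
    dependencies: List[str],
    input_path: str,
    python_version: Optional[str] = None,
    script_path: str = "scripts/lyxy_document_reader.py",
) -> str:
    """Build a `uv run` command line, quoting every argument for a POSIX shell."""
    argv = ["uv", "run"]
    if python_version:
        argv += ["--python", python_version]
    for dep in dependencies:
        argv += ["--with", dep]
    argv += [script_path, input_path]
    # shlex.join quotes only the arguments that need it (brackets, '<', spaces, ...)
    return shlex.join(argv)


cmd = build_uv_command(
    ["docling", "unstructured[pdf]", "numpy<2"], "my file.pdf", python_version="3.12"
)
print(cmd)
```

This approach also quotes the input path, so a file name containing spaces survives a copy and paste into the shell.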
--- a/scripts/lyxy_document_reader.py
+++ b/scripts/lyxy_document_reader.py
@@ -32,6 +32,7 @@ from core import (
    output_result,
    parse_input,
    process_content,
+    generate_advice,
 )
 from readers import READERS

@@ -43,6 +44,13 @@ def main() -> None:

    parser.add_argument("input_path", help="DOCX、PPTX、XLSX、PDF、HTML 文件或 URL")

+    parser.add_argument(
+        "-a",
+        "--advice",
+        action="store_true",
+        help="仅显示执行建议，不实际解析文件",
+    )
+
    parser.add_argument(
        "-n",
        "--context",
@@ -80,6 +88,16 @@ def main() -> None:
    # Instantiate all readers
    readers = [ReaderCls() for ReaderCls in READERS]

+    # --advice mode: print advice only, do not parse
+    if args.advice:
+        advice = generate_advice(args.input_path, readers, "scripts/lyxy_document_reader.py")
+        if advice:
+            print(advice)
+        else:
+            print(f"错误: 无法识别文件类型: {args.input_path}")
+            sys.exit(1)
+        return
+
    try:
        content, failures = parse_input(args.input_path, readers)
    except FileDetectionError as e:
--- a/tests/test_cli/test_main.py
+++ b/tests/test_cli/test_main.py
@@ -4,6 +4,41 @@ import pytest
 import os


+class TestCLIAdviceOption:
+    """Tests for the CLI --advice option."""
+
+    def test_advice_option_pdf(self, cli_runner):
+        """Test the -a/--advice option with a PDF file."""
+        stdout, stderr, exit_code = cli_runner(["test.pdf", "-a"])
+
+        assert exit_code == 0
+        assert "文件类型: PDF" in stdout
+        assert "[uv 命令]" in stdout
+        assert "[python 命令]" in stdout
+
+    def test_advice_option_docx(self, cli_runner):
+        """Test the --advice option with a DOCX file."""
+        stdout, stderr, exit_code = cli_runner(["test.docx", "--advice"])
+
+        assert exit_code == 0
+        assert "文件类型: DOCX" in stdout
+
+    def test_advice_option_url(self, cli_runner):
+        """Test the --advice option with a URL."""
+        stdout, stderr, exit_code = cli_runner(["https://example.com", "--advice"])
+
+        assert exit_code == 0
+        assert "文件类型: HTML" in stdout
+
+    def test_advice_option_unknown(self, cli_runner):
+        """Test the --advice option with an unknown file type."""
+        stdout, stderr, exit_code = cli_runner(["test.xyz", "--advice"])
+
+        assert exit_code != 0
+        output = stdout + stderr
+        assert "无法识别" in output or "错误" in output
+
+
 class TestCLIDefaultOutput:
    """Tests for the CLI default output."""

--- a/tests/test_core/test_advice_generator.py
+++ b/tests/test_core/test_advice_generator.py
@@ -0,0 +1,195 @@
+"""Tests for the advice_generator module."""
+
+import pytest
+from unittest.mock import patch
+
+from core.advice_generator import (
+    detect_file_type_light,
+    get_platform,
+    get_dependencies,
+    generate_uv_command,
+    generate_python_command,
+    format_advice,
+    generate_advice,
+)
+from readers import READERS, PdfReader, DocxReader, HtmlReader
+
+
+@pytest.fixture
+def readers():
+    """Provide the list of instantiated readers."""
+    return [ReaderCls() for ReaderCls in READERS]
+
+
+class TestDetectFileTypeLight:
+    """Tests for the lightweight file type detection function."""
+
+    def test_detect_pdf(self, readers):
+        """Test detecting a PDF file."""
+        reader_cls = detect_file_type_light("test.pdf", readers)
+        assert reader_cls == PdfReader
+
+    def test_detect_docx(self, readers):
+        """Test detecting a DOCX file."""
+        reader_cls = detect_file_type_light("test.docx", readers)
+        assert reader_cls == DocxReader
+
+    def test_detect_html(self, readers):
+        """Test detecting an HTML file."""
+        reader_cls = detect_file_type_light("test.html", readers)
+        assert reader_cls == HtmlReader
+
+    def test_detect_url(self, readers):
+        """Test detecting a URL."""
+        reader_cls = detect_file_type_light("https://example.com", readers)
+        assert reader_cls == HtmlReader
+
+    def test_detect_unknown(self, readers):
+        """Test detecting an unknown file type."""
+        reader_cls = detect_file_type_light("test.xyz", readers)
+        assert reader_cls is None
+
+
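+class TestGetPlatform:
+    """Tests for the platform detection function."""
+
+    def test_get_platform_format(self):
+        """Test that the platform identifier has the expected format."""
+        platform_id = get_platform()
+        # The format should be {system}-{machine}
+        assert "-" in platform_id
+        # It has at least two parts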
+        assert len(platform_id.split("-")) >= 2
+
+
+class TestGetDependencies:
+    """Tests for the dependency lookup function."""
+
+    def test_get_default_dependencies(self):
+        """Test that an unknown platform falls back to the default configuration."""
+        python_ver, deps = get_dependencies(DocxReader, "Unknown-Platform")
+        assert python_ver is None
+        assert len(deps) > 0
+        assert "docling" in deps
+
+    def test_get_pdf_dependencies(self):
+        """Test getting the PDF dependencies."""
+        python_ver, deps = get_dependencies(PdfReader, "Darwin-arm64")
+        assert python_ver is None
+        assert "docling" in deps
+
+    def test_get_html_dependencies(self):
+        """Test getting the HTML dependencies."""
+        python_ver, deps = get_dependencies(HtmlReader, "Linux-x86_64")
+        assert python_ver is None
+        assert "trafilatura" in deps
+
+
+class TestGenerateUvCommand:
+    """Tests for the uv command generation function."""
+
+    def test_generate_simple_command(self):
+        """Test generating a simple uv command."""
+        cmd = generate_uv_command(
+            ["pkg1", "pkg2"],
+            "input.pdf",
+            script_path="scripts/lyxy_document_reader.py"
+        )
+        assert "uv run" in cmd
+        assert "--with pkg1" in cmd
+        assert "--with pkg2" in cmd
+        assert "input.pdf" in cmd
+
+    def test_generate_with_python_version(self):
+        """Test generating a uv command with a python version."""
+        cmd = generate_uv_command(
+            ["pkg1"],
+            "input.pdf",
+            python_version="3.12",
+            script_path="scripts/lyxy_document_reader.py"
+        )
+        assert "--python 3.12" in cmd
+
+    def test_generate_with_quoted_deps(self):
+        """Test that dependencies such as unstructured[pdf] are quoted."""
+        cmd = generate_uv_command(
+            ["unstructured[pdf]", "pkg2"],
+            "input.pdf",
+            script_path="scripts/lyxy_document_reader.py"
+        )
+        assert '--with "unstructured[pdf]"' in cmd
+
+
+class TestGeneratePythonCommand:
+    """Tests for the python command generation function."""
+
+    def test_generate_python_command(self):
+        """Test generating the python command."""
+        python_cmd, pip_cmd = generate_python_command(
+            ["pkg1", "pkg2"],
+            "input.pdf",
+            script_path="scripts/lyxy_document_reader.py"
+        )
+        assert python_cmd == "python scripts/lyxy_document_reader.py input.pdf"
+        assert pip_cmd == "pip install pkg1 pkg2"
+
+
+class TestFormatAdvice:
+    """Tests for the advice formatting function."""
+
+    def test_format_without_platform(self):
+        """Test the formatted output without a platform-specific configuration."""
+        output = format_advice(
+            "pdf",
+            "test.pdf",
+            "Darwin-arm64",
+            "uv run --with docling ...",
+            "python scripts/lyxy_document_reader.py test.pdf",
+            "pip install docling ...",
+            has_platform_specific=False
+        )
+        assert "文件类型: PDF" in output
+        assert "输入路径: test.pdf" in output
+        assert "平台:" not in output
+        assert "[uv 命令]" in output
+        assert "[python 命令]" in output
+
+    def test_format_with_platform(self):
+        """Test the formatted output with a platform-specific configuration."""
+        output = format_advice(
+            "pdf",
+            "test.pdf",
+            "Darwin-x86_64",
+            "uv run --python 3.12 ...",
+            "python ...",
+            "pip install ...",
+            has_platform_specific=True
+        )
+        assert "平台: Darwin-x86_64" in output
+
+
+class TestGenerateAdvice:
+    """Tests for the complete advice generation function."""
+
+    def test_generate_advice_pdf(self, readers):
+        """Test generating advice for a PDF."""
+        advice = generate_advice("test.pdf", readers, "scripts/lyxy_document_reader.py")
+        assert advice is not None
+        assert "文件类型: PDF" in advice
+        assert "[uv 命令]" in advice
+        assert "[python 命令]" in advice
+
+    def test_generate_advice_url(self, readers):
+        """Test generating advice for a URL."""
+        advice = generate_advice(
+            "https://example.com",
+            readers,
+            "scripts/lyxy_document_reader.py"
+        )
+        assert advice is not None
+        assert "文件类型: HTML" in advice
+
+    def test_generate_advice_unknown(self, readers):
+        """Test generating advice for an unknown type."""
+        advice = generate_advice("test.xyz", readers, "scripts/lyxy_document_reader.py")
+        assert advice is None
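
The test module imports `patch`, but none of the tests shown use it, and the `Darwin-x86_64` branch is never exercised. The sketch below shows one way the platform could be pinned with `unittest.mock.patch` so the result does not depend on the machine running the tests. To stay self-contained it redefines a copy of `get_platform`; a real test would presumably import it from `core.advice_generator`. The test name is hypothetical.

```python
import platform
from unittest.mock import patch


def get_platform() -> str:
    """Stand-in mirroring the helper in the diff: '{system}-{machine}'."""
    return f"{platform.system()}-{platform.machine()}"


def test_platform_id_on_intel_mac() -> None:
    # Patch both calls so the identifier is deterministic on any host.
    with patch("platform.system", return_value="Darwin"), \
         patch("platform.machine", return_value="x86_64"):
        assert get_platform() == "Darwin-x86_64"


test_platform_id_on_intel_mac()
```

With the platform pinned this way, a follow-up assertion could check that the PDF advice contains `--python 3.12`, as the spec requires for this platform.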