citation-diversifier

# Citation Diversifier (budget-as-constraints) [NO NEW FACTS]

Purpose: fix a common survey failure mode:
- the draft reads under-cited (or reuses the same few citations everywhere)
- the pipeline fails the **global unique-citation** gate

This skill does **not** change prose by itself. It produces a constraint sheet: `output/CITATION_BUDGET_REPORT.md`.

## Inputs

- `output/DRAFT.md`
- `outline/outline.yml` (H3 ids/titles; used to allocate budgets per subsection)
- `outline/writer_context_packs.jsonl` (source of `allowed_bibkeys_{selected,mapped,chapter,global}` per H3)
- `citations/ref.bib`

## Output

- `output/CITATION_BUDGET_REPORT.md`

## Non-negotiables (NO NEW FACTS)

- Only propose citation keys that exist in `citations/ref.bib`.
- Only propose keys that are **in-scope** for the target H3 (prefer subsection-first scope; use chapter/global only when truly cross-cutting).
- Do not propose “padding citations” that would require adding new claims or new numbers.

## What a good budget report looks like (contract)

The report should feel like a *constraint sheet*, not a random list:
- It states the **blocking policy target** and the **gap-to-target** (how many unique keys are missing; policy default is `recommended`).
- For each H3, it proposes a scope-safe budget sized to actually close the gap:
  - small gaps: 3-6 keys / H3 is often enough
  - A150++ gaps: plan for ~6-12 keys / H3 (and avoid duplicates across H3 budgets)
- It gives placement guidance (where in the subsection those keys can be embedded without adding new facts).

Canonical (parseable) lines required (downstream validators depend on these):
- The target is derived from `queries.md:citation_target` (`recommended` by default for A150++).
- `- Global target (policy; blocking): >= <N> ...`
- `- Gap: <K>` (gap-to-target; if `0`, injection can be a no-op PASS)

Optional (always reported; may be blocking depending on `citation_target`):
- `- Global recommended target: >= <N> ...`
- `- Gap to recommended: <K>`

Recommended prioritization (scope-safe):
- `allowed_bibkeys_selected` → `allowed_bibkeys_mapped` → `allowed_bibkeys_chapter`
- Use `allowed_bibkeys_global` only for:
  - benchmarks/protocol papers
  - widely-used datasets/suites
  - cross-cutting surveys/method papers referenced across chapters

## How this connects to writing (LLM-first)

After you generate the budget report:
- Apply it using `citation-injector` (LLM edits to `output/DRAFT.md`, NO NEW FACTS).
- Then run `draft-polisher` to remove any “budget dump voice” while keeping citation keys unchanged.

Important: `citation-injector` is **LLM-first**. Its script is validation-only.

## Workflow

1) Diagnose the global situation
- Read `output/DRAFT.md` and estimate the “unique-key gap” (or use `pipeline-auditor`’s FAIL reason).

2) Allocate budgets per H3 (scope-first)
- Use `outline/outline.yml` to enumerate H3s in paper order.
- For each H3, read its allowed key sets from `outline/writer_context_packs.jsonl`.
- Pick a small set of *unused* keys that strengthen positioning without requiring new claims.

3) Write `output/CITATION_BUDGET_REPORT.md`

Required structure:
- `- Status: PASS|FAIL`
- `- Global target (policy; blocking): >= <N> ...`
- `- Gap: <K>`
- `## Summary` (gap + strategy)
- `## Per-subsection budgets` (H3 id/title → suggested keys → placement hint)

## Script (optional; deterministic report generator)

If you want a deterministic first-pass budget report, run the helper script. Treat it as a baseline and refine the plan as needed.
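
To make workflow steps 1 and 2 concrete, here is a minimal sketch of what a first pass might compute. It is not the contents of `scripts/run.py`. It assumes the draft uses Pandoc-style `[@key]` citations and that each context pack carries a `sub_id` field. Both are assumptions, as are the helper names `used_keys`, `gap_to_target`, and `allocate_budgets`.

```python
import re

# Assumption: citations in the draft look like [@key] or [@key1; @key2].
CITE_KEY = re.compile(r"@([A-Za-z0-9][A-Za-z0-9_:-]*)")

# Scope-first priority from this document; global keys are left out on purpose.
SCOPE_ORDER = (
    "allowed_bibkeys_selected",
    "allowed_bibkeys_mapped",
    "allowed_bibkeys_chapter",
)


def used_keys(draft_text):
    """Return the set of unique citation keys that appear in the draft."""
    return set(CITE_KEY.findall(draft_text))


def gap_to_target(used, target):
    """Number of unique keys still missing to reach the target (never negative)."""
    return max(0, target - len(used))


def allocate_budgets(packs, used, bib_keys, per_h3):
    """Pick up to per_h3 unused, in-scope keys for each H3, in paper order.

    packs: list of dicts, one per H3 ('sub_id' is a hypothetical field name).
    bib_keys: set of keys known to exist in citations/ref.bib.
    """
    taken = set(used)  # keys already cited or already budgeted elsewhere
    budgets = {}
    for pack in packs:
        picks = []
        for scope in SCOPE_ORDER:
            for key in pack.get(scope, []):
                if len(picks) >= per_h3:
                    break
                if key in bib_keys and key not in taken:
                    picks.append(key)
                    taken.add(key)  # avoid duplicates across H3 budgets
        budgets[pack["sub_id"]] = picks
    return budgets
```

Tracking a single `taken` set across all H3s is one simple way to honor the "avoid duplicates across H3 budgets" guidance. Placement hints still need human or LLM judgment and are not covered here.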
### Quick Start

- `python scripts/run.py --help`
- `python scripts/run.py --workspace workspaces/<ws>`

### All Options

- `--workspace <dir>`
- `--unit-id <U###>` (optional)
- `--inputs <semicolon-separated>` (rare override; prefer defaults)
- `--outputs <semicolon-separated>` (rare override; default writes `output/CITATION_BUDGET_REPORT.md`)
- `--checkpoint <C#>` (optional)

### Examples

- Default IO:
  - `python scripts/run.py --workspace workspaces/<ws>`

## Done criteria

- `output/CITATION_BUDGET_REPORT.md` exists and has actionable, in-scope budgets.
- After applying the plan via `citation-injector`, `pipeline-auditor` no longer FAILs on global unique citations.
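
For a quick self-check of the first done criterion, the canonical lines and required headings can be matched with a few regular expressions. This is a hedged sketch only: `check_report` is a hypothetical helper, not the downstream validator, and the sample report uses made-up values.

```python
import re

# Patterns mirror the canonical lines listed in this document.
STATUS = re.compile(r"^- Status: (PASS|FAIL)\s*$", re.MULTILINE)
TARGET = re.compile(r"^- Global target \(policy; blocking\): >= (\d+)", re.MULTILINE)
GAP = re.compile(r"^- Gap: (\d+)\s*$", re.MULTILINE)
REQUIRED_HEADINGS = ("## Summary", "## Per-subsection budgets")


def check_report(text):
    """Hypothetical helper: return (fields, problems) for a budget report."""
    fields, problems = {}, []
    for name, pattern in (("status", STATUS), ("target", TARGET), ("gap", GAP)):
        match = pattern.search(text)
        if match is None:
            problems.append(f"missing canonical line: {name}")
        else:
            fields[name] = match.group(1)
    lines = {line.strip() for line in text.splitlines()}
    for heading in REQUIRED_HEADINGS:
        if heading not in lines:
            problems.append(f"missing heading: {heading}")
    return fields, problems


# Made-up sample report, for illustration only.
SAMPLE = """\
- Status: PASS
- Global target (policy; blocking): >= 150 unique keys
- Gap: 0
- Gap to recommended: 12

## Summary
No blocking gap.

## Per-subsection budgets
"""

fields, problems = check_report(SAMPLE)
```

The `- Gap: ` pattern is anchored so that the optional `- Gap to recommended: <K>` line is not mistaken for the blocking gap.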
