citation-injector
# Citation Injector (deterministic baseline edits; budget-as-constraints)
Purpose: make the pipeline converge when the draft is:
- locally citation-dense but **globally under-cited** (too few unique keys), or
- overly reusing the same citations across many subsections.
This skill is intentionally **conservative and scriptable**:
- the script edits `output/DRAFT.md` directly using the budget report as constraints
- injections stay evidence-neutral (NO NEW FACTS) and use only in-scope keys already listed for each H3
## Inputs
- `output/DRAFT.md`
- `output/CITATION_BUDGET_REPORT.md` (from `citation-diversifier`)
- `outline/outline.yml` (H3 id/title mapping)
- `citations/ref.bib` (must contain every injected key)
## Outputs
- `output/DRAFT.md` (updated in place)
- `output/CITATION_INJECTION_REPORT.md` (PASS/FAIL + what you changed)
## Non-negotiables (NO NEW FACTS)
- Only inject keys listed for that H3 in the budget report.
- Do not introduce new numbers, new benchmarks, or superiority claims.
- Do not add narration templates (`This subsection ...`, `Next, we ...`).
- Do not produce cite dumps like `[@a; @b; @c]` as the only citations in a paragraph.
## Paper-voice injection patterns (safe sentence shapes)
Use these as *sentence intentions* (paraphrase; do not copy verbatim).
1) Axis-anchored exemplars (preferred)
- `Systems such as X [@a] and Y [@b] instantiate <axis/design point>, whereas Z [@c] explores a contrasting point under a different protocol.`
2) Parenthetical grounding (short, low-risk)
- `... (e.g., X [@a], Y [@b], Z [@c]).`
3) Cluster pointer + contrast hint
- `Representative implementations span both <cluster A> (X [@a], Y [@b]) and <cluster B> (Z [@c]), suggesting that the trade-off hinges on <lens>.`
4) Decision-lens pointer
- `For builders choosing between <A> and <B>, prior systems provide concrete instantiations on both sides (X [@a]; Y [@b]; Z [@c]).`
5) Evaluation-lens pointer (still evidence-neutral)
- `Across commonly used agent evaluations, systems such as X [@a] and Y [@b] illustrate how <lens> is operationalized, while Z [@c] highlights a different constraint.`
6) Contrast without list voice
- `While many works operationalize <topic> via <mechanism> (X [@a]; Y [@b]), others treat it as <alternative> (Z [@c]), which changes the failure modes discussed later.`
## Anti-patterns (high-signal “budget dump” voice)
Avoid these stems (they read like automated injection):
- `A few representative references include ...`
- `Notable lines of work include ...`
- `Concrete examples include ...`
If your draft contains these, rewrite them immediately using the patterns above (keep citation keys unchanged).
## Placement guidance
- Prefer inserting citations where the subsection already states a concrete contrast or decision lens.
- If you must add a new sentence/mini-paragraph, place it early (often after paragraph 1) so it reads as positioning, not as an afterthought.
- Keep injections subsection-specific: mention the subsection lens (H3 title / `contrast_hook`) so the same sentence cannot be copy-pasted into every H3.
## Workflow
1) Read the budget report (`output/CITATION_BUDGET_REPORT.md`)
- Treat `Global target (policy; blocking)` as the PASS line for the pipeline gate (derived from `queries.md:citation_target`; A150++ default: `recommended`).
- If `Gap: 0`, do nothing: write a short PASS report and move on.
- Otherwise, for each H3 with suggested keys, pick enough keys to close the gap to target:
- small gaps: 3-6 keys / H3
- A150++ gaps: often 6-12 keys / H3
Prefer keys that are unused globally and avoid repeating the same new keys across many H3s.
2) Inject in the right subsection
- Use `outline/outline.yml` to confirm H3 ordering and ensure the injected sentence lands inside the correct `###` subsection.
3) Inject with paper voice
- Prefer one short, axis-anchored sentence over a long enumerator sentence.
- Keep injections evidence-neutral (NO NEW FACTS) and avoid new numbers.
- Before you commit an injected key, confirm it exists in `citations/ref.bib`.
4) Write `output/CITATION_INJECTION_REPORT.md`
- Record which H3s you touched and which keys were added.
- Mark `- Status: PASS` only when the global target is met.
5) Verify
- Rerun the validator script (below) to recheck the global target.
- Then run `draft-polisher` to smooth any residual injection voice (citation keys must remain unchanged).
## Done criteria
- `output/CITATION_INJECTION_REPORT.md` exists and is `- Status: PASS`.
- `pipeline-auditor` no longer FAILs on “unique citations too low”.
## Script (optional; deterministic injector + validator)
You usually do not run this manually; it exists so a pipeline runner can deterministically apply a baseline injection and validate the target.
### Quick Start
- `python scripts/run.py --workspace workspaces/<ws>`
### All Options
- `--workspace <dir>`
- `--unit-id <U###>` (optional; for logs)
- `--inputs <semicolon-separated>` (rare override; prefer defaults)
- `--outputs <semicolon-separated>` (rare override; default validates `output/CITATION_INJECTION_REPORT.md`)
- `--checkpoint <C#>` (optional)
### Examples
- After you generate the budget report and want the script to apply the baseline injection:
- `python scripts/run.py --workspace workspaces/<ws>`
标签
skill
ai