GitHub Actions Recovery Latency Audit
Use this skill to measure how quickly workflows recover after failing, and to detect groups that remain red for too long.
What this skill does
- Reads GitHub Actions run JSON exports
- Groups by repository + workflow + branch + event
- Builds failure incidents (first failing run until next success)
- Reports recovery latency for closed incidents
- Reports unresolved incident count + oldest unresolved age
- Scores severity (ok, warn, critical) for triage and CI gates
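The incident rule above (first failing run until the next success) can be sketched in plain bash. This is a minimal illustration, not the skill's actual implementation: `build_incidents` is a hypothetical helper, the input format (`<epoch_seconds> <conclusion>` per line, one group, oldest first) is an assumption, and only the conclusions `failure` and `success` are considered.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads "<epoch_seconds> <conclusion>" lines for ONE group,
# sorted oldest-first, and prints one line per incident:
#   closed <recovery_latency_seconds>   incident ended by a later success
#   open <start_epoch_seconds>          incident with no success yet
build_incidents() {
  local start="" ts conclusion
  while read -r ts conclusion; do
    case "$conclusion" in
      failure)
        # Only the first failure of a streak opens an incident.
        if [ -z "$start" ]; then start="$ts"; fi
        ;;
      success)
        # The next success closes the incident; latency is success minus first failure.
        if [ -n "$start" ]; then
          echo "closed $((ts - start))"
          start=""
        fi
        ;;
    esac
  done
  # A failing streak without a later success is still unresolved.
  if [ -n "$start" ]; then echo "open $start"; fi
  return 0
}

printf '%s\n' "100 success" "200 failure" "300 failure" "500 success" "600 failure" |
  build_incidents
# closed 300
# open 600
```

Other conclusions (for example cancelled runs) are ignored in this sketch; whether the skill treats them as failures is not stated here.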
Inputs
Optional:
- RUN_GLOB (default: artifacts/github-actions/*.json)
- TOP_N (default: 20)
- OUTPUT_FORMAT (text or json, default: text)
- MIN_RUNS (default: 4)
- WARN_P95_HOURS (default: 6)
- CRITICAL_P95_HOURS (default: 18)
- WARN_OPEN_HOURS (default: 12)
- CRITICAL_OPEN_HOURS (default: 36)
- WARN_OPEN_INCIDENTS (default: 1)
- CRITICAL_OPEN_INCIDENTS (default: 2)
- NOW_ISO (optional fixed clock for deterministic tests)
- WORKFLOW_MATCH / WORKFLOW_EXCLUDE (regex)
- BRANCH_MATCH / BRANCH_EXCLUDE (regex)
- EVENT_MATCH / EVENT_EXCLUDE (regex)
- REPO_MATCH / REPO_EXCLUDE (regex)
- FAIL_ON_CRITICAL (0 or 1, default: 0)
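The warn and critical thresholds come in pairs: P95 recovery hours (6 / 18), open-incident age in hours (12 / 36), and open-incident count (1 / 2). A rough sketch of how such pairs could map to a severity follows. `classify_group` is a hypothetical function, and both the "any metric at or above its threshold" rule and the `>=` comparison are assumptions; the skill's real scoring logic may differ.

```bash
#!/usr/bin/env bash
# Hypothetical classifier. Arguments are whole numbers:
#   $1 p95 recovery hours, $2 oldest open incident age (hours), $3 open incident count
# Thresholds fall back to the documented defaults when the variables are unset.
classify_group() {
  local p95=$1 open_hours=$2 open_count=$3
  if [ "$p95" -ge "${CRITICAL_P95_HOURS:-18}" ] ||
     [ "$open_hours" -ge "${CRITICAL_OPEN_HOURS:-36}" ] ||
     [ "$open_count" -ge "${CRITICAL_OPEN_INCIDENTS:-2}" ]; then
    echo critical
  elif [ "$p95" -ge "${WARN_P95_HOURS:-6}" ] ||
       [ "$open_hours" -ge "${WARN_OPEN_HOURS:-12}" ] ||
       [ "$open_count" -ge "${WARN_OPEN_INCIDENTS:-1}" ]; then
    echo warn
  else
    echo ok
  fi
}

classify_group 7 0 0   # prints "warn": p95 of 7h is past the 6h warn threshold
```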
Collect run JSON
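```bash
gh run view --json databaseId,workflowName,event,conclusion,headBranch,createdAt,url,repository \
> artifacts/github-actions/run-.json
```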
Run
Text report:
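```bash
RUN_GLOB=artifacts/github-actions/*.json \
TOP_N=15 \
bash skills/github-actions-recovery-latency-audit/scripts/recovery-latency-audit.sh
```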
JSON + fail gate:
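```bash
RUN_GLOB=artifacts/github-actions/*.json \
OUTPUT_FORMAT=json \
FAIL_ON_CRITICAL=1 \
bash skills/github-actions-recovery-latency-audit/scripts/recovery-latency-audit.sh
```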
Run against bundled fixtures:
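```bash
RUN_GLOB=skills/github-actions-recovery-latency-audit/fixtures/*.json \
NOW_ISO=2026-03-07T14:00:00Z \
bash skills/github-actions-recovery-latency-audit/scripts/recovery-latency-audit.sh
```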
Output contract
- Exit 0 in report mode (default)
- Exit 1 when FAIL_ON_CRITICAL=1 and one or more groups are critical
- Text mode prints summary + ranked recovery-risk groups
- JSON mode prints summary + ranked groups + critical groups
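In a CI job, the exit-code contract above can be turned into a readable gate message. The wrapper below is a minimal sketch: `run_gate` is a hypothetical name, and it relies only on the documented statuses (0 for a clean report, 1 for critical groups with FAIL_ON_CRITICAL=1). Any other status is reported as unexpected rather than interpreted.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: runs the command given as arguments with
# FAIL_ON_CRITICAL=1 and reports its exit status in plain words.
run_gate() {
  local rc=0
  FAIL_ON_CRITICAL=1 "$@" || rc=$?
  case "$rc" in
    0) echo "recovery gate: passed" ;;
    1) echo "recovery gate: critical groups found" ;;
    *) echo "recovery gate: unexpected exit status $rc" ;;
  esac
  # Propagate the original status so the CI step fails when the audit does.
  return "$rc"
}

# Example invocation (not executed here):
#   run_gate bash skills/github-actions-recovery-latency-audit/scripts/recovery-latency-audit.sh
```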
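GitHub Actions Recovery Latency Audit
Use this skill to measure how quickly workflows recover after a failure and to detect groups that stay red for too long.
What this skill does
- Reads GitHub Actions run JSON exports
- Groups runs by repository + workflow + branch + event
- Builds failure incidents (from the first failing run until the next successful run)
- Reports recovery latency for closed incidents
- Reports the unresolved incident count and the age of the oldest unresolved incident
- Scores severity (ok, warn, critical) for triage and CI gates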
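The P95 thresholds among the inputs suggest that recovery latency is summarized as a 95th percentile per group. One common way to compute that is the nearest-rank method, sketched below. `p95` is a hypothetical helper working on whole numbers (for example latencies in hours); the percentile method the skill actually uses is not specified here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the nearest-rank 95th percentile of the
# integers passed as arguments. Returns 1 when called without values.
p95() {
  local n=$#
  [ "$n" -gt 0 ] || return 1
  # Nearest-rank position: ceil(0.95 * n), using integer arithmetic only.
  local rank=$(( (95 * n + 99) / 100 ))
  printf '%s\n' "$@" | sort -n | sed -n "${rank}p"
}

p95 1 2 3 4 40   # prints 40: with 5 values the rank is ceil(4.75) = 5
```

With small groups the nearest-rank P95 is simply the largest value, which is one reason a minimum run count per group can be useful before scoring.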
Inputs
Optional:
- RUN_GLOB (default: artifacts/github-actions/*.json)
- TOP_N (default: 20)
- OUTPUT_FORMAT (text or json, default: text)
- MIN_RUNS (default: 4)
- WARN_P95_HOURS (default: 6)
- CRITICAL_P95_HOURS (default: 18)
- WARN_OPEN_HOURS (default: 12)
- CRITICAL_OPEN_HOURS (default: 36)
- WARN_OPEN_INCIDENTS (default: 1)
- CRITICAL_OPEN_INCIDENTS (default: 2)
- NOW_ISO (optional fixed clock for deterministic tests)
- WORKFLOW_MATCH / WORKFLOW_EXCLUDE (regex)
- BRANCH_MATCH / BRANCH_EXCLUDE (regex)
- EVENT_MATCH / EVENT_EXCLUDE (regex)
- REPO_MATCH / REPO_EXCLUDE (regex)
- FAIL_ON_CRITICAL (0 or 1, default: 0)
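The match/exclude pairs for workflow, branch, event, and repository are regex filters. A small sketch of how one such pair could be applied to a single value follows. `passes_filter` is a hypothetical helper; it uses bash's `[[ =~ ]]` operator (POSIX extended regular expressions) and treats an empty pattern as "not set". The regex dialect and the precedence between match and exclude in the real script are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical filter: succeeds (status 0) when $1 passes a match/exclude pair.
#   $1 value to test, $2 match regex (empty = match all), $3 exclude regex (empty = exclude none)
passes_filter() {
  local value=$1 match=$2 exclude=$3
  # A value must satisfy the match pattern when one is given ...
  if [ -n "$match" ] && ! [[ $value =~ $match ]]; then return 1; fi
  # ... and must not satisfy the exclude pattern when one is given.
  if [ -n "$exclude" ] && [[ $value =~ $exclude ]]; then return 1; fi
  return 0
}

# With no patterns set, every branch is kept.
if passes_filter "main" "" ""; then
  echo "keep main"
fi
```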
Collect run JSON
```bash
gh run view --json databaseId,workflowName,event,conclusion,headBranch,createdAt,url,repository \
> artifacts/github-actions/run-.json
```
Run
Text report:
```bash
RUN_GLOB=artifacts/github-actions/*.json \
TOP_N=15 \
bash skills/github-actions-recovery-latency-audit/scripts/recovery-latency-audit.sh
```
JSON + fail gate:
```bash
RUN_GLOB=artifacts/github-actions/*.json \
OUTPUT_FORMAT=json \
FAIL_ON_CRITICAL=1 \
bash skills/github-actions-recovery-latency-audit/scripts/recovery-latency-audit.sh
```
Run against bundled fixtures:
```bash
RUN_GLOB=skills/github-actions-recovery-latency-audit/fixtures/*.json \
NOW_ISO=2026-03-07T14:00:00Z \
bash skills/github-actions-recovery-latency-audit/scripts/recovery-latency-audit.sh
```
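Pinning NOW_ISO matters because the age of an unresolved incident depends on the current time; with a fixed clock the same fixtures always produce the same ages. A minimal sketch of that age computation is shown below. `open_age_hours` is a hypothetical helper and it assumes GNU `date` (the `-d` option), so it will not work unchanged with BSD/macOS `date`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: whole hours between an incident start and "now".
#   $1 incident start (ISO 8601), $2 optional clock override (defaults to NOW_ISO,
#   then to the real current time). Assumes GNU date for parsing with -d.
open_age_hours() {
  local started=$1 now=${2:-${NOW_ISO:-}}
  local start_s now_s
  start_s=$(date -u -d "$started" +%s) || return 1
  if [ -n "$now" ]; then
    now_s=$(date -u -d "$now" +%s) || return 1
  else
    now_s=$(date -u +%s)
  fi
  # Integer division truncates partial hours.
  echo $(( (now_s - start_s) / 3600 ))
}

open_age_hours 2026-03-06T02:00:00Z 2026-03-07T14:00:00Z   # prints 36
```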
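Output contract
- Exit 0 in report mode (default)
- Exit 1 when FAIL_ON_CRITICAL=1 and one or more groups are critical
- Text mode prints summary + ranked recovery-risk groups
- JSON mode prints summary + ranked groups + critical groups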