ABS Data API Skill
Query live ABS datasets, return data + citations, optional tables/charts/reports.
Bundled Resources
| File | Purpose |
|---|---|
| `scripts/abs_cache.py` | Metadata cache manager: refreshes the catalog, searches all 1,200+ dataflows, generates structured metadata |
| `scripts/abs_search.py` | NL → dataset mapper: curated lookup + fuzzy fallback + ambiguity detection |
| `scripts/abs_query.py` | Query engine: fetches data, formats output, summary/report/describe modes |
| `scripts/test_presets.py` | Preset validation: tests all presets against the live API, pass/fail summary |
| `presets.json` | 20 validated preset queries for common indicators |
| `metadata.overrides.json` | Manual overrides for discontinued datasets and nicer labels |
| `references/dataset-catalog.md` | ~55 curated datasets with IDs, versions, notes (human reference) |
| `references/api-guide.md` | ABS API URL patterns, response structure, example queries |
| `references/sdmx-patterns.md` | Dimension codes (REGION, TSEST, FREQ, MEASURE) per dataset |
Quick Start
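```bash
# 1. Warm the cache (run once; auto-refreshes after 24 hours)
python3 scripts/abs_cache.py refresh
python3 scripts/abs_cache.py gen-metadata

# 2. Search for a dataset (with ambiguity hints)
python3 scripts/abs_search.py "unemployment rate"

# 3. List presets
python3 scripts/abs_query.py --list-presets

# 4. Describe a preset
python3 scripts/abs_query.py --describe-preset cpi-annual-change

# 5. Query the latest data
python3 scripts/abs_query.py --preset cpi-annual-change --latest --format table

# 6. Summary brief (latest value + change context)
python3 scripts/abs_query.py --preset cpi-annual-change --summary latest

# 7. Macro snapshot
python3 scripts/abs_query.py --report macro-snapshot

# 8. Chart
python3 scripts/abs_query.py --preset gdp-chain-volume --start-period 2020-Q1 --chart
```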
Workflow
Step 1 — Identify the dataset
1. Check `references/dataset-catalog.md` for the dataflow ID and version.
2. If not found, run `python3 scripts/abs_search.py "<user query>"` for a fuzzy match + ambiguity hints.
3. If still not found, run `python3 scripts/abs_cache.py search "<term>"` (searches all 1,200+ dataflows).
Step 2 — Determine dimension key
1. Check `presets.json`. If a preset exists, use it directly.
2. Read `references/sdmx-patterns.md` for common dimension codes.
3. For an unfamiliar dataset, fetch its structure:
```bash
python3 scripts/abs_cache.py structure <ID> [VER]
```
Step 3 — Query the data
CODEBLOCK2
Step 4 — Format and deliver
- The default text format includes the citation. Use `--format table` for markdown tables.
- Charts require `matplotlib`; the tool falls back gracefully if it is not installed.
- Use `--summary latest` for quick briefs with change context.
- Use `--report macro-snapshot` for a full multi-indicator briefing.
- Always include the citation line in any response to the user.
Presets (20 validated)
Common indicator queries are bundled in `presets.json`. All were validated against the live API in March 2026.
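```bash
# List all available presets
python3 scripts/abs_query.py --list-presets

# Describe a preset (shows what it measures and when to use it)
python3 scripts/abs_query.py --describe-preset unemployment-rate

# Run a preset
python3 scripts/abs_query.py --preset cpi-annual-change --latest --format table
python3 scripts/abs_query.py --preset unemployment-rate --latest
python3 scripts/abs_query.py --preset gdp-annual-change --chart
python3 scripts/abs_query.py --preset wage-annual-change --start-period 2020-Q1
python3 scripts/abs_query.py --preset population-national --format csv
python3 scripts/abs_query.py --preset dwelling-prices-mean --format table
python3 scripts/abs_query.py --preset trade-balance --start-period 2024-01
python3 scripts/abs_query.py --preset household-spending-change --summary latest
```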
Key presets: cpi-annual-change, unemployment-rate, participation-rate, employment-level,
underemployment-rate, labour-force-size, gdp-annual-change, wage-annual-change,
population-national, dwelling-prices-mean, trade-balance, goods-exports, goods-imports,
household-spending-change.
Output Formats
| Flag | Output |
|---|---|
| (default) | Human-readable text with friendly labels + citation |
| `--format table` | Markdown table with friendly labels and rendered periods |
| `--format csv` | CSV with raw codes + citation comment |
| `--format json` | JSON with raw codes + `*_label` fields + `TIME_PERIOD_rendered` |
| `--chart` | PNG chart with dataset title, subtitle, latest-point annotation |
| `--summary latest` | Latest value + previous + absolute/percentage-point deltas + textual summary |
| `--report macro-snapshot` | Compact multi-indicator macro briefing (7 key economic indicators) |
| `--citation-style analyst` | Analyst-style source footnote block |
| `--flat-view` | AllDimensions format (wider; may be large) |
Period Rendering
All output modes now render periods in human-readable format:
- `2026-01` → January 2026
- `2025-Q4` → December quarter 2025
- `2025-Q1` → March quarter 2025
- Ranges: `March quarter 2024 to December quarter 2025`
This applies to table headers, text output, citations, chart labels, and summary/report output.
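The mapping above can be sketched in a few lines of Python. This is only an illustration of the rule implied by the examples (quarters are named after the month in which they end); the function names `render_period` and `render_range` are hypothetical and are not taken from `abs_query.py`.

```python
import re

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
# Quarter number -> month in which that quarter ends.
QUARTER_END = {1: "March", 2: "June", 3: "September", 4: "December"}


def render_period(code: str) -> str:
    """Render a period code such as 2026-01 or 2025-Q4 as readable text."""
    quarter = re.fullmatch(r"(\d{4})-Q([1-4])", code)
    if quarter:
        return f"{QUARTER_END[int(quarter.group(2))]} quarter {quarter.group(1)}"
    month = re.fullmatch(r"(\d{4})-(0[1-9]|1[0-2])", code)
    if month:
        return f"{MONTHS[int(month.group(2)) - 1]} {month.group(1)}"
    # Unrecognised shapes (for example annual periods) pass through unchanged.
    return code


def render_range(start: str, end: str) -> str:
    """Render a start/end pair as a single readable range."""
    return f"{render_period(start)} to {render_period(end)}"


print(render_period("2025-Q4"))
print(render_range("2024-Q1", "2025-Q4"))
```

Passing unknown codes through unchanged keeps the sketch safe for period shapes this document does not describe.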
JSON Output with Labels
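`--format json` returns both raw dimension codes and friendly `*_label` fields:
```json
{
  "TSEST": "20",
  "TSEST_label": "Seasonally Adjusted",
  "TIME_PERIOD": "2026-02",
  "TIME_PERIOD_rendered": "February 2026",
  "value": 4.277
}
```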
Backward compatible — raw codes are preserved.
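A consumer of this output might separate raw codes from labels as in the sketch below. It assumes the output is a JSON array of flat records with `TSEST`, `TSEST_label`, `TIME_PERIOD`, `TIME_PERIOD_rendered` and `value` fields; the array wrapper, the string type of `TSEST`, and the helper names `split_record` and `describe` are assumptions for illustration.

```python
import json

# Assumed shape: a JSON array of flat observation records.
sample = """
[{"TSEST": "20", "TSEST_label": "Seasonally Adjusted",
  "TIME_PERIOD": "2026-02", "TIME_PERIOD_rendered": "February 2026",
  "value": 4.277}]
"""

SUFFIX = "_label"


def split_record(record: dict) -> tuple[dict, dict]:
    """Return (raw fields, labels keyed by the dimension they describe)."""
    labels = {k[: -len(SUFFIX)]: v for k, v in record.items() if k.endswith(SUFFIX)}
    raw = {k: v for k, v in record.items() if not k.endswith(SUFFIX)}
    return raw, labels


def describe(record: dict) -> str:
    """Build a one-line description, preferring the rendered period."""
    _, labels = split_record(record)
    period = record.get("TIME_PERIOD_rendered", record.get("TIME_PERIOD"))
    return f"{period}: {record['value']} ({', '.join(labels.values())})"


records = json.loads(sample)
print(describe(records[0]))
```

Because the raw codes stay in the record, code that ignores the `*_label` fields keeps working.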
Ambiguity Detection
`abs_search.py` classifies ambiguity when multiple datasets match:
- frequency: monthly vs quarterly
- geography: national vs state vs SA2/LGA
- measure: index vs % change vs level
- series: original vs seasonally adjusted
- dataset: distinct series cover the same topic

It prints clarifying questions to help the user or agent narrow the query.
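One way to picture the classification is to compare candidate datasets attribute by attribute and report where they differ. The sketch below is a simplified illustration, not the logic in `abs_search.py`; the candidate dictionaries, their field names, and the function `classify_ambiguity` are all hypothetical.

```python
# Attributes compared, in the order the ambiguity types are listed above.
AMBIGUITY_FIELDS = ("frequency", "geography", "measure", "series")


def classify_ambiguity(candidates: list[dict]) -> list[str]:
    """Return the ambiguity types that distinguish the matching candidates."""
    if len(candidates) < 2:
        return []  # zero or one match: nothing to disambiguate
    kinds = [
        field
        for field in AMBIGUITY_FIELDS
        if len({c.get(field) for c in candidates}) > 1
    ]
    # Same attributes everywhere but still several matches: distinct datasets.
    return kinds or ["dataset"]


# Hypothetical candidates that differ only in frequency.
matches = [
    {"id": "A", "frequency": "quarterly", "geography": "national",
     "measure": "index", "series": "original"},
    {"id": "B", "frequency": "monthly", "geography": "national",
     "measure": "index", "series": "original"},
]
print(classify_ambiguity(matches))
```

Each returned type can then drive one clarifying question, such as asking whether monthly or quarterly data is wanted.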
Cache and Metadata
| Command | Description |
|---|---|
| `abs_cache.py refresh` | Fetch all dataflows from ABS, save to `~/.cache/abs-data-api/catalog.json` |
| `abs_cache.py gen-metadata` | Generate `metadata.generated.json` from presets + catalog + overrides |
| `abs_cache.py status` | Show cache age, dataflow count, structure count, metadata status |
| `abs_cache.py search <term>` | Search across all cached dataflows |
| `abs_cache.py structure <ID> [VER]` | Fetch and cache DSD for a specific dataflow |
Runtime metadata priority: metadata.generated.json > catalog.json > dataset-catalog.md.
Override quirks (discontinued datasets, nicer labels) in metadata.overrides.json.
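The priority order can be read as a first-match lookup across three sources. The following is a minimal sketch under the assumption that each source has already been loaded into a dictionary keyed by dataflow ID; the function name `resolve_metadata` and the entry contents are hypothetical.

```python
def resolve_metadata(dataflow_id: str, generated: dict, catalog: dict, curated: dict):
    """Return (source name, entry) from the highest-priority source that has the ID."""
    sources = (
        ("metadata.generated.json", generated),
        ("catalog.json", catalog),
        ("dataset-catalog.md", curated),
    )
    for name, entries in sources:
        if dataflow_id in entries:
            return name, entries[dataflow_id]
    return None, None  # unknown everywhere


# Illustrative, made-up entries.
generated = {"CPI": {"version": "2.0.0"}}
catalog = {"CPI": {"version": "1.0.0"}, "X": {"version": "1.1.0"}}
curated = {"Y": {"version": "1.0.0"}}

print(resolve_metadata("CPI", generated, catalog, curated))
```

Here the generated metadata wins for `CPI` even though the catalog also has an entry.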
Validation
CODEBLOCK5
Ambiguity Rules
- Multiple matching datasets: prefer the most specific. E.g. for "inflation", `CPI` beats `CPI_M` beats `PPI`.
- No dimension key provided: use `all`. The API returns everything; filter afterwards. If the response is large (>100 observations), the tool warns you.
- Version unknown: look it up in the generated metadata, then the catalog; try `1.0.0` as a last resort.
- User asks for "latest": always add the `--latest` flag (uses `lastNObservations=1`).
- Census data requested: redirect to the `census-database` skill; this skill handles ABS time series only.
- Chart requested but matplotlib missing: output text/table format and note how to install matplotlib.
- Retail Trade (`RT`) requested: discontinued after June 2025. Use `HSI_M` or `BUSINESS_TURNOVER` instead.
- RPPI requested: note that the API only has data to ~2021-Q4. Use `RES_DWELL_ST` for current dwelling prices.
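Several of these rules are simple defaults and can be sketched as code. The example below assumes the data URL follows the pattern visible in the citation example (`.../rest/data/ABS,<ID>,<version>/`) with the dimension key appended after it; that URL layout, and the names `build_url`, `replacement_hint` and `REPLACEMENTS`, are assumptions for illustration rather than the query engine's actual code.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.api.abs.gov.au/rest/data"

# Datasets this document flags as discontinued or stale, with suggested replacements.
REPLACEMENTS = {
    "RT": ("discontinued after June 2025", ["HSI_M", "BUSINESS_TURNOVER"]),
    "RPPI": ("API data ends around 2021-Q4", ["RES_DWELL_ST"]),
}


def replacement_hint(dataflow_id: str):
    """Return a redirect note for flagged datasets, or None."""
    entry = REPLACEMENTS.get(dataflow_id)
    if entry is None:
        return None
    reason, alternatives = entry
    return f"{dataflow_id}: {reason}; consider {' or '.join(alternatives)}"


def build_url(dataflow_id: str, version=None, key=None, latest=False) -> str:
    """Apply the defaults: version 1.0.0 as last resort, key 'all', one observation for latest."""
    version = version or "1.0.0"
    key = key or "all"
    url = f"{BASE_URL}/ABS,{dataflow_id},{version}/{key}"
    if latest:
        url += "?" + urlencode({"lastNObservations": 1})
    return url


print(replacement_hint("RT"))
print(build_url("CPI", "2.0.0", latest=True))
```

In practice the version would come from the generated metadata or the catalog before the `1.0.0` default is reached.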
Citation Format
All responses include a citation:
Source: Australian Bureau of Statistics, <Full Dataset Name> (Cat. <catalogue-number>; dataset <ID>; v<version>). <human-readable-period>. Retrieved via ABS Data API: <url>.
Example:
Source: Australian Bureau of Statistics, Consumer Price Index (Cat. 6401.0; dataset CPI; v2.0.0). January 2026. Retrieved via ABS Data API: https://data.api.abs.gov.au/rest/data/ABS,CPI,2.0.0/.
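For reference, the template can be expressed as a small formatting function. This is a sketch of the string layout only; the function name `format_citation` and its parameters are hypothetical.

```python
def format_citation(name: str, catalogue: str, dataset_id: str,
                    version: str, period: str, url: str) -> str:
    """Fill the citation template with dataset details and a rendered period."""
    return (
        f"Source: Australian Bureau of Statistics, {name} "
        f"(Cat. {catalogue}; dataset {dataset_id}; v{version}). "
        f"{period}. Retrieved via ABS Data API: {url}."
    )


citation = format_citation(
    name="Consumer Price Index",
    catalogue="6401.0",
    dataset_id="CPI",
    version="2.0.0",
    period="January 2026",
    url="https://data.api.abs.gov.au/rest/data/ABS,CPI,2.0.0/",
)
print(citation)
```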
What's New in v1.0.2
1. Metadata Generation
- `gen-metadata` command: builds unified metadata from presets + live catalog + manual overrides
- Auto-refresh: generated metadata is regenerated automatically when older than 24 hours
- Ensures all datasets are findable and correctly labeled, even as the ABS API evolves
CODEBLOCK6
2. Smart Ambiguity Detection
- Classifies ambiguity when multiple datasets match a user query (frequency, geography, measure, series, dataset)
- Provides clarifying questions grouped by intent (prices, wages, employment, housing, etc.)
- Flags discontinued datasets with replacement suggestions (e.g., RT → HSI_M)
- Uses curated intent groups + ambiguity tags to guide disambiguation
CODEBLOCK7
3. Summary Mode with Change Context
- `--summary latest`: shows latest value + previous + absolute deltas + brief summary
- Automatically detects rates/growth measures and uses percentage-point notation instead of misleading relative % changes
  - Example: unemployment rate rises from 4.0% to 4.3% → "change of +0.3 percentage points" (NOT "+7.5% relative change")
  - Applies to: unemployment, participation, inflation rates, growth measures
- Ideal for quick briefings and executive summaries
CODEBLOCK8
4. Macro-Snapshot Report
- `--report macro-snapshot`: single-command economic briefing covering 7 key indicators
- Fetches CPI, unemployment, participation, employment, GDP growth, wage growth, household spending
- All with change context and period rendering
- Suited to media snippets or executive briefings
CODEBLOCK9
5. Percentage-Point Delta Fix
- Smart detection: automatically recognizes rates and growth measures via keyword matching
- Applies percentage-point notation to avoid confusion with relative % changes
- Examples:
  - Unemployment: 4.0% → 4.3% = `+0.3 percentage points` (not +7.5%)
  - CPI: 3.5% → 3.2% = `-0.3 percentage points`
  - Wage growth: 4.1% → 4.0% = `-0.1 percentage points`
- Applies to all output modes: text, table, JSON, summary
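The distinction can be made concrete with a short sketch. The keyword list and the function names `is_rate_measure` and `describe_change` are assumptions for illustration; the actual keywords used by the query engine are not given in this document, and naive substring matching like this would also match unrelated words containing "rate".

```python
# Hypothetical keywords marking a measure that is already a percentage.
RATE_KEYWORDS = ("rate", "change", "growth", "inflation")


def is_rate_measure(label: str) -> bool:
    """Guess from the label whether values are percentages (rates or growth)."""
    lowered = label.lower()
    return any(keyword in lowered for keyword in RATE_KEYWORDS)


def describe_change(label: str, previous: float, latest: float) -> str:
    """Report a delta in percentage points for rates, else absolute + relative."""
    delta = latest - previous
    if is_rate_measure(label):
        return f"{delta:+.1f} percentage points"
    if previous == 0:
        return f"{delta:+,.1f}"  # relative change is undefined from zero
    relative = delta / previous * 100
    return f"{delta:+,.1f} ({relative:+.1f}%)"


print(describe_change("Unemployment rate", 4.0, 4.3))
print(describe_change("Employed persons", 14000.0, 14070.0))
```

For the unemployment example the relative change would be 0.3 / 4.0 = 7.5%, which is the figure the percentage-point notation is meant to avoid.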
6. Metadata Overrides (metadata.overrides.json)
- Discontinued datasets (RT → HSI_M, RPPI stale warning)
- Friendly names for complex dataset IDs
- Replacement hints with explanations
- Easy to extend for future dataset changes
The query engine appends the citation line automatically. Do not strip it from tool output.
Changelog
v1.0.2 (March 2026)
New Features:
- ✨ Metadata generation (`gen-metadata` command): builds unified metadata from presets + catalog + overrides with auto-refresh
- ✨ Smart ambiguity detection: classifies multiple matches by type (frequency, geography, measure, series, dataset) and provides grouped clarifying questions
- ✨ Summary mode with change context (`--summary latest`): shows latest + previous + absolute deltas + brief summary
- ✨ Macro-snapshot report (`--report macro-snapshot`): single-command economic briefing covering 7 key indicators
- ✨ Percentage-point delta fix: rates/growth measures automatically use pp notation instead of misleading relative % changes
- ✨ Intent grouping: curated entries now include `intent_group` and `ambiguity_tags` for smarter disambiguation
Improvements:
- Discontinued dataset detection (RT → HSI_M, RPPI stale warning)
- Better metadata overrides system for dataset quirks
- Enhanced search with ambiguity classification
- All output modes now respect percentage-point notation where applicable
Affected Scripts:
- `abs_cache.py`: added `gen-metadata` command and `generate_metadata()` function
- `abs_search.py`: added ambiguity detection, intent grouping, and clarifying questions
- `abs_query.py`: added `--summary latest`, `--report macro-snapshot`, percentage-point delta detection
- `metadata.overrides.json`: new file for manual dataset overrides
v1.0.1 (Previous)
- Base preset system with 20 validated queries
- Curated dataset catalog and SDMX dimension references
- Cache refresh and fuzzy search capabilities
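ABS Data API Skill
Query live ABS datasets, return data + citations, optional tables/charts/reports.
Bundled Resources
| File | Purpose |
|---|---|
| `scripts/abs_cache.py` | Metadata cache manager: refreshes the catalog, searches all 1,200+ dataflows, generates structured metadata |
| `scripts/abs_search.py` | Natural language → dataset mapper: curated lookup + fuzzy fallback + ambiguity detection |
| `scripts/abs_query.py` | Query engine: fetches data, formats output, summary/report/describe modes |
| `scripts/test_presets.py` | Preset validation: tests all presets against the live API, pass/fail summary |
| `presets.json` | 20 validated preset queries for common indicators |
| `metadata.overrides.json` | Manual overrides for discontinued datasets and nicer labels |
| `references/dataset-catalog.md` | About 55 curated datasets with IDs, versions, notes (human reference) |
| `references/api-guide.md` | ABS API URL patterns, response structure, example queries |
| `references/sdmx-patterns.md` | Dimension codes (REGION, TSEST, FREQ, MEASURE) per dataset |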
Quick Start
```bash
# 1. Warm the cache (run once; auto-refreshes after 24 hours)
python3 scripts/abs_cache.py refresh
python3 scripts/abs_cache.py gen-metadata

# 2. Search for a dataset (with ambiguity hints)
python3 scripts/abs_search.py "unemployment rate"

# 3. List presets
python3 scripts/abs_query.py --list-presets

# 4. Describe a preset
python3 scripts/abs_query.py --describe-preset cpi-annual-change

# 5. Query the latest data
python3 scripts/abs_query.py --preset cpi-annual-change --latest --format table

# 6. Summary brief (latest value + change context)
python3 scripts/abs_query.py --preset cpi-annual-change --summary latest

# 7. Macro snapshot
python3 scripts/abs_query.py --report macro-snapshot

# 8. Chart
python3 scripts/abs_query.py --preset gdp-chain-volume --start-period 2020-Q1 --chart
```
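Workflow
Step 1: Identify the dataset
1. Check `references/dataset-catalog.md` for the dataflow ID and version.
2. If not found, run `python3 scripts/abs_search.py "<user query>"` for a fuzzy match + ambiguity hints.
3. If still not found, run `python3 scripts/abs_cache.py search "<term>"` (searches all 1,200+ dataflows).

Step 2: Determine dimension key
1. Check `presets.json`. If a preset exists, use it directly.
2. Read `references/sdmx-patterns.md` for common dimension codes.
3. For an unfamiliar dataset, fetch its structure:
```bash
python3 scripts/abs_cache.py structure <ID> [VER]
```
Step 3: Query the data
```bash
python3 scripts/abs_query.py [KEY] [--version V] [--start-period P] [--end-period P] [--latest] [--format text|csv|json|table] [--chart] [--out FILE]
```
Step 4: Format and deliver
- The default text format includes the citation. Use `--format table` for markdown tables.
- Charts require `matplotlib`; the tool falls back gracefully if it is not installed.
- Use `--summary latest` for quick briefs with change context.
- Use `--report macro-snapshot` for a full multi-indicator briefing.
- Always include the citation line in any response to the user.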
Presets (20 validated)
Common indicator queries are bundled in `presets.json`. All were validated against the live API in March 2026.
```bash
# List all available presets
python3 scripts/abs_query.py --list-presets

# Describe a preset (shows what it measures and when to use it)
python3 scripts/abs_query.py --describe-preset unemployment-rate

# Run a preset
python3 scripts/abs_query.py --preset cpi-annual-change --latest --format table
python3 scripts/abs_query.py --preset unemployment-rate --latest
python3 scripts/abs_query.py --preset gdp-annual-change --chart
python3 scripts/abs_query.py --preset wage-annual-change --start-period 2020-Q1
python3 scripts/abs_query.py --preset population-national --format csv
python3 scripts/abs_query.py --preset dwelling-prices-mean --format table
python3 scripts/abs_query.py --preset trade-balance --start-period 2024-01
python3 scripts/abs_query.py --preset household-spending-change --summary latest
```
Key presets: cpi-annual-change, unemployment-rate, participation-rate, employment-level,
underemployment-rate, labour-force-size, gdp-annual-change, wage-annual-change,
population-national, dwelling-prices-mean, trade-balance, goods-exports, goods-imports,
household-spending-change.
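Output Formats
| Flag | Output |
|---|---|
| (default) | Human-readable text with friendly labels + citation |
| `--format table` | Markdown table with friendly labels and rendered periods |
| `--format csv` | CSV with raw codes + citation comment |
| `--format json` | JSON with raw codes + `*_label` fields + `TIME_PERIOD_rendered` |
| `--chart` | PNG chart with dataset title, subtitle, latest-point annotation |
| `--summary latest` | Latest value + previous + absolute/percentage-point deltas + textual summary |
| `--report macro-snapshot` | Compact multi-indicator macro briefing (7 key economic indicators) |
| `--citation-style analyst` | Analyst-style source footnote block |
| `--flat-view` | AllDimensions format (wider; may be large) |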
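Period Rendering
All output modes now render periods in human-readable format:
- `2026-01` → January 2026
- `2025-Q4` → December quarter 2025
- `2025-Q1` → March quarter 2025
- Ranges: `March quarter 2024 to December quarter 2025`

This applies to table headers, text output, citations, chart labels, and summary/report output.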
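JSON Output with Labels
`--format json` returns both raw dimension codes and friendly `*_label` fields:
```json
{
  "TSEST": "20",
  "TSEST_label": "Seasonally Adjusted",
  "TIME_PERIOD": "2026-02",
  "TIME_PERIOD_rendered": "February 2026",
  "value": 4.277
}
```
Backward compatible: raw codes are preserved.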
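Ambiguity Detection
`abs_search.py` classifies ambiguity when multiple datasets match:
- frequency: monthly vs quarterly
- geography: national vs state vs SA2/LGA
- measure: index vs % change vs level
- series: original vs seasonally adjusted
- dataset: distinct series cover the same topic

It prints clarifying questions to help the user or agent narrow the query.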
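Cache and Metadata
| Command | Description |
|---|---|
| `abs_cache.py refresh` | Fetch all dataflows from ABS, save to `~/.cache/abs-data-api/catalog.json` |
| `abs_cache.py gen-metadata` | Generate `metadata.generated.json` from presets + catalog + overrides |
| `abs_cache.py status` | Show cache age, dataflow count, structure count, metadata status |
| `abs_cache.py search <term>` | Search across all cached dataflows |
| `abs_cache.py structure <ID> [VER]` | Fetch and cache DSD for a specific dataflow |

Runtime metadata priority: metadata.generated.json > catalog.json > dataset-catalog.md.
Override quirks (discontinued datasets, nicer labels) in metadata.overrides.json.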
Validation
```bash
python3 scripts/test_presets.py # test all presets
python3 scripts/test_presets.py --verbose # with timing
python3 scripts/test_presets.py --
```