Google Analytics & Search Console Data-Driven Improvement
Analyze GSC and GA4 data, combined with browser auditing and source code review, to generate improvement plans covering six dimensions: SEO, Performance, Content Strategy, UX, Conversion Rate, and Technical Issues.
Data Storage
All runtime data is stored in $DATA_DIR, separated from skill code.
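```
<project-root>/.skills-data/google-analytics-and-search-improve/
  .env       # Config (auth, URLs, etc.), auto-loaded by scripts
  data/      # GSC/GA4/PSI data (JSON or CSV)
  tmp/       # Screenshots and temporary files
  cache/     # API response cache
  configs/   # Config files
  logs/      # Execution logs
  venv/      # Python virtual environment
```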
Workflow
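Analysis progress:
- [ ] Phase 1: Select data source & collect data
- [ ] Phase 2: GSC data analysis
- [ ] Phase 3: GA4 data analysis
- [ ] Phase 4: Live site audit
- [ ] Phase 5: Source code review
- [ ] Phase 5b: SEO & GEO optimization checklist audit
- [ ] Phase 6: Generate improvement report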
Phase 1: Select Data Source & Collect Data
1a. Initialize directories:
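```bash
DATA_DIR=.skills-data/google-analytics-and-search-improve
mkdir -p $DATA_DIR/{data,cache,logs,tmp}
```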
1b. Ask user to choose data source:
Present three modes for the user to choose from:
Choose how to obtain GSC/GA4 data:
A. API auto-collection (recommended, most complete data)
Requires creating a Google Cloud Service Account and configuring API auth. First-time setup takes ~10 minutes; subsequent analyses collect data automatically.
B. Manual CSV export (zero config, simplest)
You export data files from GA4 and GSC web consoles yourself, and I'll analyze them. No API configuration needed.
C. Browser audit only (no GA4/GSC data needed)
I'll visit the site directly for technical auditing and code analysis without using GA4/GSC data. Best for quick technical checks.
Enter the corresponding branch based on user selection:
Mode A: API Auto-Collection
Check .env: Read $DATA_DIR/.env; if missing config, guide the user to fill it in.
Configuration required from user (write to $DATA_DIR/.env after collection):
| Variable | Description |
|---|---|
| SITE_URL | Website URL to audit (e.g., https://example.com) |
| GOOGLE_APPLICATION_CREDENTIALS | Absolute path to the Service Account JSON key file on your machine |
| GSC_SITE_URL | Site address in Search Console (see format note below) |
| GA4_PROPERTY_ID | GA4 Property ID (numeric only) |
| SOURCE_CODE_PATH | (Optional) Path to the project source code |
| PSI_API_KEY | (Optional) PageSpeed Insights API Key to avoid rate limiting |
GSC_SITE_URL format note: GSC has two property types with different formats. The value must match the property type registered in GSC; otherwise the API returns a 403 permission error:
| GSC Property Type | GSC_SITE_URL Format | Example |
|---|---|---|
| Domain property | sc-domain:domain | sc-domain:example.com |
| URL-prefix property | Full URL | https://example.com |
How to check: In the Search Console property selector (top-left), if it shows a bare domain name it's a Domain property (use sc-domain: prefix); if it shows a full URL it's a URL-prefix property.
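The format rule above can be checked before any API call is made. The following is a minimal sketch and not part of the skill's scripts: the function name `classify_gsc_site_url` is hypothetical, and it only inspects the shape of the string, so it cannot confirm which property type is actually registered in GSC.

```python
from urllib.parse import urlparse


def classify_gsc_site_url(value: str) -> str:
    """Return 'domain', 'url-prefix', or 'invalid' for a GSC_SITE_URL value."""
    value = value.strip()
    if value.startswith("sc-domain:"):
        domain = value[len("sc-domain:"):]
        # A Domain property is a bare host name: no scheme and no path.
        if domain and "/" not in domain and ":" not in domain:
            return "domain"
        return "invalid"
    parsed = urlparse(value)
    # A URL-prefix property is a full URL with scheme and host.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url-prefix"
    return "invalid"
```

A value such as `example.com` (no prefix, no scheme) is reported as invalid, which is the case most likely to end in the 403 error described above.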
Detailed auth setup steps in references/gsc-api-guide.md.
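```bash
cat > $DATA_DIR/.env <<EOF
SITE_URL=provided by user
GOOGLE_APPLICATION_CREDENTIALS=provided by user (absolute path)
GSC_SITE_URL=provided by user (note sc-domain: or https:// format)
GA4_PROPERTY_ID=provided by user
SOURCE_CODE_PATH=provided by user
PSI_API_KEY=
EOF
```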
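The "Check .env" step for Mode A could be sketched as follows. This is an illustration under assumptions, not the skill's own loader: `parse_env` and `missing_keys` are hypothetical helpers, the parser handles only simple `KEY=value` lines, and the required keys are taken from the non-optional rows of the configuration table.

```python
# Keys that Mode A needs; the optional ones (SOURCE_CODE_PATH, PSI_API_KEY) are excluded.
REQUIRED_KEYS = (
    "SITE_URL",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "GSC_SITE_URL",
    "GA4_PROPERTY_ID",
)


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blanks and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict) -> list:
    """Return the required keys that are absent or empty, in table order."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

Anything returned by `missing_keys` is what the user would be asked to supply before data collection starts.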
Collect data (scripts auto-read auth from .env):
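```bash
set -a; source $DATA_DIR/.env; set +a
python scripts/gsc_query.py --dimensions query --limit 500 -o $DATA_DIR/data/gsc_queries.json
python scripts/gsc_query.py --dimensions page --limit 500 -o $DATA_DIR/data/gsc_pages.json
python scripts/gsc_query.py --dimensions device,country -o $DATA_DIR/data/gsc_devices.json
python scripts/gsc_query.py --dimensions date -o $DATA_DIR/data/gsc_trends.json
python scripts/gsc_query.py --mode sitemaps -o $DATA_DIR/data/gsc_sitemaps.json
python scripts/ga4_query.py --preset traffic_overview -o $DATA_DIR/data/ga4_traffic.json
python scripts/ga4_query.py --preset top_pages --limit 100 -o $DATA_DIR/data/ga4_pages.json
python scripts/ga4_query.py --preset user_acquisition -o $DATA_DIR/data/ga4_acquisition.json
python scripts/ga4_query.py --preset device_breakdown -o $DATA_DIR/data/ga4_devices.json
python scripts/ga4_query.py --preset landing_pages --limit 50 -o $DATA_DIR/data/ga4_landing.json
python scripts/ga4_query.py --preset user_behavior --limit 100 -o $DATA_DIR/data/ga4_behavior.json
python scripts/ga4_query.py --preset conversion_events -o $DATA_DIR/data/ga4_conversions.json
```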
First-time use requires installing dependencies:
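```bash
python3 -m venv $DATA_DIR/venv && source $DATA_DIR/venv/bin/activate
pip install -r scripts/requirements.txt
```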
Script usage details in references/gsc-api-guide.md and references/ga4-api-guide.md.
Mode B: Manual CSV Export
Send the following export instructions to the user, asking them to place files in $DATA_DIR/data/:
Export GSC data:
1. Open Google Search Console → Select your site
2. Click "Search results" (Performance) in the left menu
3. Set date range to last 3 months, click "Export" → "Download CSV"
4. Save the downloaded CSV as $DATA_DIR/data/gsc_export.csv
Export GA4 data (export the following reports):
1. Open Google Analytics → Select your property
2. Export "Pages and screens" report:
   - Left menu: "Reports" → "Engagement" → "Pages and screens"
   - Click the share icon (top-right) → "Download file" → CSV
   - Save as $DATA_DIR/data/ga4_pages.csv
3. Export "Traffic acquisition" report:
   - Left menu: "Reports" → "Acquisition" → "Traffic acquisition"
   - Export CSV → Save as $DATA_DIR/data/ga4_acquisition.csv
4. Export "Landing pages" report:
   - Left menu: "Reports" → "Engagement" → "Landing pages"
   - Export CSV → Save as $DATA_DIR/data/ga4_landing.csv
Let me know when the export is complete, and I'll read the files to start analysis.
Also ask the user for:
- Target website URL (required, write to SITE_URL in $DATA_DIR/.env)
- Source code path (optional, write to SOURCE_CODE_PATH)
After receiving files, read CSV files from $DATA_DIR/data/ and proceed to Phase 2-3 analysis.
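Reading the exported files might look like the sketch below. It assumes the exports can begin with `#`-prefixed metadata lines, as GA4 report downloads commonly do, and that the first remaining line is the header row. The helper name `read_export_csv` and the column names in the usage note are illustrative only.

```python
import csv
import io


def read_export_csv(text: str) -> list:
    """Parse an exported CSV, skipping blank lines and '#' metadata lines."""
    kept = [
        line
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    # The first kept line is treated as the header row.
    return list(csv.DictReader(io.StringIO("\n".join(kept))))
```

Called on the contents of a file such as `$DATA_DIR/data/ga4_pages.csv`, this yields one dictionary per data row keyed by the header names. Values stay as strings, so numeric columns need converting before analysis, and quoted fields that span several lines are not handled by this line-based filter.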
Mode C: Browser Audit Only
Only ask the user for:
- Target website URL (required)
- Source code path (optional)
Write to $DATA_DIR/.env and skip directly to Phase 4 (site audit) and Phase 5 (source code review), skipping Phase 2-3.
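The branching between modes can be summarized as a small lookup. This is a sketch under one assumption: the text says only that Mode C skips Phases 2-3, so Phases 5b and 6 are taken to run in every mode. The function name `phases_for_mode` is hypothetical.

```python
PHASES = ["1", "2", "3", "4", "5", "5b", "6"]


def phases_for_mode(mode: str) -> list:
    """Return the phases to run for data-source mode 'A', 'B', or 'C'."""
    mode = mode.strip().upper()
    if mode in ("A", "B"):
        # API collection and CSV export both feed the GSC/GA4 analysis phases.
        return list(PHASES)
    if mode == "C":
        # Browser audit only: no GSC/GA4 data, so Phases 2 and 3 are skipped.
        return [phase for phase in PHASES if phase not in ("2", "3")]
    raise ValueError(f"unknown mode: {mode!r}")
```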
Phase 2: GSC Data Analysis
Read GSC data (JSON or CSV) from $DATA_DIR/data/, analyze according to the "SEO" dimension thresholds in references/metrics-glossary.md.
Key outputs:
- High-impression low-CTR keywords (best targets for title/description optimization)
- Keywords ranked 4-10 (highest ROI to push into top 3)
- Pages with declining ranking trends
- Index coverage and sitemap health status
Output: Top 10 SEO optimization opportunities with data evidence.
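The first two key outputs can be illustrated with a short filter. This is a sketch only: the row fields (`keys`, `clicks`, `impressions`, `ctr`, `position`) follow the shape of Search Console Search Analytics rows, while the cut-offs `min_impressions` and `max_ctr` are placeholder assumptions. The real thresholds live in references/metrics-glossary.md.

```python
def find_seo_opportunities(rows, min_impressions=1000, max_ctr=0.02, top_n=10):
    """Pick high-impression low-CTR queries and queries ranked 4-10."""
    low_ctr = [
        row for row in rows
        if row["impressions"] >= min_impressions and row["ctr"] < max_ctr
    ]
    rank_4_10 = [row for row in rows if 4 <= row["position"] <= 10]
    # Larger impression counts mean more potential clicks, so list those first.
    low_ctr.sort(key=lambda row: row["impressions"], reverse=True)
    rank_4_10.sort(key=lambda row: row["impressions"], reverse=True)
    return {"low_ctr": low_ctr[:top_n], "rank_4_10": rank_4_10[:top_n]}
```

Sorting by impressions is one reasonable way to rank opportunities; weighting by estimated click gain would be an alternative.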
Phase 3: GA4 Data Analysis
Read GA4 data (JSON or CSV) from $DATA_DIR/data/, analyze according to "Content Strategy", "User Experience", and "Conversion Rate" dimension thresholds in references/metrics-glossary.md.
Key outputs:
- Traffic trends and channel effectiveness
- High-traffic low-engagement / high-bounce-rate pages
- Mobile vs desktop experience gaps
- Conversion funnel drop-off points
Output: Top 10 GA4 insights with data evidence.
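Flagging high-traffic, low-engagement pages could be sketched as below. The field names (`page`, `views`, `engagement_rate`) and the cut-offs are assumptions for illustration; the actual thresholds are defined in references/metrics-glossary.md. The sketch relies on GA4's definition of bounce rate as the complement of engagement rate.

```python
def flag_low_engagement(pages, min_views=500, max_engagement_rate=0.4):
    """Return high-traffic pages whose engagement rate is below the cut-off."""
    flagged = []
    for page in pages:
        if page["views"] >= min_views and page["engagement_rate"] < max_engagement_rate:
            # In GA4, bounce rate is 1 minus engagement rate.
            bounce_rate = round(1 - page["engagement_rate"], 4)
            flagged.append({**page, "bounce_rate": bounce_rate})
    return sorted(flagged, key=lambda page: page["views"], reverse=True)
```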
Phase 4: Live Site Audit
Use agent-browser to visit $SITE_URL:
CODEBLOCK6
Save screenshots to $DATA_DIR/tmp/.
PageSpeed Insights performance audit (auto-appends PSI_API_KEY from .env if present):
CODEBLOCK7
PSI failure fallback: If a 429 (quota exceeded) or other error is returned, check whether "PageSpeed Insights API" has been enabled in the Google Cloud project (see references/gsc-api-guide.md Step 1). When PSI data is missing, continue with subsequent phases and note the missing performance data in the report.
Extract Core Web Vitals from PSI; thresholds in the "Performance" dimension of references/metrics-glossary.md.
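Rating the extracted values might look like this sketch. It uses Google's published Core Web Vitals boundaries (LCP 2.5 s / 4 s, INP 200 ms / 500 ms, CLS 0.1 / 0.25) as an assumption; if references/metrics-glossary.md defines different thresholds, those take precedence. The metric keys and function names are hypothetical.

```python
# (good upper bound, needs-improvement upper bound) per metric.
CWV_THRESHOLDS = {
    "lcp_ms": (2500, 4000),
    "inp_ms": (200, 500),
    "cls": (0.1, 0.25),
}


def rate_metric(name: str, value: float) -> str:
    """Classify one metric as 'good', 'needs improvement', or 'poor'."""
    good, needs_improvement = CWV_THRESHOLDS[name]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"


def rate_core_web_vitals(metrics: dict) -> dict:
    """Rate every known metric present in the input; unknown keys are ignored."""
    return {
        name: rate_metric(name, value)
        for name, value in metrics.items()
        if name in CWV_THRESHOLDS
    }
```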
If GA4 data is available, take screenshots (desktop + mobile) for each of the Top 10 landing pages, recording visual and interaction issues.
When no source code is available, extract front-end metadata via browser:
CODEBLOCK8
Output: Performance scores + visual issue checklist.
Phase 5: Source Code Review
If SOURCE_CODE_PATH is configured in .env, analyze project source code. Skip if no source code is available.
Check items detailed in the "Technical Issues" checklist in references/metrics-glossary.md. Core focus:
- SEO: Meta tag completeness, JSON-LD, robots.txt / sitemap.xml, image alt, H1 conventions
- Performance: JS/CSS splitting and lazy loading, image formats and responsive images, third-party scripts, render-blocking resources
- Technical: <html lang>, viewport, HTTPS, canonical URL, internal dead links
Output: Code-level improvement checklist.
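A few of the checks above can be run against a rendered HTML file with the standard library alone. This is a minimal sketch, not the skill's review procedure: the class and function names are hypothetical, and it covers only `<html lang>`, the viewport meta tag, the canonical link, H1 count, and images without an `alt` attribute.

```python
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collect a handful of SEO and technical signals from one HTML document."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.has_viewport = False
        self.canonical = None
        self.h1_count = 0
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html":
            self.lang = attributes.get("lang")
        elif tag == "meta" and (attributes.get("name") or "").lower() == "viewport":
            self.has_viewport = True
        elif tag == "link" and "canonical" in (attributes.get("rel") or "").lower().split():
            self.canonical = attributes.get("href")
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and "alt" not in attributes:
            # An empty alt="" is valid for decorative images, so only a
            # missing attribute is counted.
            self.images_missing_alt += 1


def audit_html(html: str) -> dict:
    parser = HeadAudit()
    parser.feed(html)
    return {
        "lang": parser.lang,
        "has_viewport": parser.has_viewport,
        "canonical": parser.canonical,
        "h1_count": parser.h1_count,
        "images_missing_alt": parser.images_missing_alt,
    }
```

An `h1_count` other than 1, a missing `lang`, or a missing canonical would each become an entry in the code-level checklist.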
Phase 5b: SEO & GEO Optimization Checklist Audit
Run through the SEO & GEO optimization checklist in references/SEO-GEO-Optimization-Checklist.md to evaluate the site's search engine and generative AI readiness.
Run the audit scripts to collect data automatically:
CODEBLOCK9
Each script supports:
- --url URL: target site (or reads SITE_URL from .env)
- INLINECODE36: audit all pages from sitemap.xml
- INLINECODE37: audit specific pages
- INLINECODE38: limit pages when using INLINECODE39
- INLINECODE40: write JSON report to file (default: stdout)
The scripts check the following categories against the target site:
1. Structured Data (JSON-LD) (seo_audit.py): Coverage, SSR output, schema types, content uniqueness
2. Meta Tags & Open Graph (seo_audit.py): Title/description length, canonical, hreflang, og:image, Twitter Card
3. Heading Structure (seo_audit.py): H1 count, question-style heading ratio (GEO signal)
4. Sitemap (seo_audit.py): Presence, page count, lastmod, robots.txt declaration
5. AI Readability (GEO) (geo_audit.py): llms.txt/llms-full.txt, robots.txt AI crawler rules (GPTBot, ClaudeBot, etc.)
6. Content Depth (geo_audit.py): Word count (CJK-aware), intro summary detection, FAQ/HowTo sections, schema presence
7. Performance (perf_audit.py): Load time (FCP proxy), compression (Brotli/gzip), HTML size, CDN detection
8. Security (perf_audit.py): HSTS, HTTPS, CSP, X-Frame-Options, Cache-Control
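To make the Security category concrete, here is a sketch of a header presence check. It is not the implementation of perf_audit.py, which is not shown here; the function name and the report shape are assumptions, and the check only tests whether each header exists, not whether its value is sound.

```python
# Response headers named in the Security category, in lowercase.
SECURITY_HEADERS = (
    "strict-transport-security",  # HSTS
    "content-security-policy",    # CSP
    "x-frame-options",
    "cache-control",
)


def check_security_headers(url: str, headers: dict) -> dict:
    """Report HTTPS usage and which security-related headers are present."""
    present = {name.lower() for name in headers}
    report = {"https": url.lower().startswith("https://")}
    for name in SECURITY_HEADERS:
        report[name] = name in present
    return report
```

The `headers` argument can be any mapping of response header names to values, for example the headers of a response fetched with the HTTP client of your choice.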
For any items not covered by the scripts (e.g. off-page authority, visual content review), use the detection commands in Section 7 of the checklist reference.
Output: SEO/GEO readiness checklist with pass/fail status for each item and specific improvement recommendations, classified by the P0-P3 priority matrix in Section 8 of the checklist.
Phase 6: Generate Improvement Report
Organize output according to the "Priority Matrix" (P0-P3) in references/metrics-glossary.md. Use the following template:
CODEBLOCK10
Save the report to $DATA_DIR/data/improvement-report.md.
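Ordering findings by the P0-P3 priority matrix could be sketched as follows. The finding fields (`priority`, `dimension`, `issue`, `evidence`) and the heading layout are assumptions for illustration only; the report should follow the template given above, and this helper is hypothetical.

```python
PRIORITIES = ("P0", "P1", "P2", "P3")


def render_findings(findings: list) -> str:
    """Group findings by priority and render them as markdown sections."""
    lines = []
    for priority in PRIORITIES:
        items = [f for f in findings if f["priority"] == priority]
        if not items:
            continue  # Skip priorities with no findings.
        lines.append(f"## {priority}")
        for item in items:
            lines.append(
                f"- [{item['dimension']}] {item['issue']} (evidence: {item['evidence']})"
            )
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```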
Companion Skills
- SEO implementation → INLINECODE50
- Browser automation → INLINECODE51
- Frontend redesign → INLINECODE52
Reference Docs
| File | Contents |
|---|---|
| references/ga4-api-guide.md | GA4 auth setup, preset templates, dimensions & metrics |
| references/metrics-glossary.md | Six analysis dimensions: thresholds, diagnostics, priority matrix |
| references/SEO-GEO-Optimization-Checklist.md | SEO & GEO optimization checklist: structured data, AI readability, content depth, technical SEO, performance, off-page authority, detection commands, priority matrix |
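Google Analytics & Search Console Data-Driven Improvement
Analyze GSC and GA4 data, combined with browser auditing and source code review, to generate improvement plans covering six dimensions: SEO, Performance, Content Strategy, UX, Conversion Rate, and Technical Issues.
Data Storage
All runtime data is stored in $DATA_DIR, separated from skill code.
```
<project-root>/.skills-data/google-analytics-and-search-improve/
  .env       # Config (auth, URLs, etc.), auto-loaded by scripts
  data/      # GSC/GA4/PSI data (JSON or CSV)
  tmp/       # Screenshots and temporary files
  cache/     # API response cache
  configs/   # Config files
  logs/      # Execution logs
  venv/      # Python virtual environment
```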
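Workflow
Analysis progress:
- [ ] Phase 1: Select data source & collect data
- [ ] Phase 2: GSC data analysis
- [ ] Phase 3: GA4 data analysis
- [ ] Phase 4: Live site audit
- [ ] Phase 5: Source code review
- [ ] Phase 5b: SEO & GEO optimization checklist audit
- [ ] Phase 6: Generate improvement report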
Phase 1: Select Data Source & Collect Data
1a. Initialize directories:
```bash
DATA_DIR=.skills-data/google-analytics-and-search-improve
mkdir -p $DATA_DIR/{data,cache,logs,tmp}
```
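1b. Ask user to choose data source:
Present three modes for the user to choose from:
Choose how to obtain GSC/GA4 data:
A. API auto-collection (recommended, most complete data)
Requires creating a Google Cloud Service Account and configuring API auth. First-time setup takes ~10 minutes; subsequent analyses collect data automatically.
B. Manual CSV export (zero config, simplest)
You export data files from the GA4 and GSC consoles yourself, and I'll analyze them. No API configuration needed.
C. Browser audit only (no GA4/GSC data needed)
I'll visit the site directly for technical auditing and code analysis without using GA4/GSC data. Best for quick technical checks.
Enter the corresponding branch based on user selection: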
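Mode A: API Auto-Collection
Check .env: Read $DATA_DIR/.env; if any configuration is missing, guide the user to fill it in.
Configuration required from user (write to $DATA_DIR/.env after collection):
| Variable | Description |
|---|---|
| SITE_URL | Website URL to audit (e.g., https://example.com) |
| GOOGLE_APPLICATION_CREDENTIALS | Absolute path to the Service Account JSON key file on your machine |
| GSC_SITE_URL | Site address in Search Console (see format note below) |
| GA4_PROPERTY_ID | GA4 Property ID (numeric only) |
| SOURCE_CODE_PATH | (Optional) Path to the project source code |
| PSI_API_KEY | (Optional) PageSpeed Insights API Key to avoid rate limiting |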
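GSC_SITE_URL format note: GSC has two property types with different formats. The value must match the property type registered in GSC; otherwise the API returns a 403 permission error:
| GSC Property Type | GSC_SITE_URL Format | Example |
|---|---|---|
| Domain property | sc-domain:domain | sc-domain:example.com |
| URL-prefix property | Full URL | https://example.com |
How to check: In the Search Console property selector (top-left), if it shows a bare domain name it's a Domain property (use sc-domain: prefix); if it shows a full URL it's a URL-prefix property.
Detailed auth setup steps are in references/gsc-api-guide.md.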
```bash
cat > $DATA_DIR/.env <<EOF
SITE_URL=provided by user
GOOGLE_APPLICATION_CREDENTIALS=provided by user (absolute path)
GSC_SITE_URL=provided by user (note sc-domain: or https:// format)
GA4_PROPERTY_ID=provided by user
SOURCE_CODE_PATH=provided by user
PSI_API_KEY=
EOF
```
Collect data (scripts auto-read auth from .env):
```bash
set -a; source $DATA_DIR/.env; set +a
python scripts/gsc_query.py --dimensions query --limit 500 -o $DATA_DIR/data/gsc_queries.json
python scripts/gsc_query.py --dimensions page --limit 500 -o $DATA_DIR/data/gsc_pages.json
python scripts/gsc_query.py --dimensions device,country -o $DATA_DIR/data/gsc_devices.json
python scripts/gsc_query.py --dimensions date -o $DATA_DIR/data/gsc_trends.json
python scripts/gsc_query.py --mode sitemaps -o $DATA_DIR/data/gsc_sitemaps.json
python scripts/ga4_query.py --preset traffic_overview -o $DATA_DIR/data/ga4_traffic.json
python scripts/ga4_query.py --preset top_pages --limit 100 -o $DATA_DIR/data/ga4_pages.json
python scripts/ga4_query.py --preset user_acquisition -o $DATA_DIR/data/ga4_acquisition.json
python scripts/ga4_query.py --preset device_breakdown -o $DATA_DIR/data/ga4_devices.json
python scripts/ga4_query.py --preset landing_pages --limit 50 -o $DATA_DIR/data/ga4_landing.json
python scripts/ga4_query.py --preset user_behavior --limit 100 -o $DATA_DIR/data/ga4_behavior.json
python scripts/ga4_query.py --preset conversion_events -o $DATA_DIR/data/ga4_conversions.json
```
First-time use requires installing dependencies:
```bash
python3 -m venv $DATA_DIR/venv && source $DATA_DIR/venv/bin/activate
pip install -r scripts/requirements.txt
```
Script usage details are in references/gsc-api-guide.md and references/ga4-api-guide.md.
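Mode B: Manual CSV Export
Send the following export instructions to the user, asking them to place files in $DATA_DIR/data/:
Export GSC data:
1. Open Google Search Console → Select your site
2. Click "Search results" (Performance) in the left menu
3. Set date range to last 3 months, click "Export" → "Download CSV"
4. Save the downloaded CSV as $DATA_DIR/data/gsc_export.csv
Export GA4 data (export the following reports):
1. Open Google Analytics → Select your property
2. Export "Pages and screens" report:
   - Left menu: "Reports" → "Engagement" → "Pages and screens"
   - Click the share icon (top-right) → "Download file" → CSV
   - Save as $DATA_DIR/data/ga4_pages.csv
3. Export "Traffic acquisition" report:
   - Left menu: "Reports" → "Acquisition" → "Traffic acquisition"
   - Export CSV → Save as $DATA_DIR/data/ga4_acquisition.csv
4. Export "Landing pages" report:
   - Left menu: "Reports" → "Engagement" → "Landing pages"
   - Export CSV → Save as $DATA_DIR/data/ga4_landing.csv
Let me know when the export is complete, and I'll read the files to start analysis.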
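Also ask the user for:
- Target website URL (required, write to SITE_URL in $DATA_DIR/.env)
- Source code path (optional, write to SOURCE_CODE_PATH)
After receiving files, read CSV files from $DATA_DIR/data/ and proceed to Phase 2-3 analysis.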
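Mode C: Browser Audit Only
Only ask the user for:
- Target website URL (required)
- Source code path (optional)
Write to $DATA_DIR/.env and skip directly to Phase 4 (site audit) and Phase 5 (source code review), skipping Phase 2-3.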
Phase 2: GSC Data Analysis
Read GSC data (JSON or CSV) from $DATA_DIR/data/, analyze according to the "SEO" dimension thresholds in references/metrics-glossary.md.
Key outputs: