Database Schema Differ
What This Does
A CLI tool that compares database schemas across environments (development, staging, production), generates migration scripts, and tracks schema evolution over time. It supports PostgreSQL, MySQL, SQLite, and other databases via SQLAlchemy.
Key features:
- Schema comparison: Compare schemas between databases, branches, or points in time
- Migration generation: Automatically generate SQL migration scripts (up/down) for schema changes
- Schema snapshots: Capture and store schema snapshots for historical comparison
- Drift detection: Identify schema drift between environments (dev vs prod, etc.)
- Multiple database support: PostgreSQL, MySQL, SQLite, SQL Server, Oracle via SQLAlchemy
- Export formats: Generate SQL, JSON, or visual diff outputs
- Integration ready: Works with Alembic, Django migrations, or standalone
- Change tracking: Track schema evolution over time with versioning
- CI/CD friendly: Output machine-readable formats for automation pipelines
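The schema comparison feature can be illustrated with a minimal sketch. This is not the tool's implementation, which the source does not show: it uses the standard-library `sqlite3` module instead of SQLAlchemy to stay dependency-free, and the names `read_schema` and `compare_schemas` and the shape of the report are assumptions made for illustration.

```python
import sqlite3


def read_schema(conn):
    """Return {table: {column: declared_type}} for a SQLite connection."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = {row[1]: row[2] for row in cols}
    return schema


def compare_schemas(left, right):
    """Report tables and columns found on one side only, plus type changes."""
    diff = {
        "tables_only_in_left": sorted(set(left) - set(right)),
        "tables_only_in_right": sorted(set(right) - set(left)),
        "column_changes": {},
    }
    for table in sorted(set(left) & set(right)):
        lcols, rcols = left[table], right[table]
        changes = {
            "only_in_left": sorted(set(lcols) - set(rcols)),
            "only_in_right": sorted(set(rcols) - set(lcols)),
            "type_changed": sorted(
                c for c in set(lcols) & set(rcols) if lcols[c] != rcols[c]
            ),
        }
        if any(changes.values()):
            diff["column_changes"][table] = changes
    return diff


# Two throwaway in-memory databases standing in for dev and prod
dev = sqlite3.connect(":memory:")
dev.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
prod = sqlite3.connect(":memory:")
prod.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, email_verified BOOLEAN)"
)
prod.execute("CREATE TABLE audit_logs (id INTEGER PRIMARY KEY, action TEXT)")

report = compare_schemas(read_schema(dev), read_schema(prod))
print(report)
```

A real comparison would also need to cover indexes and constraints, which this sketch leaves out.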
When To Use
- You need to compare database schemas between development and production
- You want to generate migration scripts for schema changes
- You're managing multiple database environments and need to ensure consistency
- You need to detect schema drift in production databases
- You're refactoring databases and need to track changes
- You want to automate schema validation in CI/CD pipelines
- You need to document schema changes for compliance or team coordination
- You're onboarding new team members and need to understand schema evolution
- You want to visualize schema differences between branches or versions
Usage
Basic commands:
CODEBLOCK0
Examples
Example 1: Compare development and production databases
CODEBLOCK1
Output:
CODEBLOCK2
Example 2: Generate migration script
CODEBLOCK3
Output (migration.sql):
CODEBLOCK4
Example 3: Check for schema drift in CI
CODEBLOCK5
Output (CI failure):
CODEBLOCK6
Example 4: Track schema evolution
CODEBLOCK7
Output:
CODEBLOCK8
Example 5: Visual schema comparison
CODEBLOCK9
Output:
CODEBLOCK10
Requirements
- Python 3.x
- SQLAlchemy (for database connectivity)
- Alembic (optional, for migration generation)
- Database drivers: psycopg2 (PostgreSQL), pymysql (MySQL), etc.
Install dependencies:
CODEBLOCK11
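Before running a comparison it can help to confirm which of the listed packages are importable. The sketch below is an assumption-laden convenience check rather than part of the tool: `check_modules` is a hypothetical helper, and the import names (`sqlalchemy`, `alembic`, `psycopg2`, `pymysql`) are the usual top-level module names for the packages listed above.

```python
from importlib.util import find_spec


def check_modules(names):
    """Map each top-level module name to True if it can be located."""
    return {name: find_spec(name) is not None for name in names}


status = check_modules(["sqlalchemy", "alembic", "psycopg2", "pymysql"])
for name, available in status.items():
    print(f"{name}: {'installed' if available else 'missing'}")
```

Only the driver for the database you actually connect to needs to be present.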
Limitations
- Requires database credentials and network access to compare live databases
- Complex schema changes may require manual review of generated migrations
- Limited support for database-specific features not covered by SQLAlchemy
- Performance may be impacted with very large schemas (1000+ tables)
- No built-in support for NoSQL databases (MongoDB, Redis, etc.)
- Cannot compare encrypted or compressed database dumps
- Limited error handling for connection issues or permission problems
- Materialized views and database functions cannot be compared on every database type
- Generated migrations may not handle data migration or complex transformation
- No built-in support for distributed database comparisons
- Limited to schema structure; does not compare table data or evaluate index optimization
- May not detect all schema differences for databases with custom types or extensions
- Triggers and stored procedures cannot be compared on every database type
- Performance may degrade with very large tables or complex relationships
- No built-in support for schema version control systems (like Liquibase or Flyway)
- Limited error recovery for malformed SQL or corrupted schema files
- No support for real-time schema change monitoring
- Cannot compare schemas across different database types (e.g., PostgreSQL vs MySQL)
- Limited support for database-specific optimizations or extensions
- No built-in notification system for schema drift alerts
- May require manual adjustment of generated migration scripts for production use
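Because schemas cannot be compared across different database types, a caller may want to reject mismatched connection strings up front. The following is a sketch under assumptions: `dialect_of` and `same_dialect` are hypothetical helpers, and the source does not say whether the tool performs such a check itself. It relies on the SQLAlchemy URL convention of `dialect+driver://...`.

```python
from urllib.parse import urlsplit


def dialect_of(url):
    """Return the dialect part of a 'dialect+driver://...' connection URL."""
    return urlsplit(url).scheme.split("+")[0]


def same_dialect(left_url, right_url):
    """True when both URLs target the same database type."""
    return dialect_of(left_url) == dialect_of(right_url)


print(same_dialect("postgresql://user:pass@host1/db",
                   "postgresql+psycopg2://user:pass@host2/db"))
print(same_dialect("postgresql://user:pass@host1/db",
                   "mysql+pymysql://user:pass@host2/db"))
```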
Directory Structure
The tool works with database connection strings, SQL files, or schema snapshot files. No special configuration directories are required.
Error Handling
- Invalid database connections show helpful error messages with connection details
- Permission errors suggest checking database credentials and access rights
- Schema parsing errors show line numbers and specific SQL issues
- Comparison errors suggest checking schema compatibility or database versions
- File not found errors suggest checking paths and file permissions
- Output generation errors suggest checking disk space and write permissions
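Since connection errors include connection details, it is worth keeping passwords out of those messages. The sketch below shows one way to do that with the standard library. It is an assumption, not a description of the tool: the source does not state whether credentials are masked, and `redact_url` and `describe_connection_error` are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url):
    """Replace the password in a connection URL with '***'."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to hide
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def describe_connection_error(url, exc):
    """Build a message that names the target without leaking the password."""
    return (
        f"Could not connect to {redact_url(url)}: {exc}. "
        "Check the host, credentials, and access rights."
    )


print(describe_connection_error(
    "postgresql://user:pass@host1/db", ConnectionRefusedError("connection refused")
))
```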
Contributing
This is a skill built by the Skill Factory. Issues and improvements should be reported through the OpenClaw project.
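Database Schema Differ
What This Does
A command-line tool that compares database schemas across environments (development, staging, production), generates migration scripts, and tracks schema evolution over time. It supports PostgreSQL, MySQL, SQLite, and other databases via SQLAlchemy.
Key features:
- Schema comparison: Compare schemas between databases, branches, or points in time
- Migration generation: Automatically generate SQL migration scripts (up/down) for schema changes
- Schema snapshots: Capture and store schema snapshots for historical comparison
- Drift detection: Identify schema drift between environments (dev vs prod, etc.)
- Multiple database support: PostgreSQL, MySQL, SQLite, SQL Server, Oracle via SQLAlchemy
- Export formats: Generate SQL, JSON, or visual diff outputs
- Integration ready: Works with Alembic, Django migrations, or standalone
- Change tracking: Track schema evolution over time with versioning
- CI/CD friendly: Output machine-readable formats for automation pipelines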
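When To Use
- You need to compare database schemas between development and production
- You want to generate migration scripts for schema changes
- You're managing multiple database environments and need to ensure consistency
- You need to detect schema drift in production databases
- You're refactoring databases and need to track changes
- You want to automate schema validation in CI/CD pipelines
- You need to document schema changes for compliance or team coordination
- You're onboarding new team members and need to understand schema evolution
- You want to visualize schema differences between branches or versions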
Usage
Basic commands:
```bash
# Compare two database connections
python3 scripts/main.py compare postgresql://user:pass@host1/db postgresql://user:pass@host2/db

# Generate a migration script from schema differences
python3 scripts/main.py diff dev_db.sql prod_db.sql --output migration.sql

# Create a schema snapshot for later comparison
python3 scripts/main.py snapshot postgresql://user:pass@host/db --save snapshot.json

# Compare the current schema with a saved snapshot
python3 scripts/main.py compare-snapshot postgresql://user:pass@host/db snapshot.json

# Generate a visual diff between schemas
python3 scripts/main.py visual-diff schema1.sql schema2.sql --html diff.html

# Check for schema drift in a CI pipeline
python3 scripts/main.py check-drift --expected expected_schema.json --actual actual_schema.json

# Track schema changes over time
python3 scripts/main.py history postgresql://user:pass@host/db --days 30
```
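To make the command layout above concrete, here is a minimal `argparse` sketch that accepts two of the subcommands with the flags shown in this document. It is an illustration only: the internals of `scripts/main.py` are not shown in the source, so `build_parser` and the argument names `left` and `right` are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the compare and check-drift commands."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    compare = sub.add_parser("compare", help="compare two database connections")
    compare.add_argument("left")
    compare.add_argument("right")
    compare.add_argument("--output")

    drift = sub.add_parser("check-drift", help="compare expected and actual snapshots")
    drift.add_argument("--expected", required=True)
    drift.add_argument("--actual", required=True)
    drift.add_argument("--fail-on-drift", action="store_true")
    return parser


args = build_parser().parse_args(
    ["check-drift", "--expected", "expected_schema.json",
     "--actual", "actual_schema.json", "--fail-on-drift"]
)
print(args.command, args.expected, args.actual, args.fail_on_drift)
```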
Examples
Example 1: Compare development and production databases
```bash
python3 scripts/main.py compare \
  postgresql://devuser:devpass@localhost/dev_db \
  postgresql://produser:prodpass@prod-host/prod_db \
  --output diff-report.json
```
Output:
```
🔍 Comparing schemas: dev_db (localhost) vs prod_db (prod-host)

📊 Summary:
- Tables: 42 vs 45 (3 missing in dev)
- Columns: 287 vs 295 (8 differences)
- Indexes: 67 vs 72 (5 differences)
- Constraints: 34 vs 38 (4 differences)

⚠️ 15 differences found:
1. Table audit_logs missing in dev
   → CREATE TABLE audit_logs (...)
2. Column users.email_verified missing in dev
   → ALTER TABLE users ADD COLUMN email_verified BOOLEAN DEFAULT FALSE
3. Index idx_users_email missing in prod
   → CREATE INDEX idx_users_email ON users(email)
4. Constraint fk_orders_customer_id differs
   → ALTER TABLE orders DROP CONSTRAINT fk_orders_customer_id_old;
   → ALTER TABLE orders ADD CONSTRAINT fk_orders_customer_id FOREIGN KEY ...

✅ Migration report generated: diff-report.json
✅ SQL migration script: migration_20240306_143022.sql
```
Example 2: Generate migration script
```bash
python3 scripts/main.py diff old_schema.sql new_schema.sql --format sql --output migration.sql
```
Output (migration.sql):
```sql
-- Generated: 2024-03-06 14:30:22
-- Database: PostgreSQL

-- Up migration
CREATE TABLE audit_logs (
    id SERIAL PRIMARY KEY,
    user_id INTEGER,
    action VARCHAR(255),
    created_at TIMESTAMP DEFAULT NOW()
);

ALTER TABLE users ADD COLUMN email_verified BOOLEAN DEFAULT FALSE;

CREATE INDEX idx_users_email ON users(email);

ALTER TABLE orders
    DROP CONSTRAINT fk_orders_customer_id_old,
    ADD CONSTRAINT fk_orders_customer_id
        FOREIGN KEY (customer_id) REFERENCES customers(id)
        ON DELETE CASCADE;

-- Down migration (rollback)
DROP TABLE IF EXISTS audit_logs;

ALTER TABLE users DROP COLUMN IF EXISTS email_verified;

DROP INDEX IF EXISTS idx_users_email;

ALTER TABLE orders
    DROP CONSTRAINT fk_orders_customer_id,
    ADD CONSTRAINT fk_orders_customer_id_old
        FOREIGN KEY (customer_id) REFERENCES customers(id);
```
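The pairing of up and down statements in the migration above can be sketched as follows. This is a simplified illustration, not the tool's generator: it handles only added tables and added columns, and both the `{table: {column: type}}` snapshot shape and the name `generate_migration` are assumptions.

```python
def generate_migration(old, new):
    """Build (up, down) SQL statement lists from two {table: {column: type}} dicts."""
    up, down = [], []
    # New tables: create on the way up, drop on the way down
    for table in sorted(set(new) - set(old)):
        cols = ", ".join(f"{name} {ctype}" for name, ctype in new[table].items())
        up.append(f"CREATE TABLE {table} ({cols});")
        down.append(f"DROP TABLE IF EXISTS {table};")
    # New columns on tables that exist in both schemas
    for table in sorted(set(old) & set(new)):
        for col in sorted(set(new[table]) - set(old[table])):
            up.append(f"ALTER TABLE {table} ADD COLUMN {col} {new[table][col]};")
            down.append(f"ALTER TABLE {table} DROP COLUMN IF EXISTS {col};")
    # Roll back in the reverse of the order in which changes were applied
    return up, list(reversed(down))


old = {"users": {"id": "SERIAL PRIMARY KEY", "email": "VARCHAR(255)"}}
new = {
    "users": {
        "id": "SERIAL PRIMARY KEY",
        "email": "VARCHAR(255)",
        "email_verified": "BOOLEAN DEFAULT FALSE",
    },
    "audit_logs": {"id": "SERIAL PRIMARY KEY", "action": "VARCHAR(255)"},
}

up, down = generate_migration(old, new)
print("\n".join(["-- Up migration", *up, "-- Down migration (rollback)", *down]))
```

Dropped objects, type changes, indexes, and constraints would each need their own handling, which is one reason generated migrations call for manual review.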
Example 3: Check for schema drift in CI
```bash
python3 scripts/main.py check-drift \
  --expected schemas/expected/prod.json \
  --actual schemas/actual/prod.json \
  --fail-on-drift
```
Output (CI failure):
```
❌ Schema drift detected!

Differences:
1. Unexpected table temp_backup in production
2. Missing index idx_orders_status in production
3. Column users.last_login type differs (TIMESTAMP vs TIMESTAMPTZ)

Exit code: 1 (failed due to --fail-on-drift)
```
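The drift check above can be sketched as a comparison of two snapshot dictionaries followed by an exit-code decision. The snapshot layout used here (`{table: {column: type}}`) and the helpers `find_drift` and `exit_code` are assumptions for illustration; the tool's actual JSON snapshot format is not documented in the source, and indexes are left out for brevity.

```python
import json


def find_drift(expected, actual):
    """List human-readable differences between two schema snapshots."""
    messages = []
    for table in sorted(set(actual) - set(expected)):
        messages.append(f"Unexpected table {table}")
    for table in sorted(set(expected) - set(actual)):
        messages.append(f"Missing table {table}")
    for table in sorted(set(expected) & set(actual)):
        exp_cols, act_cols = expected[table], actual[table]
        for col in sorted(set(exp_cols) | set(act_cols)):
            if exp_cols.get(col) != act_cols.get(col):
                messages.append(
                    f"Column {table}.{col} differs "
                    f"({exp_cols.get(col)} vs {act_cols.get(col)})"
                )
    return messages


def exit_code(messages, fail_on_drift=True):
    """Return 1 only when drift exists and the caller asked to fail on it."""
    return 1 if messages and fail_on_drift else 0


expected = json.loads('{"users": {"id": "INTEGER", "last_login": "TIMESTAMP"}}')
actual = json.loads(
    '{"users": {"id": "INTEGER", "last_login": "TIMESTAMPTZ"},'
    ' "temp_backup": {"id": "INTEGER"}}'
)

drift = find_drift(expected, actual)
code = exit_code(drift)
for line in drift:
    print(line)
print("exit code:", code)
```

In a CI job the returned value would be passed to `sys.exit` so that the pipeline step fails when drift is present.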
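Example 4: Track schema evolution
```bash
python3 scripts/main.py history postgresql://user:pass@host/db --days 90 --format timeline
```
Output:
```
📅 Schema evolution timeline (last 90 days)

2024-03-05: Added audit_logs table (v4.2.0 release)
2024-02-28: Added email_verified column to users table
2024-02-15: Created indexes for performance optimization
2024-02-01: Added foreign key constraints for data integrity
2024-01-20: Initial schema snapshot (v4.0.0)

📈 Change statistics:
- Tables: +3 (42 → 45)
- Columns: +23 (272 → 295)
- Indexes: +8 (64 → 72)
```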
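The change statistics in the timeline output can be derived from the oldest and newest snapshots in the window. The sketch below reuses the counts from the example; the per-snapshot summary format and the `change_stats` helper are assumptions, and the weekly average is omitted because the source does not say how it is calculated.

```python
from datetime import date

# Hypothetical per-snapshot object counts, taken from the example output
snapshots = [
    (date(2024, 3, 5), {"tables": 45, "columns": 295, "indexes": 72}),
    (date(2024, 1, 20), {"tables": 42, "columns": 272, "indexes": 64}),
]


def change_stats(snapshots):
    """Return {kind: (first, last, delta)} between the oldest and newest snapshot."""
    ordered = sorted(snapshots, key=lambda item: item[0])
    first, last = ordered[0][1], ordered[-1][1]
    return {kind: (first[kind], last[kind], last[kind] - first[kind]) for kind in first}


stats = change_stats(snapshots)
for kind, (before, after, delta) in stats.items():
    print(f"- {kind.capitalize()}: {delta:+d} ({before} → {after})")
```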
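Example 5: Visual schema comparison
```bash
python3 scripts/main.py visual-diff schema_v1.sql schema_v2.sql --html schema_diff.html
```
Output:
```
✨ Visual diff generated: schema_diff.html

Open in a browser to view:
- Side-by-side schema comparison
- Color-coded differences (added/removed/modified)
- Interactive expand/collapse for tables
- Export options for documentation

Highlighted differences:
✅ 5 tables added (green)
❌ 2 tables removed (red)
🔄 12 columns modified (yellow)
```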
Requirements
- Python 3.x
- SQLAlchemy (for database connectivity)
- Alembic (optional, for migration generation)
- Database drivers: psycopg2 (PostgreSQL), pymysql (MySQL), etc.
Install dependencies:
```bash
pip3 install sqlalchemy alembic psycopg2-binary pymysql
```
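Limitations
- Requires database credentials and network access to compare live databases
- Complex schema changes may require manual review of generated migrations
- Limited support for database-specific features not covered by SQLAlchemy
- Performance may be impacted with very large schemas (1000+ tables)
- No support for NoSQL databases (MongoDB, Redis, etc.)
- Cannot compare encrypted or compressed database dumps
- Limited error handling for connection issues or permission problems
- Materialized views and database functions cannot be compared on every database type
- Generated migrations may not handle data migration or complex transformations
- No support for distributed database comparisons
- Limited to schema structure; does not compare table data or evaluate index optimization
- May not detect all schema differences for databases with custom types or extensions
- Triggers and stored procedures cannot be compared on every database type
- Performance may degrade with very large tables or complex relationships
- No support for schema version control systems (like Liquibase or Flyway)
- Limited error recovery for malformed SQL or corrupted schema files
- No support for real-time schema change monitoring
- Cannot compare schemas across different database types (e.g., PostgreSQL vs MySQL)
- Limited support for database-specific optimizations or extensions
- No built-in notification system for schema drift alerts
- Generated migration scripts may require manual adjustment before production use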
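Directory Structure
The tool works with database connection strings, SQL files, or schema snapshot files. No special configuration directories are required.
Error Handling
- Invalid database connections show helpful error messages with connection details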
-