cto-advisor
# CTO Advisor
Technical leadership frameworks for architecture, engineering teams, technology strategy, and technical decision-making.
## Keywords
CTO, chief technology officer, tech debt, technical debt, architecture, engineering metrics, DORA, team scaling, technology evaluation, build vs buy, cloud migration, platform engineering, AI/ML strategy, system design, incident response, engineering culture
## Quick Start
```bash
python scripts/tech_debt_analyzer.py # Assess technical debt severity and remediation plan
python scripts/team_scaling_calculator.py # Model engineering team growth and cost
```
## Core Responsibilities
### 1. Technology Strategy
Align technology investments with business priorities.
**Strategy components:**
- Technology vision (3-year: where the platform is going)
- Architecture roadmap (what to build, refactor, or replace)
- Innovation budget (10-20% of engineering capacity for experimentation)
- Build vs buy decisions (default: buy unless it's your core IP)
- Technical debt strategy (management, not elimination)
See `references/technology_evaluation_framework.md` for the full evaluation framework.
### 2. Engineering Team Leadership
Scale the engineering org's productivity — not individual output.
**Scaling engineering:**
- Hire for the next stage, not the current one
- Every 3x in team size requires a reorg
- Manager:IC ratio: 5-8 direct reports optimal
- Senior:junior ratio: at least 1:2 (invert and you'll drown in mentoring)
**Culture:**
- Blameless post-mortems (incidents are system failures, not people failures)
- Documentation as a first-class citizen
- Code review as mentoring, not gatekeeping
- On-call that's sustainable (not heroic)
See `references/engineering_metrics.md` for DORA metrics and the engineering health dashboard.
### 3. Architecture Governance
Create the framework for making good decisions — not making every decision yourself.
**Architecture Decision Records (ADRs):**
- Every significant decision gets documented: context, options, decision, consequences
- Decisions are discoverable (not buried in Slack)
- Decisions can be superseded (not permanent)
See `references/architecture_decision_records.md` for ADR templates and the decision review process.
### 4. Vendor & Platform Management
Every vendor is a dependency. Every dependency is a risk.
**Evaluation criteria:** Does it solve a real problem? Can we migrate away? Is the vendor stable? What's the total cost (license + integration + maintenance)?
### 5. Crisis Management
Incident response, security breaches, major outages, data loss.
**Your role in a crisis:** Ensure the right people are on it, communication is flowing, and the business is informed. Post-crisis: blameless retrospective within 48 hours.
## Workflows
### Tech Debt Assessment Workflow
**Step 1 — Run the analyzer**
```bash
python scripts/tech_debt_analyzer.py --output report.json
```
**Step 2 — Interpret results**
The analyzer produces a severity-scored inventory. Review each item against:
- Severity (P0–P3): how much is it blocking velocity or creating risk?
- Cost-to-fix: engineering days estimated to remediate
- Blast radius: how many systems / teams are affected?
**Step 3 — Build a prioritized remediation plan**
Sort by: `(Severity × Blast Radius) / Cost-to-fix` — highest score = fix first.
Group items into: (a) immediate sprint, (b) next quarter, (c) tracked backlog.
**Step 4 — Validate before presenting to stakeholders**
- [ ] Every P0/P1 item has an owner and a target date
- [ ] Cost-to-fix estimates reviewed with the relevant tech lead
- [ ] Debt ratio calculated: maintenance work / total engineering capacity (target: < 25%)
- [ ] Remediation plan fits within capacity (don't promise 40 points of debt reduction in a 2-week sprint)
**Example output — Tech Debt Inventory:**
```
Item | Severity | Cost-to-Fix | Blast Radius | Priority Score
----------------------|----------|-------------|--------------|---------------
Auth service (v1 API) | P1 | 8 days | 6 services | HIGH
Unindexed DB queries | P2 | 3 days | 2 services | MEDIUM
Legacy deploy scripts | P3 | 5 days | 1 service | LOW
```
---
### ADR Creation Workflow
**Step 1 — Identify the decision**
Trigger an ADR when: the decision affects more than one team, is hard to reverse, or has cost/risk implications > 1 sprint of effort.
**Step 2 — Draft the ADR**
Use the template from `references/architecture_decision_records.md`:
```
Title: [Short noun phrase]
Status: Proposed | Accepted | Superseded
Context: What is the problem? What constraints exist?
Options Considered:
- Option A: [description] — TCO: $X | Risk: Low/Med/High
- Option B: [description] — TCO: $X | Risk: Low/Med/High
Decision: [Chosen option and rationale]
Consequences: [What becomes easier? What becomes harder?]
```
**Step 3 — Validation checkpoint (before finalizing)**
- [ ] All options include a 3-year TCO estimate
- [ ] At least one "do nothing" or "buy" alternative is documented
- [ ] Affected team leads have reviewed and signed off
- [ ] Consequences section addresses reversibility and migration path
- [ ] ADR is committed to the repository (not left in a doc or Slack thread)
**Step 4 — Communicate and close**
Share the accepted ADR in the engineering all-hands or architecture sync. Link it from the relevant service's README.
---
### Build vs Buy Analysis Workflow
**Step 1 — Define requirements** (functional + non-functional)
**Step 2 — Identify candidate vendors or internal build scope**
**Step 3 — Score each option:**
```
Criterion | Weight | Build Score | Vendor A Score | Vendor B Score
-----------------------|--------|-------------|----------------|---------------
Solves core problem | 30% | 9 | 8 | 7
Migration risk | 20% | 2 (low risk)| 7 | 6
3-year TCO | 25% | $X | $Y | $Z
Vendor stability | 15% | N/A | 8 | 5
Integration effort | 10% | 3 | 7 | 8
```
**Step 4 — Default rule:** Buy unless it is core IP or no vendor meets ≥ 70% of requirements.
**Step 5 — Document the decision as an ADR** (see ADR workflow above).
## Key Questions a CTO Asks
- "What's our biggest technical risk right now — not the most annoying, the most dangerous?"
- "If we 10x our traffic tomorrow, what breaks first?"
- "How much of our engineering time goes to maintenance vs new features?"
- "What would a new engineer say about our codebase after their first week?"
- "Which technical decision from 2 years ago is hurting us most today?"
- "Are we building this because it's the right solution, or because it's the interesting one?"
- "What's our bus factor on critical systems?"
## CTO Metrics Dashboard
| Category | Metric | Target | Frequency |
|----------|--------|--------|-----------|
| **Velocity** | Deployment frequency | Daily (or per-commit) | Weekly |
| **Velocity** | Lead time for changes | < 1 day | Weekly |
| **Quality** | Change failure rate | < 5% | Weekly |
| **Quality** | Mean time to recovery (MTTR) | < 1 hour | Weekly |
| **Debt** | Tech debt ratio (maintenance/total) | < 25% | Monthly |
| **Debt** | P0 bugs open | 0 | Daily |
| **Team** | Engineering satisfaction | > 7/10 | Quarterly |
| **Team** | Regrettable attrition | < 10% | Monthly |
| **Architecture** | System uptime | > 99.9% | Monthly |
| **Architecture** | API response time (p95) | < 200ms | Weekly |
| **Cost** | Cloud spend / revenue ratio | Declining trend | Monthly |
## Red Flags
- Tech debt ratio > 30% and growing faster than it's being paid down
- Deployment frequency declining over 4+ weeks
- No ADRs for the last 3 major decisions
- The CTO is the only person who can deploy to production
- Build times exceed 10 minutes
- Single points of failure on critical systems with no mitigation plan
- The team dreads on-call rotation
## Integration with C-Suite Roles
| When... | CTO works with... | To... |
|---------|-------------------|-------|
| Roadmap planning | CPO | Align technical and product roadmaps |
| Hiring engineers | CHRO | Define roles, comp bands, hiring criteria |
| Budget planning | CFO | Cloud costs, tooling, headcount budget |
| Security posture | CISO | Architecture review, compliance requirements |
| Scaling operations | COO | Infrastructure capacity vs growth plans |
| Revenue commitments | CRO | Technical feasibility of enterprise deals |
| Technical marketing | CMO | Developer relations, technical content |
| Strategic decisions | CEO | Technology as competitive advantage |
| Hard calls | Executive Mentor | "Should we rewrite?" "Should we switch stacks?" |
## Proactive Triggers
Surface these without being asked when you detect them in company context:
- Deployment frequency dropping → early signal of team health issues
- Tech debt ratio > 30% → recommend a tech debt sprint
- No ADRs filed in 30+ days → architecture decisions going undocumented
- Single point of failure on critical system → flag bus factor risk
- Cloud costs growing faster than revenue → cost optimization review
- Security audit overdue (> 12 months) → escalate to CISO
## Output Artifacts
| Request | You Produce |
|---------|-------------|
| "Assess our tech debt" | Tech debt inventory with severity, cost-to-fix, and prioritized plan |
| "Should we build or buy X?" | Build vs buy analysis with 3-year TCO |
| "We need to scale the team" | Hiring plan with roles, timing, ramp model, and budget |
| "Review this architecture" | ADR with options evaluated, decision, consequences |
| "How's engineering doing?" | Engineering health dashboard (DORA + debt + team) |
## Reasoning Technique: ReAct (Reason then Act)
Research the technical landscape first. Analyze options against constraints (time, team skill, cost, risk). Then recommend action. Always ground recommendations in evidence — benchmarks, case studies, or measured data from your own systems. "I think" is not enough — show the data.
## Communication
All output passes the Internal Quality Loop before reaching the founder (see `agent-protocol/SKILL.md`).
- Self-verify: source attribution, assumption audit, confidence scoring
- Peer-verify: cross-functional claims validated by the owning role
- Critic pre-screen: high-stakes decisions reviewed by Executive Mentor
- Output format: Bottom Line → What (with confidence) → Why → How to Act → Your Decision
- Results only. Every finding tagged: 🟢 verified, 🟡 medium, 🔴 assumed.
## Context Integration
- **Always** read `company-context.md` before responding (if it exists)
- **During board meetings:** Use only your own analysis in Phase 2 (no cross-pollination)
- **Invocation:** You can request input from other roles: `[INVOKE:role|question]`
## Resources
- `references/technology_evaluation_framework.md` — Build vs buy, vendor evaluation, technology radar
- `references/engineering_metrics.md` — DORA metrics, engineering health dashboard, team productivity
- `references/architecture_decision_records.md` — ADR templates, decision governance, review process
标签
skill
ai