| Component | Investment | New use case impact | Effort per case |
|---|---|---|---|
| Platform engine: build once, reused unchanged by every use case | |||
| State machine + orchestrator: 10 states, 4 phases, run lifecycle | one-time build | No change. Same engine for any ticket, service, or domain. | Zero |
| Policy guard (28 gates): authority, safety, budget, approval | one-time build | No change. Gate logic is universal. Thresholds are config. | Zero |
| Agent runtime + THINK step: ReAct loop, evidence chain, budgets | one-time build | No change. Executes any agent role for any problem. | Zero |
| MCP gateway + tool execution: dispatch, sanitize, audit, idempotency | one-time build | No change. Runs any registered tool. | Zero |
| Content pipeline + retrieval: ingest, chunk, embed, hybrid search, rerank | one-time build | No change. Processes any document, searches any content. | Zero |
| Knowledge creator + feedback loop: episode, KB draft, skill draft, graph update | one-time build | No change. Learns from any resolved ticket. | Zero |
| UI, streaming, observability: live feed, audit, tracing, metrics | one-time build | No change. Renders any event from any investigation. | Zero |
| Crash recovery + caching: watchdog, heartbeat, 9 cache layers | one-time build | No change. | Zero |
| Data sanitizer (PII/credentials): pre-LLM scanning and redaction | one-time build | No change. Patterns are universal. | Zero |
| Zero-knowledge strategy: cold start, architecture discovery | one-time build | No change. Works for any unknown service automatically. | Zero |
| Per-use-case configuration: add a new service with at most about 2 hours of required setup (registry, command templates, graph seed), no code; optional items add more | |||
| Service connection registry: how to connect (host, auth, method) | configure | One database row per service: connection type, host, auth reference, allowed commands, log locations. | 15 – 30 min |
| Command templates: diagnostic + remediation commands | configure | JSON entries: command ID, parameterized template, risk level. Agent never writes raw bash. | 15 – 60 min |
| Graph seed data: service, team, dependencies | configure | CSV upload or API call. After first ticket resolves, graph grows automatically. | 15 – 30 min |
| KB documents (runbooks): troubleshooting guides, SOPs | configure | Import from Confluence/wiki or write new. Optional on day 1; the system works without them. | 0 – 4 hr |
| Skills (SKILL.md): procedural investigation guidance | configure | Write or review auto-generated drafts. Optional on day 1; the system runs skillless. | 0 – 8 hr |
| Policy overrides: thresholds, approvals, time windows | configure | YAML or DB row per tenant/service. Optional; defaults work. | 0 – 10 min |
| Metric baselines: normal CPU, memory, connections | configure | DB row per service + metric. Can auto-learn from monitoring history. | 0 – 10 min |
| Self-learning: improves with every resolved ticket; the only ongoing effort is reviewing auto-drafted docs | |||
| Episodic memory: past incident records | auto-learns | Every resolution becomes a reference for future agents. Dead ends are remembered. | Zero |
| Graph failure patterns: failure frequency, fix success rates | auto-learns | Agent tries the most common failure first. Fix success rates guide resolution. | Zero |
| KB + skill auto-drafts: runbooks and skills from resolutions | auto-learns | System drafts new docs after every skillless run. Human reviews and publishes. | 15 min review |
| Quality scores: KB article effectiveness tracking | auto-learns | Helpful articles get boosted. Misleading articles get penalized and flagged. | Zero |
| Confidence calibration: per-service threshold tuning | auto-learns | After 5+ runs, thresholds auto-adjust based on actual accuracy per service. | Zero |
| Repository profiles: code structure maps | auto-learns | Auto-generated from GitHub repo scanning. Improves code analysis accuracy. | Zero |
| Investigation patterns: optimal diagnostic sequences | auto-learns | Weekly analysis discovers recurring patterns. Suggests new skills automatically. | Zero |
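
The "Command templates" row describes JSON entries with a command ID, a parameterized template, and a risk level, so that the agent never writes raw bash. A minimal sketch of how such an entry might be loaded and rendered is shown below. The field names (`id`, `template`, `risk`, `params`), the per-parameter regex allow-list, and the `approved` flag are all assumptions made for illustration; they are not the platform's actual schema or policy logic.

```python
import json
import re
import shlex
from dataclasses import dataclass

# Hypothetical template entries. Field names and the regex allow-lists
# are illustrative assumptions, not the platform's real schema.
TEMPLATES_JSON = """
[
  {"id": "disk_usage", "template": "df -h {path}", "risk": "low",
   "params": {"path": "^/[A-Za-z0-9_./-]*$"}},
  {"id": "restart_service", "template": "systemctl restart {unit}", "risk": "high",
   "params": {"unit": "^[A-Za-z0-9_.@-]+$"}}
]
"""


@dataclass(frozen=True)
class CommandTemplate:
    id: str
    template: str
    risk: str
    params: dict  # parameter name -> regex the value must fully match


def load_templates(raw: str) -> dict:
    """Parse the JSON entries into a lookup keyed by command ID."""
    return {entry["id"]: CommandTemplate(**entry) for entry in json.loads(raw)}


def render(templates: dict, command_id: str, args: dict, approved: bool = False) -> str:
    """Build a command string from a registered template only."""
    tpl = templates[command_id]  # KeyError for unregistered commands
    if tpl.risk == "high" and not approved:
        # Assumed rule: high-risk commands need an explicit approval.
        raise PermissionError(f"{command_id} requires approval")
    if set(args) != set(tpl.params):
        raise ValueError(f"expected parameters {sorted(tpl.params)}")
    quoted = {}
    for name, pattern in tpl.params.items():
        value = args[name]
        if not re.fullmatch(pattern, value):
            raise ValueError(f"parameter {name!r} rejected by allow-list")
        quoted[name] = shlex.quote(value)  # defense in depth after validation
    return tpl.template.format(**quoted)


templates = load_templates(TEMPLATES_JSON)
print(render(templates, "disk_usage", {"path": "/var/log"}))
```

In this sketch the agent only ever supplies a command ID and parameter values. Anything that is not registered, fails its allow-list, or carries a high risk level without approval is rejected before a command string exists.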