Defines golden signals, SLO windows, and dashboard checks before agents automate deploys—so assistants know what "healthy" means instead of guessing from noisy logs.
Use cases
- On-call prep
- Canary analysis
- Post-deploy verification
Key features
- Pick SLIs tied to user pain
- Set error/latency budgets
- Wire alerts to runbooks
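As an illustration of the three features above, here is a minimal sketch of an error-budget check that an agent could run before or after a deploy. Everything named here is hypothetical: the `SLO` class, the `error_budget_remaining` and `deploy_is_healthy` helpers, the 25% `min_budget` threshold, and the example runbook URL are assumptions for illustration, not part of any particular monitoring product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """Hypothetical availability SLO measured over a rolling window."""
    name: str
    target: float       # e.g. 0.999 means 99.9% of requests must succeed
    window_days: int    # length of the rolling SLO window
    runbook_url: str    # placeholder link an alert would point to


def error_budget_remaining(slo: SLO, total: int, failed: int) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    if total == 0:
        return 1.0  # no traffic in the window, so nothing has been spent
    allowed_failures = (1.0 - slo.target) * total
    if allowed_failures == 0:
        # A 100% target leaves no budget: any failure overspends it.
        return 1.0 if failed == 0 else float("-inf")
    return 1.0 - failed / allowed_failures


def deploy_is_healthy(slo: SLO, total: int, failed: int,
                      min_budget: float = 0.25) -> bool:
    """Treat a deploy as healthy only if enough budget is left (assumed threshold)."""
    return error_budget_remaining(slo, total, failed) >= min_budget


if __name__ == "__main__":
    slo = SLO(
        name="checkout-availability",
        target=0.999,
        window_days=30,
        runbook_url="https://example.com/runbooks/checkout",  # placeholder
    )
    remaining = error_budget_remaining(slo, total=1_000_000, failed=400)
    print(f"{slo.name}: {remaining:.0%} of error budget left")
    if not deploy_is_healthy(slo, total=1_000_000, failed=400):
        print(f"Budget low, see runbook: {slo.runbook_url}")
```

Keeping the runbook link on the SLO object means any alert derived from it can carry the link along, which is one simple way to wire alerts to runbooks. A latency budget could follow the same shape by counting requests slower than a threshold as "failed".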
Related
Incident response
Structures on-call work: timeline, blast radius, mitigations, and customer comms—so fixes stay coordinated instead of chaotic thread hopping.
Performance profiling
Finds real bottlenecks using traces, flame graphs, and system metrics before rewriting code—so optimizations target measured latency, not guesses.
Structured logging
Defines a small set of log fields (request id, user id, feature flag, latency bucket) so production debugging does not depend on grep across inconsistent printf strings.
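The four fields named above could be emitted as one JSON object per log line. The sketch below uses only the Python standard library; the function names, the bucket boundaries, and the field spellings (`request_id`, `user_id`, `feature_flag`, `latency_bucket`) are assumptions chosen for illustration.

```python
import json
import logging


def latency_bucket(latency_ms: float) -> str:
    """Map a raw latency to a coarse bucket (boundaries are assumed)."""
    if latency_ms < 100:
        return "lt_100ms"
    if latency_ms < 500:
        return "100_500ms"
    if latency_ms < 2000:
        return "500ms_2s"
    return "gte_2s"


def format_event(message: str, request_id: str, user_id: str,
                 feature_flag: str, latency_ms: float) -> str:
    """Serialize one log event with a fixed, queryable set of fields."""
    event = {
        "message": message,
        "request_id": request_id,
        "user_id": user_id,
        "feature_flag": feature_flag,
        "latency_bucket": latency_bucket(latency_ms),
    }
    # sort_keys keeps the field order stable across lines
    return json.dumps(event, sort_keys=True)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    logging.getLogger("app").info(
        format_event("checkout completed", "req-1", "user-42",
                     "new_checkout_flow", 230.0)
    )
```

Because every line carries the same keys, a query can filter on `request_id` or `feature_flag` directly instead of pattern-matching free-form strings.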