Runs the minimal set of checks—tests, builds, manual verifications, or environment-specific validations—that confirm a task is truly complete before it is marked done. This practice prevents the common pattern where 'done' means 'written' rather than 'working in production,' and it creates a shared definition of completion across the team.
Use cases
- Wrapping up a feature ticket that has passed code review but has not been smoke-tested in the target environment
- Completing a dependency upgrade or configuration change that could behave differently in staging versus locally
- Handing off work to a teammate and needing to confirm that the integration points actually work
- Before merging a pull request that touches infrastructure or deployment configuration
- After applying a hotfix directly to production and needing to confirm the fix resolved the incident
Key features
- List all the verification commands or actions that would prove the task works end-to-end in its target environment
- Run each verification step and capture the actual output, comparing it to the expected output rather than assuming success
- Inspect log output and error traces for any unexpected warnings or degraded behavior even if the primary check passes
- Mark the task complete only after every verification step, not just the happy-path check, produces the expected result
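The steps above can be sketched as a small runner. This is a minimal illustration under stated assumptions, not a prescribed implementation: the names Step, Result, run_step, and verify are hypothetical, the "expected output" is modeled as a substring of stdout, and the log inspection is a crude keyword scan for "warning" or "error". It also assumes Python 3.9 or later and treats any flagged log line as blocking completion, which is a conservative choice rather than something the practice requires.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    command: list[str]  # argv list, executed without a shell
    expected: str       # substring that must appear in stdout (assumption)


@dataclass
class Result:
    step: Step
    passed: bool
    output: str          # actual captured output, kept as evidence
    warnings: list[str]  # suspicious log lines, even if the check passed


def run_step(step: Step, timeout: float = 60.0) -> Result:
    """Run one verification command and compare actual to expected output."""
    try:
        proc = subprocess.run(
            step.command, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A command that cannot run or hangs is a failed step, not a skip.
        return Result(step, False, f"{type(exc).__name__}: {exc}", [])
    combined = proc.stdout + proc.stderr
    # Crude keyword scan; a real setup would use its own log conventions.
    warnings = [
        line for line in combined.splitlines()
        if "warning" in line.lower() or "error" in line.lower()
    ]
    passed = proc.returncode == 0 and step.expected in proc.stdout
    return Result(step, passed, combined.strip(), warnings)


def verify(steps: list[Step]) -> tuple[bool, list[Result]]:
    """Return (done, results); done requires every step clean, and at least one step."""
    results = [run_step(s) for s in steps]
    done = bool(results) and all(r.passed and not r.warnings for r in results)
    return done, results


# Hypothetical steps that stand in for real smoke tests.
steps = [
    Step("smoke test", [sys.executable, "-c", "print('status: ok')"], "status: ok"),
    Step(
        "noisy check",
        [sys.executable, "-c",
         "import sys; print('status: ok'); "
         "print('WARNING: slow response', file=sys.stderr)"],
        "status: ok",
    ),
]
done, results = verify(steps)
for r in results:
    print("PASS" if r.passed else "FAIL", r.step.name, r.warnings)
print("mark task done:", done)
```

In this sketch the second step passes its primary check but still prevents the task from being marked done, because its stderr contains a warning that someone should review first.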
When to Use This Skill
- When a feature has passed code review but has not been tested in the target deployment environment
- When handing off work to another engineer and the integration points still need to be verified
- When a change touches infrastructure, configuration, or deployment pipelines where local behavior may differ from production
Expected Output
A verification checklist with each step marked pass or fail, accompanied by the actual output or a screenshot showing that the step behaved as expected.
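One possible shape for that checklist, shown as a short sketch: each item carries its pass/fail status together with the evidence that was captured. The ChecklistItem type, the render_checklist function, and the example step names are all hypothetical, and the text layout is only one of many reasonable formats. Python 3.9 or later is assumed.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    step: str
    passed: bool
    evidence: str  # actual output, or a path to a screenshot


def render_checklist(items: list[ChecklistItem]) -> str:
    """Format items as a plain-text checklist with evidence under each step."""
    lines = []
    for item in items:
        mark = "PASS" if item.passed else "FAIL"
        lines.append(f"[{mark}] {item.step}")
        # Make missing evidence visible instead of silently omitting it.
        for ev_line in item.evidence.splitlines() or ["(no evidence captured)"]:
            lines.append(f"       {ev_line}")
    complete = bool(items) and all(i.passed for i in items)
    lines.append(f"Task complete: {'yes' if complete else 'no'}")
    return "\n".join(lines)


# Hypothetical example items.
items = [
    ChecklistItem("Health endpoint responds in staging", True, "HTTP 200\nstatus: ok"),
    ChecklistItem("Feature flag can be toggled off", False, ""),
]
report = render_checklist(items)
print(report)
```

Keeping the evidence next to each status means a reviewer can check the claim without rerunning the step.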
Frequently Asked Questions
- How do I define the right verification steps for a task?
  - Ask: what would a skeptical reviewer need to see to believe this is done? Include at least one step that runs the code end-to-end, not just unit tests or linter checks.
- What if verification requires credentials or infrastructure I do not have access to?
  - Escalate before marking the task done. Having a different person run the verification is still better than shipping unverified work. Document any verification gaps as follow-up items.
- Does this apply to small, low-risk changes?
  - Yes, but the verification set can be small. For a low-risk change, running the relevant unit tests and checking that the feature flag can be toggled is enough to count as verification.
Related
Evaluation and benchmarking
Builds evaluation suites with ground-truth answers, automated scoring, and regression detection so you can measure whether model or prompt changes actually improve outcomes before shipping. Without systematic evaluation, teams ship changes that seem better anecdotally but may degrade specific edge cases silently.
Finishing a development branch
Systematically closes out a development branch by running verification, cleaning up the commit history, pushing with proper tracking, and making an explicit choice between merge, squash, or follow-up tickets. This prevents the common pattern of abandoned branches, stale PRs, and lost context when work is not deliberately concluded.
Observability baselines
Establishes golden signals (latency, traffic, errors, saturation), SLO windows, and dashboard checks before agents automate deployments so that 'healthy' and 'degraded' have measurable definitions rather than subjective interpretations. This is essential when AI agents are managing deploys because agents need objective metrics to make decisions, not human gut feelings.