systematic-debugging
Debug systematically to find root causes before fixing
Also available from: Cygnusfear, Cycleaddict, DMJGilbert, Doyajin174, obra, Asmayaseen, DYAI2025, ChrisWiles, davila7, CodingCossack
Random fixes waste time and create new bugs. This skill provides a four-phase debugging framework that finds root causes before implementing solutions, preventing wasted effort on symptom-based fixes.
Download the skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Toggle on and start using
Test it
Using "systematic-debugging". Test fails with git init error in wrong directory
Expected outcome:
- Phase 1 - Root Cause Investigation:
- 1. Error shows git init running in /source/code not /tmp
- 2. Traced backward: git init <- WorktreeManager <- Session.create <- test
- 3. Found: projectDir was empty string, resolved to process.cwd()
- 4. Root cause: test accessed context.tempDir before beforeEach initialized it
- Root cause identified: variable initialization order issue
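A bug like the one traced above stays hidden because an uninitialized value silently turns into an empty string. One way to make it fail loudly at the point of misuse is sketched below. `TestContext`, `init`, and `getTempDir` are hypothetical names chosen for illustration; they are not taken from the skill or from any test framework.

```typescript
// Hypothetical test context: reading the temp directory before setup throws
// instead of silently yielding an empty path.
class TestContext {
  private dir: string | undefined;

  // Intended to be called from the suite's setup hook (for example beforeEach).
  init(dir: string): void {
    this.dir = dir;
  }

  getTempDir(): string {
    if (this.dir === undefined || this.dir === "") {
      throw new Error("tempDir accessed before setup initialized it");
    }
    return this.dir;
  }
}
```

With a guard like this, the failure would surface in the test that read the value too early rather than several calls later inside git init.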
Using "systematic-debugging". Flaky test that sometimes misses tool results
Expected outcome:
- Diagnosis: Test uses arbitrary setTimeout waits instead of condition checking
- Solution: Replace timeouts with condition-based waiting
- BEFORE: await new Promise(r => setTimeout(r, 300))
- AFTER: await waitForEventCount(threadManager, id, 'TOOL_RESULT', 2)
- Result: pass rate improved from 60% to 100%
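The listing does not show how waitForEventCount is implemented. A generic polling helper of the kind it could be built on might look like the sketch below. `waitFor` and its parameters are assumptions made for illustration, not the skill's API.

```typescript
// Hypothetical helper: polls a condition until it returns a truthy value,
// or rejects once the timeout has elapsed.
async function waitFor<T>(
  condition: () => T | undefined | null | false,
  description: string,
  timeoutMs = 5000,
  intervalMs = 10,
): Promise<T> {
  const start = Date.now();
  while (true) {
    const result = condition();
    if (result) return result;
    if (Date.now() - start > timeoutMs) {
      throw new Error("Timed out after " + timeoutMs + "ms waiting for " + description);
    }
    // Yield to the event loop so the awaited work can make progress.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The test then waits for the state it actually needs, for example two recorded TOOL_RESULT events, instead of guessing how long that state takes to appear. The timeout only bounds the failure case, and its error message names the condition that never became true.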
Security Audit
Low Risk
Static analyzer flagged 126 patterns but 125 are false positives from markdown documentation (backticks are code formatting, not shell execution). One legitimate shell script (find-polluter.sh) uses npm test for test debugging - acceptable for a debugging skill. No cryptographic code exists despite scanner claims. Skill teaches debugging methodology safely.
Low Risk Issues (1)
Risk Factors
⚙️ External commands (1)
What You Can Build
Test Failure Investigation
When tests fail intermittently or unexpectedly, use this skill to systematically trace the failure back to its root cause rather than applying quick fixes.
Production Bug Resolution
When bugs appear in production, follow the four-phase process to understand the root cause before deploying fixes that might create new issues.
Multi-Component Debugging
When debugging complex systems with multiple layers, use defense-in-depth diagnostics to identify exactly which component is failing.
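As a rough illustration of diagnostics at layer boundaries, the sketch below wraps each layer so that its input, output, and any error are recorded. `instrument` and `Stage` are hypothetical names, and the layers are assumed to be plain synchronous functions. A real system would more likely log at process or service boundaries.

```typescript
type Stage<I, O> = (input: I) => O;

// Hypothetical wrapper: records what enters and leaves one layer.
function instrument<I, O>(name: string, stage: Stage<I, O>, log: string[]): Stage<I, O> {
  return (input: I): O => {
    log.push(name + " <- " + JSON.stringify(input));
    try {
      const output = stage(input);
      log.push(name + " -> " + JSON.stringify(output));
      return output;
    } catch (err) {
      // Record the failure at this boundary, then let it propagate unchanged.
      log.push(name + " !! " + String(err));
      throw err;
    }
  };
}
```

Reading the log from the top, the first boundary whose output looks wrong, or that ends in an error entry, is the component to investigate.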
Try These Prompts
I'm seeing this error: [paste error]. I want to fix it quickly, but I know I should investigate first. Help me follow Phase 1 of systematic debugging to understand the root cause before proposing any fixes.
This test fails intermittently: [paste test]. Sometimes it passes, sometimes it fails. Guide me through systematic debugging to find what's causing the flakiness, including how to add diagnostic instrumentation.
My system has [describe layers: e.g., CI workflow, build script, signing process]. Something is breaking but I don't know which layer. Show me how to add evidence-gathering instrumentation at each boundary to isolate the failing component.
I've tried [N] fixes already and none worked. Each fix revealed a new problem. Help me stop and question whether there's an architectural issue before attempting another fix. Guide me through re-analyzing from Phase 1.
Best Practices
- Always complete Phase 1 investigation before proposing any fix - understanding the root cause is faster than thrashing with random fixes
- Trace bugs backward through the call stack to find the original trigger, not just where the error appears
- After 3 failed fix attempts, stop and question the architecture rather than attempting more symptom fixes
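Backward tracing is easier when the operation that misbehaves records who called it. The sketch below captures a stack trace just before a risky operation runs. `captureStack` and `initRepository` are hypothetical names, and the `stack` property of `Error` is a widely supported but non-standard feature, so the code falls back to the bare label when it is absent.

```typescript
// Hypothetical helper: returns the current call stack, prefixed by a label.
function captureStack(label: string): string {
  const stack = new Error(label).stack;
  return stack !== undefined ? stack : label + " (no stack available)";
}

// Hypothetical operation that has been observed running with a bad argument.
function initRepository(directory: string, log: string[]): void {
  // Record the caller chain and the argument before doing anything irreversible.
  log.push(captureStack("initRepository called with directory=" + JSON.stringify(directory)));
  // ... the real operation would run here ...
}
```

The recorded stack shows each caller between the test and the failing operation, which gives a concrete chain to walk back toward the original trigger.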
Avoid
- Proposing fixes before completing root cause investigation - this treats symptoms, not causes
- Making multiple changes at once - you cannot isolate what worked and may introduce new bugs
- Skipping the creation of a failing test - untested fixes do not stick and regressions go unnoticed
Frequently Asked Questions
Isn't systematic debugging slower than just trying quick fixes?
What if I'm in an emergency and need a fix now?
How do I know when I've found the real root cause?
What if my first hypothesis is wrong?
When should I question the architecture versus keep debugging?
Does this work for non-code bugs like configuration issues?
Developer Details
Author
ZhanlinCui
License
MIT
Repository
https://github.com/ZhanlinCui/Ultimate-Agent-Skills-Collection/tree/main/systematic-debugging
Ref
main