slo-implementation

Safe 🌐 Network access⚙️ External commands

Define SLOs with error budgets and alerts

Reliability targets are often unclear and hard to measure. This skill provides SLI, SLO, and error budget templates with alerting guidance for implementing SRE practices.

Supports: Claude, Codex, Claude Code (CC)
📊 69 Adequate

1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle on and start using

Test it

Using "slo-implementation". Create an SLO plan for an API service

Expected outcome:

  • Availability SLO at 99.9 percent over 28 days
  • PromQL SLI expression for the ratio of successful requests to total requests
  • Error budget policy with actions at 50, 10, and 0 percent remaining
  • Fast burn rate alert threshold at 14.4x over a 5 minute window
  • Slow burn rate alert threshold at 6x over a 6 hour window
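
The 14.4x and 6x figures above are not arbitrary. In common SRE practice (an assumption here, since the page does not state the derivation), a burn rate threshold is chosen so that a fixed share of the error budget is spent within the alert window: 2 percent in 1 hour gives 14.4x and 5 percent in 6 hours gives 6x, both for a 30 day SLO period. A minimal sketch of that arithmetic, with `burn_rate_threshold` as a hypothetical helper name:

```python
def burn_rate_threshold(budget_fraction: float,
                        window_hours: float,
                        slo_period_days: float) -> float:
    """Burn rate that spends `budget_fraction` of the error budget
    within `window_hours`, for an SLO measured over `slo_period_days`.

    A burn rate of 1.0 spends exactly the whole budget over the full period.
    """
    period_hours = slo_period_days * 24
    return budget_fraction * period_hours / window_hours


# Assumed budget shares (2% and 5%) follow common SRE guidance, not this page.
fast_30d = burn_rate_threshold(0.02, 1, 30)  # roughly 14.4
slow_30d = burn_rate_threshold(0.05, 6, 30)  # roughly 6.0
fast_28d = burn_rate_threshold(0.02, 1, 28)  # roughly 13.44 for a 28 day period

print(round(fast_30d, 2), round(slow_30d, 2), round(fast_28d, 2))
```

Note that with the 28 day period listed above, the same 2 percent share works out to about 13.44x rather than 14.4x, so the thresholds may need adjusting to match the chosen SLO period.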

Using "slo-implementation". How do I calculate error budget for a 99.5 percent SLO

Expected outcome:

  • Error budget is 0.5 percent, which is about 3.6 hours per 30 day month
  • Error budget formula: 1 minus SLO target
  • Track remaining budget against actual error rate
  • Set alert thresholds for fast and slow burn rates
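
The arithmetic behind these bullets can be sketched in a few lines. The function names below are hypothetical, and the 30 day month is an assumption used to reproduce the 3.6 hour figure:

```python
def error_budget_fraction(slo_target: float) -> float:
    """Error budget as a fraction: 1 minus the SLO target (e.g. 0.995 -> 0.005)."""
    return 1.0 - slo_target


def allowed_downtime_minutes(slo_target: float, period_days: float = 30) -> float:
    """Minutes of full downtime the budget allows over the period."""
    return error_budget_fraction(slo_target) * period_days * 24 * 60


def budget_remaining(slo_target: float, observed_error_rate: float) -> float:
    """Fraction of the budget still unspent, given the error rate observed
    over the whole SLO period. Negative means the budget is overspent."""
    return 1.0 - observed_error_rate / error_budget_fraction(slo_target)


minutes = allowed_downtime_minutes(0.995)  # about 216 minutes, i.e. 3.6 hours
left = budget_remaining(0.995, 0.001)      # about 0.8, i.e. 80% of budget left
print(round(minutes / 60, 2), round(left, 2))
```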

Using "slo-implementation". What are good SLO targets for a customer facing API

Expected outcome:

  • Availability SLO of 99.9 percent is common for APIs
  • Latency SLO of p99 under 500 milliseconds for most endpoints
  • Error budget of 0.1 percent allows about 43 minutes of downtime per 30 day month
  • Consider stricter SLOs for critical payment endpoints
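
To make the latency target concrete, here is a minimal sketch of checking a p99 objective against a batch of request latencies. It uses the nearest-rank percentile definition, which is one of several in use; in a Prometheus setup the percentile would normally come from histogram buckets instead. All names are hypothetical.

```python
import math


def percentile_nearest_rank(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value such that at least
    `pct` percent of the samples are less than or equal to it."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def latency_slo_met(latencies_ms: list[float],
                    threshold_ms: float = 500.0,
                    pct: float = 99.0) -> bool:
    """True when the chosen percentile is under the latency threshold."""
    return percentile_nearest_rank(latencies_ms, pct) < threshold_ms


# 1 slow request in 100 still meets a p99 target; 2 in 100 does not.
one_slow = [100.0] * 99 + [900.0]
two_slow = [100.0] * 98 + [900.0] * 2
print(latency_slo_met(one_slow), latency_slo_met(two_slow))
```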

Security Audit

Safe
v4 • 1/17/2026

This skill contains only documentation and YAML/PromQL templates. No executable code, file system access, network calls, or command execution. All static findings are false positives where markdown code block delimiters were misidentified as shell commands and percentage values as cryptographic algorithms.

  • Files scanned: 2
  • Lines analyzed: 521
  • Findings: 2
  • Total audits: 4

Audited by: claude

Quality Score

  • Architecture: 38
  • Maintainability: 100
  • Content: 85
  • Community: 20
  • Security: 100
  • Spec Compliance: 91

What You Can Build

Set service reliability targets

Define SLOs and error budgets for critical services and align teams on reliability goals.

Add SLO alerts

Create burn rate alerts and recording rules to detect SLO violations early.

Review reliability trends

Use SLO review cadence guidance to track reliability goals across releases.

Try These Prompts

SLO basics
Explain the difference between SLI, SLO, and SLA for a web API and suggest one example SLI.
SLO target setup
Propose a 99.9 percent availability SLO for an API and show a PromQL SLI expression.
Error budget policy
Create an error budget policy with actions at 50, 10, and 0 percent remaining.
Burn rate alerts
Draft multi-window burn rate alerts for a 99.9 percent availability SLO using Prometheus rules.
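
As an illustration of what an error budget policy with tiers at 50, 10, and 0 percent remaining might look like, here is a minimal sketch. Only the three thresholds come from this page; the action texts and the `policy_action` name are assumptions chosen for the example, and a real policy would be agreed on by the teams involved.

```python
def policy_action(remaining_percent: float) -> str:
    """Map the remaining error budget (in percent) to a policy tier.

    The thresholds (50, 10, 0) match the prompt above; the actions
    themselves are illustrative placeholders, not the skill's output.
    """
    if remaining_percent <= 0:
        return "freeze feature releases and work only on reliability"
    if remaining_percent <= 10:
        return "require extra review for risky changes and prioritize reliability fixes"
    if remaining_percent <= 50:
        return "review recent incidents and slow down risky rollouts"
    return "ship normally"


for remaining in (80, 40, 5, 0):
    print(remaining, "->", policy_action(remaining))
```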

Best Practices

  • Start with user-facing services and simple SLIs that reflect user experience
  • Use multi-window burn rate alerts to reduce alert noise while detecting real issues
  • Review SLOs on a regular cadence and adjust targets based on actual performance
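
The multi-window idea mentioned above can be sketched as follows: an alert fires only when the burn rate exceeds the threshold in both a long window and a short window, so a problem that has already stopped does not keep paging. This is a simplified model under assumed names; in practice the two error ratios would come from Prometheus recording rules rather than function arguments.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How fast the budget is being spent: observed error ratio divided
    by the error budget (1 minus the SLO target)."""
    return error_ratio / (1.0 - slo_target)


def should_alert(long_window_error_ratio: float,
                 short_window_error_ratio: float,
                 slo_target: float,
                 threshold: float) -> bool:
    """Fire only if both windows burn faster than the threshold."""
    return (burn_rate(long_window_error_ratio, slo_target) > threshold
            and burn_rate(short_window_error_ratio, slo_target) > threshold)


# 99.9% SLO with a 14.4x threshold: 2% errors is roughly a 20x burn rate.
ongoing = should_alert(0.02, 0.02, 0.999, 14.4)      # both windows hot
recovered = should_alert(0.02, 0.0005, 0.999, 14.4)  # short window is calm again
print(ongoing, recovered)
```

The short window acts as a reset condition: once the recent error ratio drops, the alert clears even though the long window still reflects the earlier spike.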

Avoid

  • Setting 100 percent SLOs, which leave no error budget and no room for innovation
  • Using only internal metrics that do not reflect actual user impact
  • Ignoring error budget status when shipping risky changes, which leads to reliability incidents

Frequently Asked Questions

What platforms does this skill work with?
It is platform agnostic but examples use Prometheus query syntax and Grafana dashboard patterns.
What are the limits of this skill?
It provides guidance and templates only, not automated SLO calculations or deployments.
Can I integrate this with existing monitoring?
Yes, map the SLI formulas to your current metrics and adapt the YAML templates to your alerting system.
Does this skill access or store my data?
No, it only provides static guidance and examples without any data access.
What if my SLOs are frequently violated?
Consider lowering targets to achievable levels, refining SLIs so they reflect real user impact, and using error budget actions to prioritize reliability work.
How does this compare to vendor SLO tools?
It is a lightweight framework that complements vendor SLO products with clear definitions and implementation patterns.

Developer Details

File structure

📄 SKILL.md