slo-implementation

Name: slo-implementation
Author: wshobson

Safe 🌐 Network access ⚙️ External commands

Define SLOs with error budgets and alerts

Also available from: sickn33

Reliability targets are often unclear and hard to measure. This skill provides SLI, SLO, and error budget templates with alerting guidance for implementing SRE practices.

Supports: Claude, Codex, Code (CC)

📊 69 Adequate

Download the skill ZIP

Upload in Claude

Go to Settings → Capabilities → Skills → Upload skill

Toggle on and start using

Test it

Using "slo-implementation". Create an SLO plan for an API service

Expected outcome:

Availability SLO at 99.9 percent over 28 days
PromQL SLI expression that calculates the ratio of successful requests to total requests
Error budget policy with actions at 50, 10, and 0 percent remaining
Fast burn rate alert threshold at 14.4x over a 5-minute window
Slow burn rate alert threshold at 6x over a 6-hour window
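The burn rate multipliers above can be turned into concrete error-rate thresholds. The following is a minimal sketch, assuming the usual definition of burn rate as the observed error rate divided by the error budget; the function names are hypothetical and not taken from the skill itself.

```python
def burn_rate_error_threshold(slo_target: float, burn_rate: float) -> float:
    """Error rate at which the budget burns `burn_rate` times faster than sustainable.

    Assumes burn rate = observed error rate / (1 - SLO target).
    """
    return burn_rate * (1.0 - slo_target)


def budget_consumed(burn_rate: float, window_hours: float, slo_period_days: float) -> float:
    """Fraction of the whole error budget spent if `burn_rate` holds for `window_hours`."""
    return burn_rate * window_hours / (slo_period_days * 24.0)


# Thresholds for the 99.9 percent availability SLO from the example above.
fast = burn_rate_error_threshold(0.999, 14.4)
slow = burn_rate_error_threshold(0.999, 6.0)

print(f"fast burn fires above {fast:.2%} errors")
print(f"slow burn fires above {slow:.2%} errors")
print(f"6x for 6 hours spends {budget_consumed(6.0, 6.0, 28):.1%} of a 28-day budget")
```

For a 99.9 percent target this gives thresholds of about 1.44 percent and 0.6 percent errors, which is the number an alert expression would compare against.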

Using "slo-implementation". How do I calculate the error budget for a 99.5 percent SLO?

Expected outcome:

Error budget is 0.5 percent, or 3.6 hours per 30-day month
Error budget formula: 1 minus SLO target
Track remaining budget against actual error rate
Set alert thresholds for fast and slow burn rates
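The arithmetic behind these figures is short enough to check by hand. Here is a small sketch, assuming a 30-day month and an event-based SLI (good events over total events); the function names and the sample counts are illustrative only.

```python
def error_budget(slo_target: float) -> float:
    """Error budget as a fraction: 1 minus the SLO target."""
    return 1.0 - slo_target


def allowed_downtime_hours(slo_target: float, period_days: float = 30.0) -> float:
    """Downtime the budget allows over the period (a 30-day month is assumed)."""
    return error_budget(slo_target) * period_days * 24.0


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent; negative once the budget is overspent."""
    if total_events == 0:
        return 1.0  # no traffic, so nothing has been spent
    error_rate = 1.0 - good_events / total_events
    return 1.0 - error_rate / error_budget(slo_target)


print(f"{allowed_downtime_hours(0.995):.1f} hours of downtime allowed")
print(f"{budget_remaining(0.995, 9_975, 10_000):.0%} of the budget remaining")
```

With 25 failed requests out of 10,000, the observed error rate is 0.25 percent, which is half of the 0.5 percent budget.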

Using "slo-implementation". What are good SLO targets for a customer-facing API?

Expected outcome:

Availability SLO of 99.9 percent is common for APIs
Latency SLO at p99 under 500 milliseconds for most endpoints
Error budget of 0.1 percent allows about 43 minutes of downtime per 30-day month
Consider stricter SLOs for critical payment endpoints
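A latency target such as "p99 under 500 milliseconds" is roughly the same as saying that at least 99 percent of requests finish in under 500 ms, which makes it measurable as a good-over-total ratio. The sketch below assumes raw latency samples are available in memory; the function names and sample data are hypothetical.

```python
from typing import Sequence


def latency_sli(latencies_ms: Sequence[float], threshold_ms: float = 500.0) -> float:
    """Fraction of requests that completed faster than the threshold."""
    if not latencies_ms:
        raise ValueError("no latency samples to evaluate")
    fast = sum(1 for value in latencies_ms if value < threshold_ms)
    return fast / len(latencies_ms)


def meets_latency_slo(
    latencies_ms: Sequence[float],
    threshold_ms: float = 500.0,
    target: float = 0.99,
) -> bool:
    """True when at least `target` of requests beat the threshold."""
    return latency_sli(latencies_ms, threshold_ms) >= target


# Illustrative data: 995 fast requests and 5 slow ones.
samples = [120.0] * 995 + [900.0] * 5
print(latency_sli(samples), meets_latency_slo(samples))
```

In a production setup the same ratio would normally come from histogram buckets in the monitoring system rather than from raw samples.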

Security Audit

Safe

v4 • 1/17/2026

This skill contains only documentation and YAML/PromQL templates. It includes no executable code and performs no file system access, network calls, or command execution. All static findings are false positives: markdown code block delimiters were misidentified as shell commands, and percentage values were misidentified as cryptographic algorithms.

Files scanned

521

Lines analyzed

findings

Total audits

Audited by: claude

Quality Score

Architecture

100

Maintainability

Content

Community

100

Security

Spec Compliance

What You Can Build

Set service reliability targets

Define SLOs and error budgets for critical services and align teams on reliability goals.

Add SLO alerts

Create burn rate alerts and recording rules to detect SLO violations early.

Review reliability trends

Use SLO review cadence guidance to track reliability goals across releases.

Try These Prompts

SLO basics

Explain the difference between SLI, SLO, and SLA for a web API and suggest one example SLI.

SLO target setup

Propose a 99.9 percent availability SLO for an API and show a PromQL SLI expression.
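One possible shape for such an expression is the ratio of non-error request rate to total request rate. The sketch below renders it as a string from Python. The metric name `http_requests_total`, the `status` label, and the choice to count every non-5xx response as good are all assumptions that must be adapted to the metrics a service actually exposes.

```python
def availability_sli_promql(
    metric: str = "http_requests_total",   # assumed counter name
    window: str = "28d",                   # SLO window as a PromQL duration
    good_matcher: str = 'status!~"5.."',   # assumed label matcher for good requests
) -> str:
    """Render a PromQL ratio of good requests to all requests over the window."""
    good = f"sum(rate({metric}{{{good_matcher}}}[{window}]))"
    total = f"sum(rate({metric}[{window}]))"
    return f"{good} / {total}"


print(availability_sli_promql())
```

The rendered expression evaluates to a value between 0 and 1 that can be compared directly against the 0.999 target.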

Error budget policy

Create an error budget policy with actions at 50, 10, and 0 percent remaining.
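A policy like this is essentially a lookup from remaining budget to a tier. The following is a hedged sketch using the 50, 10, and 0 percent boundaries named in the prompt; the tier labels are hypothetical placeholders, and the actual actions for each tier are for the team to define.

```python
def policy_tier(remaining: float) -> str:
    """Map the remaining error budget (as a fraction, 1.0 = untouched) to a policy tier.

    Tier names are placeholders, not actions prescribed by the skill.
    """
    if remaining <= 0.0:
        return "exhausted"   # 0 percent remaining or overspent
    if remaining <= 0.10:
        return "critical"    # 10 percent or less remaining
    if remaining <= 0.50:
        return "warning"     # 50 percent or less remaining
    return "healthy"


for value in (0.75, 0.50, 0.10, 0.0):
    print(f"{value:.0%} remaining -> {policy_tier(value)}")
```

Treating each boundary as inclusive means the stricter tier applies the moment the budget reaches that level, which is a design choice rather than a requirement.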

Burn rate alerts

Draft multi-window burn rate alerts for a 99.9 percent availability SLO using Prometheus rules.
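The logic of a multi-window alert can be sketched outside Prometheus: a rule fires only when both a long window and a short window exceed the same burn rate threshold, so a brief spike or an already-recovered incident does not page anyone. The window pairings below (1h with 5m, 6h with 30m) follow a common convention and are an assumption here, as are all names in the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurnRateRule:
    name: str
    threshold: float    # burn rate multiplier that must be exceeded
    long_window: str    # window showing that the burn is sustained
    short_window: str   # window showing that the burn is still happening


# Assumed pairings; adjust to the alerting policy actually in use.
RULES = (
    BurnRateRule("fast", 14.4, "1h", "5m"),
    BurnRateRule("slow", 6.0, "6h", "30m"),
)


def burn_rate(error_rate: float, slo_target: float) -> float:
    """Observed error rate expressed as a multiple of the error budget."""
    return error_rate / (1.0 - slo_target)


def firing_rules(error_rates: dict[str, float], slo_target: float = 0.999) -> list[str]:
    """Names of rules whose long and short windows both exceed the threshold."""
    fired = []
    for rule in RULES:
        long_burn = burn_rate(error_rates[rule.long_window], slo_target)
        short_burn = burn_rate(error_rates[rule.short_window], slo_target)
        if long_burn > rule.threshold and short_burn > rule.threshold:
            fired.append(rule.name)
    return fired


# Illustrative error rates per window, keyed by window name.
observed = {"5m": 0.02, "1h": 0.02, "30m": 0.02, "6h": 0.002}
print(firing_rules(observed))
```

In this sample only the fast rule fires, because the 6-hour error rate is still well below six times the budget.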

Best Practices

Start with user-facing services and simple SLIs that reflect user experience
Use multi-window burn rate alerts to reduce alert noise while detecting real issues
Review SLOs on a regular cadence and adjust targets based on actual performance

Avoid

Setting 100 percent SLOs, which leaves no error budget and no room for innovation
Using only internal metrics that do not reflect actual user impact
Ignoring error budget status when shipping risky changes, which leads to reliability incidents

Frequently Asked Questions

What platforms does this skill work with?

It is platform-agnostic, but the examples use Prometheus query syntax and Grafana dashboard patterns.

What are the limits of this skill?

It provides guidance and templates only, not automated SLO calculations or deployments.

Can I integrate this with existing monitoring?

Yes, map the SLI formulas to your current metrics and adapt the YAML templates to your alerting system.

Does this skill access or store my data?

No, it only provides static guidance and examples without any data access.

What if my SLOs are frequently violated?

Consider lowering targets to an achievable level, refining the SLIs, and using error budget actions to prioritize reliability work.

How does this compare to vendor SLO tools?

It is a lightweight framework that complements vendor SLO products with clear definitions and implementation patterns.

Developer Details

Author

wshobson

License

MIT

Repository

https://github.com/wshobson/agents/tree/main/plugins/observability-monitoring/skills/slo-implementation

Ref

main

File structure

📄 SKILL.md
