🔭

service-mesh-observability

Name: service-mesh-observability
Author: wshobson

Safe 🌐 Network access ⚙️ External commands 📁 Filesystem access

Set up service mesh observability fast

Also available from: sickn33

Service mesh telemetry is hard to wire across traces, metrics, and dashboards. This skill provides ready templates and queries for Istio and Linkerd observability.

Supports: Claude, Codex, Code (CC)

📊 69 Adequate

Download the skill ZIP

Upload in Claude

Go to Settings → Capabilities → Skills → Upload skill

Toggle on and start using

Test it

Using "service-mesh-observability". Give me a concise checklist to enable Istio observability with Prometheus and Jaeger.

Expected outcome:

Deploy Prometheus with the Istio scrape config and ServiceMonitor.
Enable tracing in Istio and point its Zipkin-compatible tracer at the Jaeger collector.
Install Jaeger all-in-one and expose the UI port.
Add PromQL panels for request rate, errors, and P99 latency.
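The three panels in the last step can be sketched as PromQL strings. This is a minimal illustration, assuming Istio's standard metrics istio_requests_total and istio_request_duration_milliseconds_bucket are being scraped; the helper name golden_signal_queries and its parameters are hypothetical and not part of the skill.

```python
def golden_signal_queries(
    window: str = "5m",
    group_by: str = "destination_service_name",
) -> dict[str, str]:
    """Build PromQL strings for request rate, error ratio, and P99 latency.

    Assumes Istio's standard metric names; adjust them if your mesh
    exposes different ones.
    """
    # Requests per second, per destination service.
    total = f"sum(rate(istio_requests_total[{window}])) by ({group_by})"
    # Only responses with a 5xx status code.
    errors = (
        f'sum(rate(istio_requests_total{{response_code=~"5.."}}[{window}]))'
        f" by ({group_by})"
    )
    # 99th percentile latency from the request duration histogram (milliseconds).
    p99 = (
        "histogram_quantile(0.99, "
        f"sum(rate(istio_request_duration_milliseconds_bucket[{window}]))"
        f" by (le, {group_by}))"
    )
    return {
        "request_rate": total,
        "error_ratio": f"{errors} / {total}",
        "p99_latency_ms": p99,
    }


if __name__ == "__main__":
    for panel, query in golden_signal_queries().items():
        print(f"{panel}: {query}")
```

Each string can be pasted into a Grafana panel as-is; keeping the window and grouping label as parameters makes it easy to keep all three panels consistent.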

Using "service-mesh-observability". How do I check which services are making the most requests?

Expected outcome:

Use Linkerd viz top command: linkerd viz top deploy/my-app
Or query Prometheus: sum(rate(istio_requests_total[5m])) by (destination_service_name)
Check the Grafana Istio dashboards for visualized request rates by service.
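As a rough sketch of the Prometheus route, the query above can be sent to the Prometheus HTTP API (/api/v1/query) and the returned vector sorted by value. The base URL, the function names, and the SAMPLE payload below are illustrative assumptions; the payload only mimics the shape of an instant-query response.

```python
from urllib.parse import urlencode

QUERY = "sum(rate(istio_requests_total[5m])) by (destination_service_name)"


def query_url(base_url: str, query: str = QUERY) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


def rank_services(response: dict) -> list[tuple[str, float]]:
    """Sort services by request rate, busiest first."""
    rows = []
    for sample in response["data"]["result"]:
        name = sample["metric"].get("destination_service_name", "unknown")
        # Each instant-vector sample is [unix_timestamp, "value-as-string"].
        _timestamp, value = sample["value"]
        rows.append((name, float(value)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical response body, shaped like an instant-query result.
SAMPLE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"destination_service_name": "cart"}, "value": [1700000000, "12.0"]},
            {"metric": {"destination_service_name": "checkout"}, "value": [1700000000, "42.5"]},
        ],
    },
}

if __name__ == "__main__":
    print(query_url("http://prometheus.example:9090"))  # placeholder host
    for service, rate in rank_services(SAMPLE):
        print(f"{service}: {rate:.1f} req/s")
```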

Using "service-mesh-observability". Set up alerting for high error rates.

Expected outcome:

Create a PrometheusRule with expression: sum(rate(istio_requests_total{response_code=~"5.."}[5m])) by (destination_service_name) / sum(rate(istio_requests_total[5m])) by (destination_service_name) > 0.05
Set for: 5m so the alert fires only after the condition has held for five minutes, which avoids flapping.
Label with severity: critical and include service name in summary.
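Put together, the rule described above might look like the following manifest, built here as a Python dictionary and printed as JSON (which kubectl also accepts). This is a sketch that assumes the Prometheus Operator's PrometheusRule CRD is installed; the rule name, namespace, group name, and alert name are placeholders rather than values from the skill.

```python
import json

# Error-ratio expression from the expected outcome above.
EXPR = (
    'sum(rate(istio_requests_total{response_code=~"5.."}[5m])) by (destination_service_name)'
    " / sum(rate(istio_requests_total[5m])) by (destination_service_name) > 0.05"
)


def error_rate_rule(name: str = "mesh-error-rate", namespace: str = "monitoring") -> dict:
    """Return a PrometheusRule manifest for a 5xx error-ratio alert.

    The name and namespace defaults are placeholders.
    """
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "groups": [
                {
                    "name": "mesh.rules",  # placeholder group name
                    "rules": [
                        {
                            "alert": "MeshHighErrorRate",  # placeholder alert name
                            "expr": EXPR,
                            "for": "5m",  # condition must hold for five minutes
                            "labels": {"severity": "critical"},
                            "annotations": {
                                # Prometheus fills in the label value when the alert fires.
                                "summary": "High 5xx rate for {{ $labels.destination_service_name }}",
                            },
                        }
                    ],
                }
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(error_rate_rule(), indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f once the names match your cluster's conventions.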

Security Audit

Safe

v4 • 1/17/2026

Pure documentation skill containing YAML templates, PromQL queries, and CLI examples for service mesh observability. All static findings are false positives: the scanner misinterpreted PromQL metric names (containing 'md5', 'sha' substrings) as weak crypto, flagged documentation links as network IOCs, and misidentified YAML field names as path traversal. The content is static documentation that matches its stated purpose exactly.

Files scanned

579

Lines analyzed

findings

Total audits

Risk Factors

🌐 Network access (12)

skill-report.json:6 SKILL.md:259 SKILL.md:261 SKILL.md:263 SKILL.md:380 SKILL.md:381 SKILL.md:382 SKILL.md:383 SKILL.md:280 SKILL.md:282 SKILL.md:284 SKILL.md:296

⚙️ External commands (17)

SKILL.md:23-34 SKILL.md:34-49 SKILL.md:49-85 SKILL.md:85-89 SKILL.md:89-109 SKILL.md:109-113 SKILL.md:113-156 SKILL.md:156-160 SKILL.md:160-179 SKILL.md:179-183 SKILL.md:183-240 SKILL.md:240-244 SKILL.md:244-264 SKILL.md:264-268 SKILL.md:268-320 SKILL.md:320-324 SKILL.md:324-361

📁 Filesystem access (1)

SKILL.md:203

Audited by: claude

Quality Score

Architecture

100

Maintainability

Content

Community

100

Security

Spec Compliance

What You Can Build

Stand up mesh monitoring

Use templates to wire Prometheus, Grafana, and tracing for a new service mesh.

Investigate latency spikes

Apply PromQL queries and tracing setup to locate high-latency services.

Define mesh SLOs

Use golden signal guidance to frame SLOs and alert rules for services.

Try These Prompts

Quick start

List the minimal steps and templates to enable Istio metrics and tracing in a new cluster.

Dashboards

Provide the key PromQL queries for request rate, error rate, and P99 latency by service.

Tracing rollout

Give an IstioOperator and Jaeger deployment example for distributed tracing.

Full stack

Combine Prometheus, Grafana, Jaeger, Kiali, and OTel templates into a staged rollout plan.

Best Practices

Sample traces at a high rate in development and a lower rate in production to control costs.
Use consistent trace context propagation across all services.
Alert on golden signals with clear thresholds defined in PrometheusRule.
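Consistent propagation usually means each service copies the trace headers from the incoming request onto its outgoing calls, since sidecar proxies cannot correlate the two on their own. A minimal sketch follows, assuming B3 and W3C Trace Context headers are in use; the header list and the function name are illustrative, so check your mesh's documentation for the exact set it expects.

```python
# Header names assumed here: B3 (Zipkin-style) and W3C Trace Context,
# plus x-request-id. Verify the list against your mesh's documentation.
PROPAGATION_HEADERS = (
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "b3",
    "traceparent",
    "tracestate",
)


def propagation_headers(incoming: dict[str, str]) -> dict[str, str]:
    """Pick the trace headers to copy from an incoming request to outgoing calls.

    Header names are compared case-insensitively and returned in lower case.
    """
    wanted = set(PROPAGATION_HEADERS)
    return {
        name.lower(): value
        for name, value in incoming.items()
        if name.lower() in wanted
    }


if __name__ == "__main__":
    request_headers = {
        "X-B3-TraceId": "463ac35c9f6413ad48485a3953bb6124",
        "Authorization": "Bearer example",  # not forwarded by this helper
        "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    }
    print(propagation_headers(request_headers))
```

Applications instrumented with an OpenTelemetry SDK typically get this behavior from the SDK's propagators instead of a hand-written helper.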

Avoid

Collecting high-cardinality labels in Prometheus without limits.
Sampling 100 percent of traces in production by default.
Operating without dashboards for service dependencies and topology.

Frequently Asked Questions

Is this compatible with Istio and Linkerd?

Yes, it provides examples for both Istio and Linkerd observability workflows.

What are the limits of this skill?

It provides templates and guidance only. It does not deploy or validate configurations.

Can I integrate with OpenTelemetry?

Yes, it includes an OpenTelemetry Collector config and Istio Telemetry example.

Does it access my data or credentials?

No, it contains static documentation and does not access files or environment data.

What if my metrics are missing?

Verify that Prometheus scrape targets are healthy and that service labels and Istio telemetry settings are correct.
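One way to check scrape targets is to read the Prometheus targets endpoint (/api/v1/targets) and list anything whose health is not "up". The sketch below works on an already-fetched response; the function name and the SAMPLE_TARGETS payload are hypothetical and only mimic the shape of that endpoint's output.

```python
def unhealthy_targets(response: dict) -> list[dict]:
    """Return active scrape targets whose health is anything other than 'up'."""
    return [
        {
            "job": target.get("labels", {}).get("job", "unknown"),
            "scrapeUrl": target.get("scrapeUrl", ""),
            "lastError": target.get("lastError", ""),
        }
        for target in response["data"]["activeTargets"]
        if target.get("health") != "up"
    ]


# Hypothetical payload shaped like a /api/v1/targets response.
SAMPLE_TARGETS = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "istiod"},
                "scrapeUrl": "http://10.0.0.5:15014/metrics",
                "health": "up",
                "lastError": "",
            },
            {
                "labels": {"job": "envoy-stats"},
                "scrapeUrl": "http://10.0.0.9:15090/stats/prometheus",
                "health": "down",
                "lastError": "connection refused",
            },
        ],
        "droppedTargets": [],
    },
}

if __name__ == "__main__":
    for target in unhealthy_targets(SAMPLE_TARGETS):
        print(f"{target['job']} {target['scrapeUrl']}: {target['lastError']}")
```

An empty result suggests scraping is healthy, in which case the service labels and Istio telemetry settings are the next things to inspect.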

How does it compare to vendor tools?

It is vendor neutral and uses common open source components and queries.

Developer Details

Author

wshobson

License

MIT

Repository

https://github.com/wshobson/agents/tree/main/plugins/cloud-infrastructure/skills/service-mesh-observability

Ref

main

File structure

📄 SKILL.md