service-mesh-observability

Safe · 🌐 Network access · ⚙️ External commands · 📁 Filesystem access

Set up service mesh observability fast

Service mesh telemetry is hard to wire across traces, metrics, and dashboards. This skill provides ready templates and queries for Istio and Linkerd observability.

Supports: Claude, Codex, Code (CC)
📊 69 Adequate
1. Download the skill ZIP.
2. Upload it in Claude: go to Settings → Capabilities → Skills → Upload skill.
3. Toggle it on and start using it.

Test it

Using "service-mesh-observability". Give me a concise checklist to enable Istio observability with Prometheus and Jaeger.

Expected outcome:

  • Deploy Prometheus with the Istio scrape config and ServiceMonitor.
  • Enable tracing in Istio and point its Zipkin tracer address at the Jaeger collector.
  • Install Jaeger all-in-one and expose the UI port.
  • Add PromQL panels for request rate, errors, and P99 latency.
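
The tracing step above could look roughly like the following IstioOperator fragment. This is a minimal sketch, not the skill's own template: the resource name, the `observability` namespace, and the `jaeger-collector` Service address are assumptions, and it presumes the Jaeger collector has its Zipkin-compatible endpoint enabled on port 9411.

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: mesh-tracing            # hypothetical name
  namespace: istio-system
spec:
  meshConfig:
    enableTracing: true
    defaultConfig:
      tracing:
        # Percentage of requests to trace; 100 suits dev, use a lower value in production.
        sampling: 100.0
        zipkin:
          # Assumed Jaeger collector Service and Zipkin-compatible port.
          address: jaeger-collector.observability.svc.cluster.local:9411
```

Adjust the address to wherever the Jaeger collector actually runs in your cluster.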

Using "service-mesh-observability". How do I check which services are making the most requests?

Expected outcome:

  • Use the Linkerd viz top command: linkerd viz top deploy/my-app
  • Or query Prometheus: sum(rate(istio_requests_total[5m])) by (destination_service_name)
  • Check the Grafana Istio dashboards for visualized request rates by service.

Using "service-mesh-observability". Set up alerting for high error rates.

Expected outcome:

  • Create a PrometheusRule with expression: sum(rate(istio_requests_total{response_code=~"5.."}[5m])) by (destination_service_name) / sum(rate(istio_requests_total[5m])) by (destination_service_name) > 0.05
  • Set for: 5m so the alert fires only after the condition has held for five minutes, which avoids flapping.
  • Label with severity: critical and include service name in summary.
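
Put together, the three points above might be expressed as a PrometheusRule like the one below. It is a sketch under assumptions: the rule and alert names, the `monitoring` namespace, and the `release: prometheus` label are illustrative, and the label must match whatever rule selector your Prometheus Operator installation uses.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mesh-error-rate         # hypothetical name
  namespace: monitoring         # assumed namespace
  labels:
    release: prometheus         # assumed; must match the Prometheus ruleSelector
spec:
  groups:
    - name: mesh.errors
      rules:
        - alert: MeshHighErrorRate   # hypothetical alert name
          # Ratio of 5xx responses to all responses, per destination service.
          expr: |
            sum(rate(istio_requests_total{response_code=~"5.."}[5m])) by (destination_service_name)
              /
            sum(rate(istio_requests_total[5m])) by (destination_service_name)
              > 0.05
          for: 5m                    # condition must hold for 5 minutes before firing
          labels:
            severity: critical
          annotations:
            summary: "High 5xx error rate for {{ $labels.destination_service_name }}"
```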

Security Audit

Safe
v4 • 1/17/2026

Pure documentation skill containing YAML templates, PromQL queries, and CLI examples for service mesh observability. All static findings are false positives: the scanner misinterpreted PromQL metric names (containing 'md5', 'sha' substrings) as weak crypto, flagged documentation links as network IOCs, and misidentified YAML field names as path traversal. The content is static documentation that matches its stated purpose exactly.

Files scanned: 2
Lines analyzed: 579
Findings: 3
Total audits: 4
Audited by: claude

Quality Score

  • Architecture: 38
  • Maintainability: 100
  • Content: 85
  • Community: 21
  • Security: 100
  • Spec Compliance: 91

What You Can Build

Stand up mesh monitoring

Use templates to wire Prometheus, Grafana, and tracing for a new service mesh.

Investigate latency spikes

Apply PromQL queries and tracing setup to locate high latency services.

Define mesh SLOs

Use golden signal guidance to frame SLOs and alert rules for services.

Try These Prompts

Quick start
List the minimal steps and templates to enable Istio metrics and tracing in a new cluster.
Dashboards
Provide the key PromQL queries for request rate, error rate, and P99 latency by service.
Tracing rollout
Give an IstioOperator and Jaeger deployment example for distributed tracing.
Full stack
Combine Prometheus, Grafana, Jaeger, Kiali, and OTel templates into a staged rollout plan.

Best Practices

  • Sample traces at a high rate in development and a lower rate in production to control cost.
  • Use consistent trace context propagation across all services.
  • Alert on golden signals with clear thresholds defined in PrometheusRule.

Avoid

  • Collecting high-cardinality labels in Prometheus without limits.
  • Tracing 100 percent of requests in production by default.
  • Operating without dashboards for service dependencies and topology.

Frequently Asked Questions

Is this compatible with Istio and Linkerd?
Yes, it provides examples for both Istio and Linkerd observability workflows.
What are the limits of this skill?
It provides templates and guidance only. It does not deploy or validate configurations.
Can I integrate with OpenTelemetry?
Yes, it includes an OpenTelemetry Collector config and Istio Telemetry example.
Does it access my data or credentials?
No, it contains static documentation and does not access files or environment data.
What if my metrics are missing?
Verify Prometheus scrape targets, service labels, and Istio telemetry settings are correct.
How does it compare to vendor tools?
It is vendor neutral and uses common open source components and queries.