Set up comprehensive monitoring, tracing, and alerting for your service mesh deployments. Get ready-to-use configurations for Istio, Linkerd, Prometheus, Grafana, and Jaeger.
Download the skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Enable it and start using it
Test it
Using "service-mesh-observability". Generate Prometheus config for Istio metrics
Expected result:
YAML ServiceMonitor with scrape configs targeting istiod endpoints, 15 second intervals, and relabel rules for mesh discovery.
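For orientation, a ServiceMonitor of the kind described might look roughly like the sketch below. It assumes the Prometheus Operator CRDs are installed, that istiod runs in `istio-system` with the label `istio: pilot`, and that its Service exposes a port named `http-monitoring`, as in a default Istio install. The resource name and the relabel rule are illustrative choices, not output taken from the skill. This covers the istiod control plane only; Envoy sidecar metrics are usually scraped through a separate PodMonitor.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: istiod-monitor          # hypothetical name
  namespace: istio-system       # assumes a default Istio install
spec:
  selector:
    matchLabels:
      istio: pilot              # label carried by the istiod Service by default
  namespaceSelector:
    matchNames:
      - istio-system
  endpoints:
    - port: http-monitoring     # istiod's metrics port name in a default install
      interval: 15s             # the 15 second scrape interval from the prompt
      relabelings:
        # Illustrative relabel rule: copy the node name onto each scraped series.
        - sourceLabels: [__meta_kubernetes_pod_node_name]
          targetLabel: node
```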
Using "service-mesh-observability". Create alert for high latency
Expected result:
PrometheusRule with histogram_quantile expression for P99 latency threshold, 5 minute evaluation window, and warning severity annotation.
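A rule matching that description could be sketched as follows. It assumes the Prometheus Operator CRDs and Istio's standard `istio_request_duration_milliseconds_bucket` histogram. The rule name, group name, and the 1000 ms threshold are assumptions for illustration, and some installs also require extra labels on the PrometheusRule so that Prometheus selects it.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mesh-latency-alerts     # hypothetical name
  namespace: istio-system
spec:
  groups:
    - name: mesh.latency        # hypothetical group name
      rules:
        - alert: MeshHighP99Latency
          # P99 request duration per destination service over a 5 minute window.
          expr: |
            histogram_quantile(
              0.99,
              sum(rate(istio_request_duration_milliseconds_bucket{reporter="destination"}[5m]))
                by (le, destination_service_name)
            ) > 1000
          for: 5m               # condition must hold for 5 minutes before firing
          labels:
            severity: warning
          annotations:
            summary: "P99 latency above 1s for {{ $labels.destination_service_name }}"
```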
Security audit
Safe
This skill is a documentation-only guide for service mesh observability. Static analysis flagged 55 patterns, but all are false positives: backtick commands are markdown code blocks (not execution), hardcoded URLs/IPs are configuration examples, and crypto warnings are triggered by YAML config snippets. No actual code execution, network calls, or filesystem operations occur.
Quality score
What you can build
Platform Engineer Setting Up Mesh Monitoring
Deploy complete observability stack for new Istio installation with Prometheus, Grafana, and Jaeger integration.
SRE Debugging Production Latency Issues
Query distributed traces to identify bottlenecks across microservices and set up P99 latency alerts.
DevOps Team Implementing SLOs
Define and monitor service level objectives for mesh traffic with automated alerting on error rate thresholds.
Try these prompts
Generate a Prometheus ServiceMonitor configuration for scraping Istio mesh metrics with 15 second intervals.
Create a Jaeger deployment manifest for Istio tracing with 100 percent sampling for development environments.
Build a Grafana dashboard JSON with panels for request rate, error rate, P99 latency, and service topology for Istio.
Write PrometheusRule alerts for high error rate above 5 percent and P99 latency over 1 second with appropriate severity labels.
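As a rough idea of what the error-rate half of that last prompt might yield, the sketch below computes the share of 5xx responses from Istio's standard `istio_requests_total` counter. The resource names, the 5 minute window, and the `critical` severity are assumptions, not values prescribed by the skill.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mesh-error-rate-alerts  # hypothetical name
  namespace: istio-system
spec:
  groups:
    - name: mesh.errors         # hypothetical group name
      rules:
        - alert: MeshHighErrorRate
          # Fraction of requests answered with a 5xx code, per destination service.
          expr: |
            sum(rate(istio_requests_total{reporter="destination", response_code=~"5.."}[5m]))
              by (destination_service_name)
            /
            sum(rate(istio_requests_total{reporter="destination"}[5m]))
              by (destination_service_name)
            > 0.05
          for: 5m
          labels:
            severity: critical  # assumed severity; adjust to your paging policy
          annotations:
            summary: "5xx error rate above 5% for {{ $labels.destination_service_name }}"
```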
Best practices
- Sample traces at 100 percent in development but reduce to 1-10 percent in production to control storage costs
- Configure alerting on golden signals: latency, traffic, errors, and saturation with appropriate thresholds
- Use trace context propagation consistently across all services for complete request visibility
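The sampling guidance above can be expressed with Istio's Telemetry API, sketched here for a production mesh at 1 percent. It assumes a tracing provider named `jaeger` has already been registered under `extensionProviders` in the mesh config; that provider name is hypothetical. Newer Istio releases also serve this resource as `telemetry.istio.io/v1`.

```yaml
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system       # root namespace, so the setting applies mesh-wide
spec:
  tracing:
    - providers:
        - name: jaeger          # hypothetical provider name from the mesh config
      # 1 percent for production; use 100.0 in development environments.
      randomSamplingPercentage: 1.0
```

Sampling only decides which requests are traced. Each service still has to forward the incoming trace headers on its outbound calls for spans to join into a single trace.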
Avoid
- Over-sampling traces in production leading to excessive storage costs and performance overhead
- Ignoring metric cardinality limits causing Prometheus memory issues and slow queries
- Deploying observability tools without dashboards or alerts that provide actionable insights
Frequently asked questions
What service meshes does this skill support?
Does this skill deploy resources to my cluster?
What sampling rate should I use for tracing?
Can I use this with managed service meshes?
How do I correlate metrics with traces?
What are the golden signals for service mesh?
Developer details
Author
sickn33
License
MIT
Repository
https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/service-mesh-observability
Ref
main
File structure
📄 SKILL.md