Set up comprehensive monitoring, tracing, and alerting for your service mesh deployments. Get ready-to-use configurations for Istio, Linkerd, Prometheus, Grafana, and Jaeger.
Download skill ZIP
Upload in Claude
Go to Settings → Capabilities → Skills → Upload skill
Open and start using
Test it
Using "service-mesh-observability". Generate Prometheus config for Istio metrics
Expected result:
YAML ServiceMonitor with scrape configs targeting istiod endpoints, 15 second intervals, and relabel rules for mesh discovery.
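A minimal sketch of such a ServiceMonitor, assuming the Prometheus Operator CRDs are installed; the resource name and label selector are illustrative assumptions, and the port name matches istiod's standard metrics port:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: istiod-monitor        # illustrative name
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: pilot            # matches the istiod Service's labels
  endpoints:
    - port: http-monitoring   # istiod exposes metrics on this port (15014)
      interval: 15s
      relabelings:
        - sourceLabels: [__meta_kubernetes_namespace]
          targetLabel: namespace  # relabel rule for mesh discovery
```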
Using "service-mesh-observability". Create alert for high latency
Expected result:
PrometheusRule with histogram_quantile expression for P99 latency threshold, 5 minute evaluation window, and warning severity annotation.
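A hedged sketch of the kind of PrometheusRule described above; the rule name, threshold, and service grouping label are assumptions, while the metric name follows Istio's standard request-duration histogram:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mesh-latency-alerts   # illustrative name
  namespace: istio-system
spec:
  groups:
    - name: latency
      rules:
        - alert: HighP99Latency
          expr: |
            histogram_quantile(0.99,
              sum(rate(istio_request_duration_milliseconds_bucket[5m]))
                by (le, destination_service)
            ) > 1000
          for: 5m                 # 5 minute evaluation window
          labels:
            severity: warning
          annotations:
            summary: "P99 latency above 1s for {{ $labels.destination_service }}"
```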
Security audit
This skill is a documentation-only guide for service mesh observability. Static analysis flagged 55 patterns, but all are false positives: backtick commands are markdown code blocks (not execution), hardcoded URLs/IPs are configuration examples, and crypto warnings are triggered by YAML config snippets. No actual code execution, network calls, or filesystem operations occur.
Quality score
What you can build
Platform Engineer Setting Up Mesh Monitoring
Deploy complete observability stack for new Istio installation with Prometheus, Grafana, and Jaeger integration.
SRE Debugging Production Latency Issues
Query distributed traces to identify bottlenecks across microservices and set up P99 latency alerts.
DevOps Team Implementing SLOs
Define and monitor service level objectives for mesh traffic with automated alerting on error rate thresholds.
Try these prompts
Generate a Prometheus ServiceMonitor configuration for scraping Istio mesh metrics with 15 second intervals.
Create a Jaeger deployment manifest for Istio tracing with 100 percent sampling for development environments.
Build a Grafana dashboard JSON with panels for request rate, error rate, P99 latency, and service topology for Istio.
Write PrometheusRule alerts for high error rate above 5 percent and P99 latency over 1 second with appropriate severity labels.
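For the tracing prompt above, full sampling in development can be configured without a full Jaeger manifest; a sketch using Istio's Telemetry API (available in Istio 1.12+; the resource name is an assumption):

```yaml
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-tracing          # illustrative name; mesh-wide when placed in the root namespace
  namespace: istio-system
spec:
  tracing:
    - randomSamplingPercentage: 100.0  # development only; reduce to 1-10 in production
```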
Best practices
- Sample traces at 100 percent in development but reduce to 1-10 percent in production to control storage costs
- Configure alerting on golden signals: latency, traffic, errors, and saturation with appropriate thresholds
- Use trace context propagation consistently across all services for complete request visibility
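To illustrate the golden-signals point, an error-rate expression like the following could back an alert on the errors signal; the 5% threshold and the `destination_service` grouping are assumptions, while `istio_requests_total` is Istio's standard request-count metric:

```yaml
# Fraction of requests returning 5xx per destination service over 5 minutes,
# alerting when it exceeds 5 percent
expr: |
  sum(rate(istio_requests_total{response_code=~"5.."}[5m])) by (destination_service)
    /
  sum(rate(istio_requests_total[5m])) by (destination_service)
  > 0.05
```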
Avoid
- Over-sampling traces in production leading to excessive storage costs and performance overhead
- Ignoring metric cardinality limits causing Prometheus memory issues and slow queries
- Deploying observability tools without dashboards or alerts that provide actionable insights
Frequently asked questions
What service meshes does this skill support?
Does this skill deploy resources to my cluster?
What sampling rate should I use for tracing?
Can I use this with managed service meshes?
How do I correlate metrics with traces?
What are the golden signals for service mesh?
Developer details
Author
sickn33
License
MIT
Repository
https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/service-mesh-observability
Ref
main
File structure
📄 SKILL.md