datadog-automation

Safe · 🌐 Network access · ⚙️ External commands

Automate Datadog Monitoring and Observability Tasks

Managing Datadog monitoring operations manually is time-consuming and error-prone. This skill automates metrics queries, log searches, monitor management, and dashboard operations through Rube MCP integration.

Supports: Claude, Codex, Code (CC)
📊 71 Adequate

1. Download the skill ZIP
2. Upload in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Toggle on and start using

Test it

Using "datadog-automation". Query CPU usage for web01 last 5 minutes

Expected outcome:

Retrieved 60 data points showing average CPU usage ranging from 12% to 45%, with current value at 23%. No anomalies detected in the time series.

Using "datadog-automation". Search error logs for payment service

Expected outcome:

Found 127 error logs matching criteria. Top errors: ConnectionTimeout (45%), DatabaseError (30%), ValidationError (25%). Most recent error occurred 2 minutes ago.

Security Audit

Safe
v1 • 2/24/2026

This skill is documentation-only (SKILL.md) describing workflows for Datadog automation via Rube MCP. All 116 static analysis findings are false positives: backtick detections are Markdown code formatting, not shell execution. The hardcoded URL is documentation for MCP server setup. No executable code present. Network and external command risks are managed through the Rube MCP intermediary service with user-authenticated Datadog connections.

  • Files scanned: 1
  • Lines analyzed: 241
  • Findings: 3
  • Total audits: 1

Low Risk Issues (1)
Documentation References External Service
Skill documentation references external MCP server endpoint (rube.app) and Datadog API. Users must authenticate separately through official channels.

Risk Factors

🌐 Network access (1)
⚙️ External commands (1)
Audited by: claude

Quality Score

  • Architecture: 38
  • Maintainability: 100
  • Content: 87
  • Community: 32
  • Security: 100
  • Spec Compliance: 91

What You Can Build

DevOps Engineer Incident Response

Quickly query error logs and metrics during incidents, create monitors for new failure patterns, and mute alerts during planned maintenance windows.

SRE Dashboard Management

Create and maintain service health dashboards, set up alert monitors with appropriate thresholds, and manage downtime schedules for deployments.

Platform Team Observability Setup

Automate initial monitoring setup for new services including metric queries, log indexes, baseline monitors, and team dashboards.

Try These Prompts

Basic Metrics Query
Query the average CPU usage for host web01 over the last 5 minutes using Datadog metrics.

Log Error Analysis
Search for all error logs from the payment service in the last hour, sorted by most recent first, limit to 50 entries.

Monitor Creation
Create a metric alert monitor named 'High Memory Usage' that triggers when avg memory usage exceeds 85% on production hosts. Send notifications to the ops-slack channel.

Maintenance Downtime
Schedule a downtime for all hosts with tag env:staging from 2am to 4am UTC tomorrow with the message 'Scheduled deployment maintenance window'.

Best Practices

  • Always call RUBE_SEARCH_TOOLS first to get current tool schemas before executing workflows
  • Use specific tag filters in queries to reduce result noise and improve performance
  • Set explicit end times for downtimes to avoid indefinite alert suppression
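
To illustrate the last point, here is a minimal Python sketch that computes an explicit start and end for a "2am to 4am UTC tomorrow" window as Unix epoch seconds. The helper name `downtime_window` and the payload field names (`scope`, `start`, `end`, `message`) are assumptions for illustration only; check the schema returned by RUBE_SEARCH_TOOLS for the actual parameter names.

```python
from datetime import datetime, timedelta, timezone


def downtime_window(now: datetime, start_hour: int, end_hour: int) -> tuple[int, int]:
    """Return (start, end) in Unix epoch seconds for tomorrow's UTC window."""
    tomorrow = (now.astimezone(timezone.utc) + timedelta(days=1)).date()
    start = datetime(tomorrow.year, tomorrow.month, tomorrow.day,
                     start_hour, tzinfo=timezone.utc)
    end = datetime(tomorrow.year, tomorrow.month, tomorrow.day,
                   end_hour, tzinfo=timezone.utc)
    if end <= start:
        # Refuse to build a window with no usable end time.
        raise ValueError("end must be after start")
    return int(start.timestamp()), int(end.timestamp())


# Fixed "now" so the example is reproducible.
now = datetime(2026, 2, 24, 15, 30, tzinfo=timezone.utc)
start, end = downtime_window(now, start_hour=2, end_hour=4)

# Hypothetical payload shape; field names are assumptions, not a confirmed schema.
payload = {
    "scope": "env:staging",
    "start": start,
    "end": end,  # always set, so the downtime cannot run indefinitely
    "message": "Scheduled deployment maintenance window",
}
```

Raising on a missing or inverted end time makes an open-ended downtime a visible error rather than a silent default.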

Avoid

  • Do not create monitors without defining clear alert thresholds and notification messages
  • Avoid querying overly broad time ranges that exceed Datadog retention limits
  • Do not delete dashboards without confirming backup of widget configurations

Frequently Asked Questions

How do I connect my Datadog account to this skill?
Use the RUBE_MANAGE_CONNECTIONS tool with toolkit 'datadog'. If not connected, follow the returned authentication link to complete Datadog OAuth. Confirm status shows ACTIVE before running workflows.
What timestamp format should I use for queries?
Most endpoints use Unix epoch seconds (not milliseconds). Some endpoints accept ISO 8601 format. Check the tool schema for each endpoint's requirements.
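
A short Python sketch of the timestamp handling described above. The helper names are hypothetical, and the millisecond check is a heuristic assumption (values above 10^11 are treated as milliseconds), not something the skill itself specifies.

```python
import time
from datetime import datetime, timezone


def last_minutes_window(minutes, now=None):
    """Return (from_ts, to_ts) in epoch seconds covering the last N minutes."""
    end = int(now if now is not None else time.time())
    return end - minutes * 60, end


def to_epoch_seconds(value):
    """Normalize a numeric timestamp to epoch seconds.

    Heuristic (an assumption): anything above 10**11 is taken to be
    milliseconds, since 10**11 seconds lies thousands of years ahead.
    """
    value = int(value)
    return value // 1000 if value > 10**11 else value


def iso_to_epoch_seconds(text):
    """Convert an ISO 8601 string with an explicit UTC offset to epoch seconds."""
    return int(datetime.fromisoformat(text).timestamp())


window = last_minutes_window(5, now=1_700_000_000)
from_js_clock = to_epoch_seconds(1_700_000_000_000)  # e.g. a JavaScript Date.now() value
from_iso = iso_to_epoch_seconds("2026-02-24T00:00:00+00:00")
```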
Can I create custom monitors with this skill?
Yes, you can create metric alerts, log alerts, service checks, and query alerts. Ensure the monitor type matches your query type to avoid creation failures.
How do I handle pagination for large result sets?
Use the page and page_size parameters or offset-based pagination depending on the endpoint. Check the response for total count to determine if more pages exist.
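
The page-based variant might look like the following Python sketch. The `fetch_page` callable and the response keys `items` and `total` are assumptions standing in for whichever tool call and response shape the endpoint actually uses; a stub replaces the real call so the loop can be run on its own.

```python
def iter_pages(fetch_page, page_size=50):
    """Yield every item by requesting pages until the total count is reached."""
    page = 0
    seen = 0
    while True:
        resp = fetch_page(page, page_size)
        items = resp["items"]
        yield from items
        seen += len(items)
        # Stop on an empty page or once the reported total has been collected.
        if not items or seen >= resp["total"]:
            break
        page += 1


# Stub standing in for a real paginated endpoint (hypothetical response shape).
DATA = list(range(120))


def fake_fetch(page, page_size):
    start = page * page_size
    return {"items": DATA[start:start + page_size], "total": len(DATA)}


results = list(iter_pages(fake_fetch, page_size=50))
```

The empty-page check guards against an endless loop if the reported total is ever larger than what the endpoint returns.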
What happens if I hit Datadog rate limits?
Implement exponential backoff on 429 responses. Batch operations where possible and use specific filters to reduce API call volume.
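
A minimal Python sketch of exponential backoff, assuming the caller wraps each request in a function that raises a `RateLimited` exception on an HTTP 429. Both `RateLimited` and `call_with_backoff` are hypothetical names, and the delay values are illustrative defaults rather than Datadog recommendations.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller's request wrapper when the API returns HTTP 429."""


def call_with_backoff(call, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Retry `call` on RateLimited, doubling the wait after each failure."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries; surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Small random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


# Demo: a call that is rate limited twice and then succeeds.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimited()
    return "ok"


waits = []  # record delays instead of actually sleeping
result = call_with_backoff(flaky_call, sleep=waits.append)
```

Passing `sleep` as a parameter keeps the retry logic testable without real delays.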
Can I mute a monitor without deleting it?
Yes, use the mute monitor function to temporarily silence notifications. The monitor continues evaluating but will not send alerts until unmuted.

Developer Details

File structure

📄 SKILL.md