datadog-automation

Name: datadog-automation
Author: sickn33

Safe • 🌐 Network access • ⚙️ External commands

Automate Datadog Monitoring and Observability Tasks

Managing Datadog monitoring operations manually is time-consuming and error-prone. This skill automates metrics queries, log searches, monitor management, and dashboard operations through Rube MCP integration.

Supports: Claude, Codex, Code (CC)

📊 71 Adequate

Download the skill ZIP

Upload in Claude

Go to Settings → Capabilities → Skills → Upload skill

Toggle on and start using

Test it

Using "datadog-automation": Query CPU usage for web01 over the last 5 minutes

Expected outcome:

Retrieved 60 data points showing average CPU usage ranging from 12% to 45%, with current value at 23%. No anomalies detected in the time series.

Using "datadog-automation": Search error logs for the payment service

Expected outcome:

Found 127 error logs matching criteria. Top errors: ConnectionTimeout (45%), DatabaseError (30%), ValidationError (25%). Most recent error occurred 2 minutes ago.

Security Audit

Safe

v1 • 2/24/2026

This skill is documentation-only (SKILL.md) describing workflows for Datadog automation via Rube MCP. All 116 static analysis findings are false positives: backtick detections are Markdown code formatting, not shell execution. The hardcoded URL is documentation for MCP server setup. No executable code present. Network and external command risks are managed through the Rube MCP intermediary service with user-authenticated Datadog connections.

Files scanned

241

Lines analyzed

findings

Total audits

Low Risk Issues (1)

SKILL.md:22

Documentation References External Service

Skill documentation references external MCP server endpoint (rube.app) and Datadog API. Users must authenticate separately through official channels.

Risk Factors

🌐 Network access (1)

SKILL.md:22

⚙️ External commands (1)

SKILL.md:17-18

Audited by: claude

Quality Score

Architecture

100

Maintainability

Content

Community

100

Security

Spec Compliance

What You Can Build

DevOps Engineer Incident Response

Quickly query error logs and metrics during incidents, create monitors for new failure patterns, and mute alerts during planned maintenance windows.

SRE Dashboard Management

Create and maintain service health dashboards, set up alert monitors with appropriate thresholds, and manage downtime schedules for deployments.

Platform Team Observability Setup

Automate initial monitoring setup for new services including metric queries, log indexes, baseline monitors, and team dashboards.

Try These Prompts

Basic Metrics Query

Query the average CPU usage for host web01 over the last 5 minutes using Datadog metrics.
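
As a rough illustration of what this prompt resolves to, the sketch below builds a Datadog-style metric query string with a tag filter. The helper name `build_metric_query` is hypothetical, and `system.cpu.user` is an assumed choice of CPU metric; the prompt itself only says "CPU usage".

```python
def build_metric_query(metric: str, tags: dict[str, str], aggregator: str = "avg") -> str:
    """Return a query string of the form 'avg:system.cpu.user{host:web01}'.

    Hypothetical helper: tags are sorted so the output is deterministic,
    and an empty tag set falls back to the '*' (all sources) scope.
    """
    scope = ",".join(f"{key}:{value}" for key, value in sorted(tags.items())) or "*"
    return f"{aggregator}:{metric}{{{scope}}}"


if __name__ == "__main__":
    # Assumed metric name; substitute the CPU metric your hosts actually report.
    print(build_metric_query("system.cpu.user", {"host": "web01"}))
```

Narrowing the scope to a single host tag keeps the returned series small, which matches the advice to use specific tag filters.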

Log Error Analysis

Search for all error logs from the payment service in the last hour, sorted by most recent first, limit to 50 entries.

Monitor Creation

Create a metric alert monitor named 'High Memory Usage' that triggers when avg memory usage exceeds 85% on production hosts. Send notifications to the ops-slack channel.
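
A request like this might translate into a payload along the following lines. This is a sketch only: the field names follow the general shape of a Datadog monitor definition (name, type, query, message, threshold options), but the metric name `custom.memory.used_pct`, the `@slack-ops-slack` handle, and the helper `build_metric_monitor` are all assumptions. The actual argument names expected by the Rube MCP tool should be taken from the schema returned by RUBE_SEARCH_TOOLS.

```python
def build_metric_monitor(
    name: str,
    metric: str,
    scope: str,
    threshold: float,
    notify_handle: str,
    window: str = "last_5m",
) -> dict:
    """Assemble a metric alert definition (hypothetical helper).

    The critical threshold is written into both the query and the options
    so the two cannot drift apart.
    """
    query = f"avg({window}):avg:{metric}{{{scope}}} > {threshold}"
    return {
        "name": name,
        "type": "metric alert",
        "query": query,
        "message": f"{name}: value above {threshold} on {scope}. {notify_handle}",
        "options": {"thresholds": {"critical": threshold}},
    }


if __name__ == "__main__":
    monitor = build_metric_monitor(
        name="High Memory Usage",
        metric="custom.memory.used_pct",   # assumed metric name
        scope="env:production",
        threshold=85,
        notify_handle="@slack-ops-slack",  # assumed notification handle
    )
    print(monitor["query"])
```

Keeping the threshold and the notification text in one definition addresses the warning against creating monitors without clear thresholds and messages.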

Maintenance Downtime

Schedule a downtime for all hosts with tag env:staging from 2am to 4am UTC tomorrow with the message 'Scheduled deployment maintenance window'.
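
The "2am to 4am UTC tomorrow" window in this prompt has to become concrete timestamps before a downtime can be scheduled. The sketch below computes them as Unix epoch seconds with an explicit end time. The helper `downtime_window` and the output keys (`scope`, `start`, `end`, `message`) are illustrative assumptions; check the tool schema for the real parameter names.

```python
from datetime import datetime, timedelta, timezone


def downtime_window(now: datetime, scope: str, message: str,
                    start_hour: int = 2, end_hour: int = 4) -> dict:
    """Return a downtime spec for tomorrow (UTC) between start_hour and end_hour.

    `now` must be timezone-aware. Hypothetical helper; key names are assumed.
    """
    tomorrow = (now.astimezone(timezone.utc) + timedelta(days=1)).date()
    start = datetime(tomorrow.year, tomorrow.month, tomorrow.day, start_hour, tzinfo=timezone.utc)
    end = datetime(tomorrow.year, tomorrow.month, tomorrow.day, end_hour, tzinfo=timezone.utc)
    return {
        "scope": scope,
        "start": int(start.timestamp()),  # epoch seconds
        "end": int(end.timestamp()),      # explicit end avoids indefinite suppression
        "message": message,
    }


if __name__ == "__main__":
    spec = downtime_window(
        datetime.now(timezone.utc),
        scope="env:staging",
        message="Scheduled deployment maintenance window",
    )
    print(spec)
```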

Best Practices

Always call RUBE_SEARCH_TOOLS first to get current tool schemas before executing workflows
Use specific tag filters in queries to reduce result noise and improve performance
Set explicit end times for downtimes to avoid indefinite alert suppression

Avoid

Do not create monitors without defining clear alert thresholds and notification messages
Avoid querying overly broad time ranges that exceed Datadog retention limits
Do not delete dashboards without confirming backup of widget configurations

Frequently Asked Questions

How do I connect my Datadog account to this skill?

Use the RUBE_MANAGE_CONNECTIONS tool with toolkit 'datadog'. If not connected, follow the returned authentication link to complete Datadog OAuth. Confirm status shows ACTIVE before running workflows.

What timestamp format should I use for queries?

Most endpoints use Unix epoch seconds (not milliseconds). Some endpoints accept ISO 8601 format. Check the tool schema for each endpoint's requirements.
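
The two formats mentioned here can be produced from the same timezone-aware datetime. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and which format a given endpoint expects still has to be confirmed from its schema.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_seconds(dt: datetime) -> int:
    """Convert an aware datetime to Unix epoch seconds (not milliseconds)."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(dt.timestamp())


def to_iso8601(dt: datetime) -> str:
    """Convert an aware datetime to an ISO 8601 UTC string."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def last_minutes_window(minutes: int, now: datetime) -> tuple[int, int]:
    """Return (from, to) epoch seconds covering the last `minutes` minutes."""
    return to_epoch_seconds(now - timedelta(minutes=minutes)), to_epoch_seconds(now)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(last_minutes_window(5, now))
    print(to_iso8601(now))
```

Rejecting naive datetimes up front avoids silently interpreting a value in the local timezone.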

Can I create custom monitors with this skill?

Yes, you can create metric alerts, log alerts, service checks, and query alerts. Ensure the monitor type matches your query type to avoid creation failures.

How do I handle pagination for large result sets?

Use the page and page_size parameters or offset-based pagination depending on the endpoint. Check the response for total count to determine if more pages exist.
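
Page-based pagination of this kind can be sketched as a generator that keeps requesting pages until the total count is reached or a page comes back empty. The `fetch_page` callable and the response keys `items` and `total` are assumptions made for illustration; real endpoints may name these fields differently or use offsets instead.

```python
from typing import Callable, Iterator


def iter_pages(fetch_page: Callable[[int, int], dict], page_size: int = 50) -> Iterator[dict]:
    """Yield every item across pages.

    `fetch_page(page, page_size)` is a hypothetical callable returning a dict
    with an 'items' list and, optionally, a 'total' count.
    """
    page = 0
    seen = 0
    while True:
        response = fetch_page(page, page_size)
        items = response.get("items", [])
        yield from items
        seen += len(items)
        total = response.get("total")
        # Stop on an empty page, or once the reported total has been consumed.
        if not items or (total is not None and seen >= total):
            break
        page += 1


if __name__ == "__main__":
    data = [{"id": i} for i in range(120)]

    def fake_fetch(page: int, size: int) -> dict:
        return {"items": data[page * size:(page + 1) * size], "total": len(data)}

    print(sum(1 for _ in iter_pages(fake_fetch)))
```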

What happens if I hit Datadog rate limits?

Implement exponential backoff on 429 responses. Batch operations where possible and use specific filters to reduce API call volume.
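
Exponential backoff on 429 responses can be sketched as a retry wrapper whose delay doubles after each rate-limited attempt, up to a cap. The `call` argument is a hypothetical zero-argument function returning a `(status_code, body)` tuple; how the status is surfaced through the MCP tool is an assumption here.

```python
import random
import time
from typing import Any, Callable


def call_with_backoff(
    call: Callable[[], tuple[int, Any]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, Any]:
    """Retry `call` while it returns HTTP 429, doubling the wait each time."""
    for attempt in range(max_retries + 1):
        status, body = call()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Small random jitter so parallel clients do not retry in lockstep.
        sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError(f"still rate limited after {max_retries} retries")


if __name__ == "__main__":
    responses = iter([(429, None), (429, None), (200, {"ok": True})])
    print(call_with_backoff(lambda: next(responses), base_delay=0.01))
```

Passing `sleep` as a parameter keeps the wrapper testable without real waiting.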

Can I mute a monitor without deleting it?

Yes, use the mute monitor function to temporarily silence notifications. The monitor continues evaluating but will not send alerts until unmuted.

Developer Details

Author

sickn33

License

MIT

Repository

https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/datadog-automation

Ref

main

File structure

📄 SKILL.md