azure-ai-contentsafety-ts

Low Risk · 🌐 Network access · 🔑 Environment variables

Moderate harmful content with Azure AI Content Safety

Protect your platform from harmful user-generated content including hate speech, violence, sexual content, and self-harm material. This skill integrates Azure AI Content Safety REST API to automatically analyze and flag inappropriate text and images with configurable severity thresholds.
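As a sketch of what such an integration might look like, the REST `text:analyze` endpoint can be called directly with `fetch` (Node 18+). Everything here is illustrative: the `CONTENT_SAFETY_ENDPOINT`/`CONTENT_SAFETY_KEY` variable names, the `2023-10-01` api-version, and the response shape follow the public REST API, not necessarily this skill's actual code.

```typescript
interface CategoryResult {
  category: string; // "Hate" | "Sexual" | "Violence" | "SelfHarm"
  severity: number; // 0 (safe) .. 7 (high)
}

interface AnalyzeTextResponse {
  categoriesAnalysis: CategoryResult[];
}

// Hypothetical sketch: call text:analyze with an API key from the environment.
async function analyzeText(text: string): Promise<AnalyzeTextResponse> {
  const endpoint = process.env.CONTENT_SAFETY_ENDPOINT!;
  const res = await fetch(
    `${endpoint}/contentsafety/text:analyze?api-version=2023-10-01`,
    {
      method: "POST",
      headers: {
        "Ocp-Apim-Subscription-Key": process.env.CONTENT_SAFETY_KEY!,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ text }),
    }
  );
  if (!res.ok) throw new Error(`Content Safety API error: ${res.status}`);
  return (await res.json()) as AnalyzeTextResponse;
}

// Pure helper: pick the highest-severity category from a response.
function maxSeverity(result: AnalyzeTextResponse): CategoryResult {
  return result.categoriesAnalysis.reduce((worst, c) =>
    c.severity > worst.severity ? c : worst
  );
}
```

The pure `maxSeverity` helper keeps the "flag or allow" decision testable without a live API call.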

Supports: Claude, Codex, Claude Code (CC)
📊 Quality score: 71 (Adequate)
1. Download the skill ZIP

2. Upload it in Claude (Settings → Capabilities → Skills → Upload skill)

3. Open it and start using it

Test It

Using "azure-ai-contentsafety-ts": Analyze a comment for hate speech and violence

Expected result:

Text flagged with severity 4 (Medium) for Hate category. Recommendation: Block or require human review before publication.

Using "azure-ai-contentsafety-ts": Check an uploaded profile image

Expected result:

Image analysis complete. All categories at severity 0 (Safe). Content approved for publication.

Using "azure-ai-contentsafety-ts": Moderate a chat message with a custom blocklist

Expected result:

Message blocked: Contains prohibited term "cheat-code-hack" from gaming-blocklist. Severity 6 (High) for Violence category also detected.

Security Audit

Low Risk
v1 • 2/24/2026

Static analyzer produced 70 false positive findings by misidentifying markdown documentation as executable code. The SKILL.md file contains TypeScript code examples using markdown code fences (```), not shell backticks. The skill legitimately requires network access to Azure Content Safety APIs and environment variables for API credentials. No malicious patterns detected after manual review.

Files scanned: 1 · Lines analyzed: 306 · Findings: 4 · Total audits: 1
Low-Risk Issues (2)
Hardcoded URL in Documentation
Example Azure endpoint URL shown in environment variable template. This is documentation showing the expected format, not an actual hardcoded credential.
Environment Variable Access for Credentials
Skill accesses process.env.CONTENT_SAFETY_KEY for API authentication. This is standard and safe practice for credential management.

Risk Factors

Auditor: claude

Quality Scores

  • Architecture: 38
  • Maintainability: 100
  • Content: 87
  • Community: 50
  • Security: 86
  • Spec compliance: 91

What You Can Build

Social Platform Content Moderation

Automatically scan user posts, comments, and uploaded images before publication to detect and block harmful content that violates community guidelines.

Educational Forum Safety

Protect students in online learning environments by filtering hate speech, bullying content, and self-harm discussions with appropriate severity thresholds.

E-commerce Review Filtering

Moderate product reviews and seller communications to maintain platform quality standards and prevent abusive or inappropriate content from appearing.

Try These Prompts

Basic Text Analysis
Analyze this text for harmful content and tell me if it should be allowed: "[INSERT TEXT]"
Image Content Check
Check if this image contains harmful or inappropriate content that should be blocked from our platform: [PROVIDE IMAGE PATH OR URL]
Custom Blocklist Setup
Help me create a blocklist for my gaming platform that blocks cheating-related terms, hate speech, and harassment. Add these specific terms: [LIST TERMS]
Batch Content Moderation
I need to moderate these 50 user submissions with a maximum allowed severity of 2 (low). Flag anything above this threshold and check against my "gaming-community" blocklist: [LIST CONTENT]

Best Practices

  • Always use the isUnexpected() type guard to handle API errors gracefully and prevent crashes from unexpected responses
  • Set category-specific severity thresholds based on your community guidelines; hate speech may need stricter limits than general violence
  • Maintain audit logs of all moderation decisions with timestamps, severity scores, and action taken for compliance and appeals

Avoid

  • Do not block content solely based on severity 2 (Low); doing so may produce excessive false positives and user frustration
  • Never store API keys directly in code - always use environment variables or Azure Key Vault for credential management
  • Avoid making moderation decisions on empty text or corrupted images - validate input before sending to the API
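The last point can be enforced with a small guard before any API call. A minimal sketch; the 10,000-character per-request limit is an assumption and should be checked against the current service limits for your API version:

```typescript
// Hypothetical pre-flight check: reject input the API cannot usefully analyze.
const MAX_TEXT_LENGTH = 10_000; // assumed per-request limit; verify for your API version

function isAnalyzable(text: string): boolean {
  const trimmed = text.trim();
  return trimmed.length > 0 && trimmed.length <= MAX_TEXT_LENGTH;
}
```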

Frequently Asked Questions

What categories of harmful content does Azure Content Safety detect?
Azure Content Safety detects four main categories: Hate (discriminatory language targeting identity groups), Sexual (sexual content, nudity, pornography), Violence (physical harm, weapons, terrorism), and SelfHarm (self-injury, suicide, eating disorders).
How do severity levels work and what should I block?
Severity levels range from 0 (Safe) to 7 (High); by default the API returns the trimmed scale 0, 2, 4, 6. Recommended actions: 0 = Allow, 2 = Review or allow with warning, 4 = Block or require human review, 6 = Block immediately. Configure thresholds based on your risk tolerance.
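The recommended mapping above can be expressed as a small pure function. The tier names and cutoffs are illustrative; adjust them to your own guidelines:

```typescript
type ModerationAction = "allow" | "warn" | "review" | "block";

// Map a Content Safety severity score to an action, following the
// recommendations above: 0 allow, 2 warn, 4 human review, 6+ block.
function actionForSeverity(severity: number): ModerationAction {
  if (severity >= 6) return "block";
  if (severity >= 4) return "review";
  if (severity >= 2) return "warn";
  return "allow";
}
```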
Can I add custom words or phrases to block?
Yes. Use blocklists to define custom prohibited terms specific to your domain. Create a blocklist with the PATCH endpoint, add items with the addOrUpdateBlocklistItems action, then reference blocklistNames during text analysis.
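A hedged sketch of those two steps over the raw REST API. The helper names, env variable names, and api-version are assumptions; the official SDK wraps these same endpoints:

```typescript
// Build the request body for the :addOrUpdateBlocklistItems action.
function blocklistItemsBody(terms: string[]): { blocklistItems: { text: string }[] } {
  return { blocklistItems: terms.map((text) => ({ text })) };
}

// Hypothetical setup: create/update a blocklist, then add prohibited terms.
async function setupBlocklist(name: string, terms: string[]): Promise<void> {
  const endpoint = process.env.CONTENT_SAFETY_ENDPOINT!;
  const key = process.env.CONTENT_SAFETY_KEY!;
  const base = `${endpoint}/contentsafety/text/blocklists`;

  // 1. PATCH creates the blocklist if missing, updates it otherwise.
  await fetch(`${base}/${name}?api-version=2023-10-01`, {
    method: "PATCH",
    headers: {
      "Ocp-Apim-Subscription-Key": key,
      "Content-Type": "application/merge-patch+json",
    },
    body: JSON.stringify({ description: "Custom prohibited terms" }),
  });

  // 2. Add items with the addOrUpdateBlocklistItems action.
  await fetch(`${base}/${name}:addOrUpdateBlocklistItems?api-version=2023-10-01`, {
    method: "POST",
    headers: {
      "Ocp-Apim-Subscription-Key": key,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(blocklistItemsBody(terms)),
  });
}
```

Pass the blocklist name via `blocklistNames` in subsequent `text:analyze` requests to have matches reported alongside category scores.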
What image formats are supported for analysis?
Azure Content Safety supports common image formats including PNG, JPEG, BMP, and GIF. Images can be provided as base64-encoded content or as Azure Blob Storage URLs.
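For base64 submission, the request body can be built like this (a sketch; the field names follow the public `image:analyze` schema):

```typescript
// Build the image:analyze request body from raw bytes; Node's Buffer
// handles the base64 encoding.
function imageAnalyzeBody(imageBytes: Uint8Array): { image: { content: string } } {
  return { image: { content: Buffer.from(imageBytes).toString("base64") } };
}
```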
How do I authenticate with the Azure Content Safety API?
Two options: (1) API Key authentication using AzureKeyCredential with your CONTENT_SAFETY_KEY, or (2) Azure AD authentication using DefaultAzureCredential for managed identity or service principal scenarios.
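In raw HTTP terms the two options differ only in the request header. A hedged sketch; for Azure AD you would first obtain a bearer token (e.g. with DefaultAzureCredential from @azure/identity, not shown here):

```typescript
// Build auth headers for either mode; exactly one credential should be set.
function authHeaders(creds: { apiKey?: string; aadToken?: string }): Record<string, string> {
  if (creds.apiKey) return { "Ocp-Apim-Subscription-Key": creds.apiKey };
  if (creds.aadToken) return { Authorization: `Bearer ${creds.aadToken}` };
  throw new Error("No Content Safety credential configured");
}
```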
What happens if the API is unavailable or returns an error?
Always wrap API calls with isUnexpected() type guard to detect errors. On failure, decide whether to block content defensively (fail-closed) or allow it temporarily (fail-open) based on your risk tolerance. Log all errors for investigation.
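That fail-open/fail-closed choice can be isolated in a single wrapper, sketched here with illustrative names:

```typescript
type Verdict = "allow" | "block";

// Run a moderation check; on API failure, fall back to the configured policy.
// "fail-closed" blocks defensively; "fail-open" allows temporarily.
async function moderateSafely(
  check: () => Promise<Verdict>,
  policy: "fail-closed" | "fail-open"
): Promise<Verdict> {
  try {
    return await check();
  } catch (err) {
    console.error("Moderation API error, applying policy:", policy, err);
    return policy === "fail-closed" ? "block" : "allow";
  }
}
```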

Developer Details

File Structure

📄 SKILL.md