Skill: extract

Name: extract
Author: tavily-ai

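Low risk · ⚙️ External commands · 🌐 Network access · 📁 Filesystem access · 🔑 Environment variables

Extract web content from URLs

Also available from: pbakaus

This skill uses Tavily's Extract API to pull clean Markdown or plain-text content from specific URLs. It suits research, documentation retrieval, and content aggregation, with no custom scraping code to write.

Compatible with: Claude, Codex, Code (CC)

⚠️ 68 Poor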

Download the skill ZIP

Upload it to Claude

Go to Settings → Capabilities → Skills → Upload skill

Turn it on and start using it

Test it

Using "extract". Extract content from https://example.com/about

Expected result:

## About Example

Welcome to Example.com...

Our Mission

We strive to provide...

Using "extract". Extract pricing information from https://example.com/pricing and https://example.com/plans

Expected result:

## Pricing Information

### Basic Plan - $9/month
- Feature A
- Feature B

### Pro Plan - $29/month
- Everything in Basic
- Priority support...

Security audit

Low risk

v1 • 2/18/2026

Static analysis detected 137 potential issues across external_commands, network, filesystem, and env_access categories. After semantic evaluation, all findings are FALSE POSITIVES - these patterns represent legitimate API extraction functionality. The skill uses standard shell commands (curl, jq) to communicate with Tavily's official API, accesses environment variables for API key authentication, and reads OAuth tokens from the standard MCP auth directory. No malicious behavior, data exfiltration, or command injection vulnerabilities were identified.

Lines analyzed: 369

Low-risk issues (4)

scripts/extract.sh:1-167 SKILL.md:13-201

Shell Command Execution Patterns

Static scanner flagged 62 instances of shell command execution (backticks, $() substitutions). These are FALSE POSITIVES - the skill uses standard Unix tools (curl, jq, base64) for legitimate API communication with Tavily's official service. No user input is injected into shell commands without validation.

scripts/extract.sh:4-152 SKILL.md:16-189

Network Request Patterns

Static scanner flagged 33 network access instances including hardcoded URLs. These are FALSE POSITIVES - the skill is designed to make HTTPS API calls to Tavily's official endpoints (api.tavily.com, mcp.tavily.com). Network access is core functionality for web content extraction.

scripts/extract.sh:65-153 SKILL.md:24-181

Environment Variable Access

Static scanner flagged 16 environment variable access instances for TAVILY_API_KEY. These are FALSE POSITIVES - the skill reads API keys from environment variables, which is the standard and secure method for providing credentials to API-based skills. The skill properly handles missing keys by initiating OAuth flow.

scripts/extract.sh:45-163 SKILL.md:13-20

Filesystem Access for OAuth Tokens

Static scanner flagged filesystem access to ~/.mcp-auth/ directory. This is a FALSE POSITIVE - the skill reads OAuth tokens from the standard MCP authentication directory. This is expected behavior for OAuth-based authentication and poses no security risk.
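
The credential order described in the findings above (API key from the environment first, OAuth otherwise) can be sketched as follows. This is an illustration under stated assumptions, not the logic of `scripts/extract.sh`: the function name `resolve_auth` and the mode labels are hypothetical, and because the layout of token files under `~/.mcp-auth/` is not described here, the sketch only checks whether any file exists in that directory.

```python
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_auth(env: Mapping[str, str], auth_dir: Path) -> Tuple[str, Optional[str]]:
    """Choose an auth source: TAVILY_API_KEY first, then a cached OAuth token.

    Returns a (mode, value) pair. The mode labels are hypothetical.
    """
    api_key = env.get("TAVILY_API_KEY", "").strip()
    if api_key:
        return ("api_key", api_key)
    # Assumption: any file under the auth directory may be a cached token.
    # The real token file format is not specified in this listing.
    if auth_dir.is_dir() and any(p.is_file() for p in auth_dir.rglob("*")):
        return ("oauth_cached", str(auth_dir))
    # Nothing usable found: the caller would need to start the OAuth flow.
    return ("oauth_required", None)
```

A caller might invoke it as `resolve_auth(os.environ, Path.home() / ".mcp-auth")` and branch on the returned mode.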

Risk factors

⚙️ External commands (62)

🌐 Network access (33)

📁 Filesystem access (17)

scripts/extract.sh:45 scripts/extract.sh:17 scripts/extract.sh:26 scripts/extract.sh:32 scripts/extract.sh:50 scripts/extract.sh:60 scripts/extract.sh:98 scripts/extract.sh:98 scripts/extract.sh:115 scripts/extract.sh:116 scripts/extract.sh:128 scripts/extract.sh:134 scripts/extract.sh:163 SKILL.md:13 SKILL.md:20 SKILL.md:13 SKILL.md:20

🔑 Environment variables (16)

scripts/extract.sh:65 scripts/extract.sh:66 scripts/extract.sh:69 scripts/extract.sh:94 scripts/extract.sh:109 scripts/extract.sh:120 scripts/extract.sh:123 scripts/extract.sh:153 SKILL.md:24 SKILL.md:57 SKILL.md:69 SKILL.md:93 SKILL.md:137 SKILL.md:150 SKILL.md:167 SKILL.md:181

Audited by: claude

Quality score

Architecture

100

Maintainability

Content

Community

Security

Spec compliance

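What you can build

Documentation research collection

Extract documentation content from multiple API reference pages to build a local knowledge base

Competitive analysis

Extract content from competitors' websites, product pages, and blog posts for market research

Content aggregation

Collect articles and other content from multiple news sources and blogs into a single Markdown format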

Try these prompts

Basic URL extraction

Extract the content from this URL: https://example.com/article

Multi-URL extraction

Extract the content from these URLs: https://docs.example.com/api, https://docs.example.com/auth

Query-focused extraction

Extract information about authentication from these URLs: https://example.com/docs, https://example.com/api-reference. Focus on API keys and OAuth.

Advanced extraction for dynamic pages

Use advanced extraction to pull all content from this JavaScript-heavy page: https://app.example.com/dashboard

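Best practices

- Use the query parameter to filter content down to what you need. This helps most when extracting from large pages.
- Start with basic extraction, and switch to advanced mode only when content is missing or incomplete.
- Batch URLs by topic or category so the results stay organized and relevant.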
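
The "basic first, advanced only when needed" practice can be sketched as a retry wrapper. This is a minimal illustration, not the skill's implementation: `extract` is a hypothetical callable you supply (for example, a thin wrapper around your HTTP client), and the `results`, `url`, and `raw_content` field names are assumptions about the response shape.

```python
from typing import Callable, Dict, List

# Hypothetical signature: (urls, depth) -> response dict.
ExtractFn = Callable[[List[str], str], Dict]


def extract_with_fallback(urls: List[str], extract: ExtractFn) -> Dict[str, str]:
    """Run basic extraction, then retry only the empty or missing URLs in advanced mode."""
    content: Dict[str, str] = {}

    def collect(response: Dict) -> None:
        # Keep only results that actually carry content.
        for item in response.get("results", []):
            if item.get("raw_content"):
                content[item["url"]] = item["raw_content"]

    collect(extract(urls, "basic"))
    retry = [u for u in urls if u not in content]
    if retry:
        collect(extract(retry, "advanced"))
    return content
```

Retrying only the URLs that came back empty keeps the slower advanced mode off pages that basic extraction already handled.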

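Avoid

- Sending more than 20 URLs in a single request. The request fails.
- Using chunks_per_source without a query parameter. This returns an error.
- Skipping the failed_results field in the response. You may miss extraction failures.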
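
The first two pitfalls can be caught before any request is sent. The sketch below validates a request body locally; the helper name `build_extract_payload` is hypothetical, and the `urls` and `query` JSON keys are assumptions about the request shape (only `chunks_per_source` is named in this listing).

```python
from typing import Dict, List, Optional

MAX_URLS_PER_REQUEST = 20  # limit stated in this listing


def build_extract_payload(
    urls: List[str],
    query: Optional[str] = None,
    chunks_per_source: Optional[int] = None,
) -> Dict:
    """Build a request body, rejecting combinations the listing says will fail."""
    if not urls:
        raise ValueError("at least one URL is required")
    if len(urls) > MAX_URLS_PER_REQUEST:
        raise ValueError(
            f"{len(urls)} URLs given; at most {MAX_URLS_PER_REQUEST} are allowed per request"
        )
    if chunks_per_source is not None and not query:
        raise ValueError("chunks_per_source requires a query")

    payload: Dict = {"urls": list(urls)}
    if query:
        payload["query"] = query
    if chunks_per_source is not None:
        payload["chunks_per_source"] = chunks_per_source
    return payload
```

Failing fast with a `ValueError` surfaces the problem at the call site instead of as an API error after a network round trip.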

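Frequently asked questions

Do I need a Tavily API key?

Yes. You need either a Tavily API key or an existing Tavily account for OAuth authentication. Get an API key at tavily.com, or sign up for an account.

How many URLs can I extract at once?

You can extract up to 20 URLs in a single request. For larger batches, split the work across multiple requests.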
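
Splitting a larger list into request-sized batches might look like the sketch below. The helper name `batch_urls` is hypothetical; the default size of 20 comes from the limit stated above.

```python
from typing import Iterator, List, Sequence


def batch_urls(urls: Sequence[str], size: int = 20) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` URLs, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])
```

For example, 45 URLs produce three batches of 20, 20, and 5, each of which can go into its own request.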

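What is the difference between basic and advanced extraction?

Basic extraction is fast and suits static HTML pages. Advanced extraction handles JavaScript-rendered pages, complex layouts, and structured data, but takes longer.

How does the query parameter work?

The query parameter reranks the extracted content chunks by relevance to your search terms. Use it together with chunks_per_source to retrieve the most relevant sections.

Why is failed_results returned?

Failed results occur when a URL is unreachable, blocked, or timed out. Check the failed_results array in the response for the specific error information.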
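
A response can be split into successes and failures so that no failed URL goes unnoticed. This sketch assumes each entry carries a `url` field, that successful entries sit under `results` with a `raw_content` field, and that failed entries carry an `error` string; only `failed_results` itself is named in this listing, so treat the other field names as assumptions.

```python
from typing import Dict, Tuple


def split_results(response: Dict) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (content by URL, error message by URL) from one response dict."""
    succeeded = {
        item["url"]: item.get("raw_content", "")
        for item in response.get("results", [])
    }
    failed = {
        item["url"]: item.get("error", "unknown error")
        for item in response.get("failed_results", [])
    }
    return succeeded, failed
```

Checking whether the second dictionary is non-empty after every request gives you a list of URLs to retry or report.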

Can I extract content from password-protected pages?

No. This skill cannot extract content from pages that require a login or authentication; it only reaches publicly accessible content.

Developer details

Author

tavily-ai

License

MIT

Repository

https://github.com/tavily-ai/skills/tree/main/skills/tavily/extract/

Ref

main

File structure

📁 scripts/

📄 extract.sh

📄 SKILL.md