pdf

Name: pdf
Author: ArtemisAI

Low risk ⚡ Contains scripts 📁 File system access

Process PDF documents

Also available from: sickn33, Azeem-2, 92Bilal26, anthropics, ZhanlinCui, AutumnsGrove, DYAI2025, K-Dense-AI, davila7, Cam10001110101, ComposioHQ

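PDF files need specialized tools for extraction, modification, and form filling. This skill provides comprehensive PDF processing with Python libraries, including text extraction, table detection, document merging, splitting, and form-field manipulation.

Supports: Claude Codex Code(CC)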

🥉 72 Bronze

Download the skill ZIP

Upload it in Claude

Go to Settings → Capabilities → Skills → Upload skill

Turn it on and start using it

Test it

Using "pdf". Extract text from report.pdf

Expected result:

Extracted 15 pages from report.pdf
Saved text to report.txt (45,230 characters)
Found 3 tables on pages 5, 8, and 12

Using "pdf". Merge all invoices into one file

Expected result:

Combined 12 PDF files into merged_invoices.pdf
Total pages: 48
Used pypdf for document merging

Using "pdf". Fill out the application form

Expected result:

Identified 8 fillable form fields
Filled all fields with provided data
Saved to completed_application.pdf

Security audit

Low risk

v5 • 1/16/2026

This is a legitimate PDF manipulation skill containing 8 Python scripts for document processing. The 227 static findings are false positives: documentation examples showing command syntax are not actual shell execution; legitimate file I/O uses standard PDF libraries (pypdf, pdfplumber); cryptographic references are for PDF password protection; and flagged keywords like 'command' appear only in documentation context. No network calls, no command injection, no credential handling.

Files scanned

2,233

Lines analyzed

Findings

Total audits

Low-risk issues (1)

scripts/fill_fillable_fields.py:12

User-specified file paths accessed

Scripts accept file paths as command-line arguments for PDF processing. Expected behavior for document tools.

Risk factors

⚡ Contains scripts (3)

scripts/fill_fillable_fields.py:1-115 scripts/check_bounding_boxes.py:1-71 scripts/extract_form_field_info.py:1-153

📁 File system access (2)

scripts/fill_fillable_fields.py:12-56 scripts/extract_form_field_info.py:140-145

Audited by: claude

Quality score

Architecture

100

Maintainability

Content

Community

Security

Spec compliance

What you can build

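Automated form filling

Fill PDF forms programmatically, using extracted field information and validation checks

Extract structured data

Extract tables and text from PDF reports for spreadsheet analysis and data processing

Manage document collections

Merge, split, and reorganize PDF documents for efficient file-management workflows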
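As a rough illustration of the merging workflow, here is a minimal sketch using pypdf, the library the example output above names for merging. The helper names `sorted_pdf_paths` and `merge_pdfs` are hypothetical and are not taken from the skill's scripts; the sketch assumes a recent pypdf release and unencrypted inputs.

```python
from pathlib import Path


def sorted_pdf_paths(directory):
    """Return the PDF files in `directory`, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def merge_pdfs(paths, output_path):
    """Copy every page of each input PDF, in order, into one output file.

    Hypothetical helper: assumes pypdf is installed and inputs are not encrypted.
    """
    # Imported lazily so the listing helper above works without pypdf.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    page_count = 0
    for path in paths:
        reader = PdfReader(str(path))
        for page in reader.pages:
            writer.add_page(page)
            page_count += 1
    with open(output_path, "wb") as handle:
        writer.write(handle)
    return page_count
```

A call such as `merge_pdfs(sorted_pdf_paths("invoices"), "merged_invoices.pdf")` would return the total number of pages written. Splitting can follow the same pattern in reverse, with one `PdfWriter` per output file.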

Try these prompts

Extract PDF text

Extract all text from document.pdf and save it to a text file

Merge PDFs

Merge report1.pdf, report2.pdf, and report3.pdf into quarterly_report.pdf

Fill in a PDF form

Fill in application_form.pdf with the following data: name=John Smith, email=john@example.com, and save the completed form

Batch-process documents

Extract tables from all PDFs in the /invoices directory and save each to a separate CSV file
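For readers curious what the batch prompt might translate to, the following is a minimal sketch built on pdfplumber's `extract_tables()`. The function names and the output naming scheme (`<pdf stem>_p<page>_t<table>.csv`) are assumptions made for illustration; this version writes one CSV per detected table rather than one per PDF.

```python
import csv
from pathlib import Path


def write_table_csv(rows, csv_path):
    """Write one extracted table to CSV, turning empty (None) cells into ''."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for row in rows:
            writer.writerow(["" if cell is None else cell for cell in row])


def extract_tables_to_csv(pdf_path, out_dir):
    """Save every table pdfplumber detects in `pdf_path` as its own CSV file.

    Hypothetical helper: assumes pdfplumber is installed.
    """
    import pdfplumber  # third-party, imported lazily

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with pdfplumber.open(pdf_path) as pdf:
        for page_number, page in enumerate(pdf.pages, start=1):
            for table_number, rows in enumerate(page.extract_tables(), start=1):
                name = f"{Path(pdf_path).stem}_p{page_number}_t{table_number}.csv"
                write_table_csv(rows, out_dir / name)
                written.append(out_dir / name)
    return written
```

Looping `extract_tables_to_csv` over every `*.pdf` file in a directory gives the batch behavior. Table detection is heuristic, so the resulting CSV files are worth spot-checking.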

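Best practices

Validate PDF files before processing so that corrupted or encrypted files are handled gracefully
Use bounding-box validation when filling non-fillable PDF forms to ensure accurate placement
Process large PDFs in chunks to keep memory use under control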
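Two of these practices can be sketched with the standard library alone. The helpers below are hypothetical and not part of the skill: `looks_like_pdf` is only a cheap header check (it does not prove the file is intact or unencrypted), and `page_chunks` yields the page ranges a caller could process one batch at a time.

```python
def looks_like_pdf(path):
    """Cheap pre-check: does the file start with the '%PDF-' header?"""
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"


def page_chunks(total_pages, chunk_size):
    """Yield (start, stop) page-index pairs, stop exclusive, covering all pages."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, total_pages, chunk_size):
        yield start, min(start + chunk_size, total_pages)
```

For example, `list(page_chunks(10, 4))` gives `[(0, 4), (4, 8), (8, 10)]`. Detecting encryption needs a PDF library; with pypdf, `PdfReader(path).is_encrypted` is one option.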

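Avoid

Trying to process password-protected PDF files without first obtaining the password
Skipping bounding-box validation when filling non-fillable PDF forms
Processing very large PDFs without chunking or memory management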

Frequently asked questions

Which Python libraries are required?

Install pypdf, pdfplumber, reportlab, and pdf2image with pip. OCR features also require pytesseract.
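A quick way to see which of these packages are missing from the current environment is sketched below. The function and constant names are hypothetical; the check relies only on the standard library's `importlib.util.find_spec` and assumes each package's import name matches its pip name, which holds for the five listed here.

```python
import importlib.util

# Package names taken from the answer above; import names match the pip names.
REQUIRED = ["pypdf", "pdfplumber", "reportlab", "pdf2image"]
OPTIONAL_OCR = ["pytesseract"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running `missing_modules(REQUIRED)` returns the names still to be installed, which can be passed to `pip install`. Note that pdf2image and pytesseract are wrappers: they additionally need the Poppler utilities and the Tesseract binary, respectively, installed on the system.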

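What are the size limits for PDF processing?

Performance depends on available memory. Very large files should be processed in segments.

Can this skill be integrated with other tools?

Yes. The Python scripts can be called from any AI tool or integrated into larger document-processing workflows.

Is data from my PDFs stored or transmitted?

No. All processing happens locally. Files are read from and written to the paths you specify.

Why does form filling fail?

Common causes include an encrypted PDF, form fields that were not detected correctly, and overlapping bounding boxes.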
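The overlapping-bounding-box case can be illustrated with a small geometric check. This is a sketch under stated assumptions, not the logic of the skill's `check_bounding_boxes.py`: boxes are assumed to be `(left, top, right, bottom)` tuples with `left < right` and `top < bottom`, and the field names are made up.

```python
from itertools import combinations


def boxes_overlap(a, b):
    """True if rectangles a and b, each (left, top, right, bottom), share area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def find_overlaps(boxes):
    """Return name pairs whose boxes overlap; `boxes` maps name -> rectangle."""
    return [
        (name_a, name_b)
        for (name_a, box_a), (name_b, box_b) in combinations(boxes.items(), 2)
        if boxes_overlap(box_a, box_b)
    ]
```

Boxes that merely touch along an edge are not reported, because the comparisons are strict. Any reported pair points at two fields whose text would be drawn on top of each other.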

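How is this different from online PDF tools?

This skill runs locally to protect privacy, can handle batch operations, and can be automated within AI workflows.

Developer details

Author

ArtemisAI

License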

Proprietary. LICENSE.txt has complete terms

Repository

https://github.com/ArtemisAI/code-execution-with-MCP/tree/main/skills/document-skills/pdf

Ref

main

File structure

📁 scripts/

📄 check_bounding_boxes_test.py

📄 check_bounding_boxes.py

📄 check_fillable_fields.py

📄 convert_pdf_to_images.py

📄 create_validation_image.py

📄 extract_form_field_info.py

📄 fill_fillable_fields.py

📄 fill_pdf_form_with_annotations.py

📄 forms.md

📄 LICENSE.txt

📄 reference.md

📄 SKILL.md