
pdf-analyze

Low risk

Process PDF documents for extraction and form filling

Also available from: 21pounder

PDF documents often contain important data that is difficult to access programmatically. This skill provides Claude with comprehensive tools to extract text and tables, fill forms, merge documents, and convert PDFs to images for analysis.

Supports: Claude Codex Code(CC)
📊 71 Adequate
1. Download the skill ZIP
2. Upload it in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Enable it and start using it

Try it

Using "pdf-analyze": Extract text from report.pdf and list all tables found

Expected result:

  • Extracted 24 pages from report.pdf
  • Found 5 tables:
      - Table 1: 'Revenue by Region' (page 3)
      - Table 2: 'Q4 Performance Metrics' (page 7)
      - Table 3: 'Customer Demographics' (page 12)
      - Table 4: 'Year-over-Year Growth' (page 18)
      - Table 5: 'Projected 2025 Targets' (page 22)
  • Saved extracted_text.txt (45 KB) and tables to tables_20250110.xlsx

Security audit

Low risk
v3 • 1/10/2026

This is a legitimate PDF processing toolkit containing utility scripts for extracting text, filling forms, and manipulating documents. All code uses standard Python PDF libraries with no network access, no credential theft patterns, and no obfuscation. The skill's behavior aligns with its stated purpose.

Files scanned: 11
Lines analyzed: 1,492
Findings: 0
Total audits: 3
No security issues
Audited by: claude

Quality score

Architecture: 59
Maintainability: 100
Content: 83
Community: 22
Security: 90
Spec compliance: 78

What you can build

Extract tables from reports

Pull structured data from financial reports, research papers, and statistical documents into CSV or Excel format.

Automate form completion

Fill out PDF forms programmatically with validated data for applications, surveys, and official documents.

Build PDF processing workflows

Create document processing pipelines that merge, split, and transform PDFs for applications and services.

Try these prompts

Extract PDF text
Extract all text from document.pdf using pdfplumber and save it to extracted_text.txt
List form fields
Check if application_form.pdf has fillable form fields, and if so, list all field names and types
Extract tables
Extract all tables from quarterly_report.pdf and save them to an Excel file with one sheet per table
Fill PDF form
Fill in the fields in application_form.pdf using the data in field_values.json and save the result to completed_form.pdf
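
As a rough illustration of what the "Extract tables" prompt involves, the sketch below collects every table pdfplumber finds and serializes each one to CSV text. It is not the skill's own script: the function names table_to_csv and extract_tables_as_csv are hypothetical, and the code assumes each table arrives as a list of rows whose empty cells are None.

```python
import csv
import io


def table_to_csv(table):
    """Serialize one table (a list of rows, each a list of cells) to CSV text.

    Empty cells are assumed to arrive as None and are written as empty strings.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in table:
        writer.writerow(["" if cell is None else str(cell).strip() for cell in row])
    return buffer.getvalue()


def extract_tables_as_csv(pdf_path):
    """Return (page_number, csv_text) pairs for every table found in the PDF."""
    import pdfplumber  # imported lazily so table_to_csv works without it

    results = []
    with pdfplumber.open(pdf_path) as pdf:
        for page_number, page in enumerate(pdf.pages, start=1):
            for table in page.extract_tables():
                results.append((page_number, table_to_csv(table)))
    return results
```

Calling extract_tables_as_csv("quarterly_report.pdf") would return one entry per detected table. Writing each entry to its own file, or to its own Excel sheet, is left to the caller.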

Best practices

  • Validate form field values before submission to catch errors early
  • Convert PDF to images first when working with non-fillable forms to visually verify annotation placement
  • Use the bounding box validation script to ensure annotations do not overlap or obscure existing content

Avoid

  • Skipping the form field validation step before filling PDFs
  • Not converting non-fillable PDFs to images for visual analysis first
  • Using hardcoded file paths instead of parameters for reusability
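
To illustrate the last point, a script can take its paths as command-line arguments instead of embedding them. The sketch below uses only the standard library. The argument names are assumptions chosen to match the example prompts, not the interface of any script shipped with the skill.

```python
import argparse


def build_parser():
    """Build a parser that takes every file path as a positional argument."""
    parser = argparse.ArgumentParser(
        description="Fill a PDF form from a JSON file of field values."
    )
    parser.add_argument("input_pdf", help="path to the PDF form to fill")
    parser.add_argument("field_values", help="path to a JSON file of field values")
    parser.add_argument("output_pdf", help="path where the filled PDF is written")
    return parser


def main(argv=None):
    """Parse arguments; argv=None makes argparse read the real command line."""
    args = build_parser().parse_args(argv)
    # A real script would load args.field_values, validate it, and fill the form.
    return args
```

For example, main(["application_form.pdf", "field_values.json", "completed_form.pdf"]) returns a namespace holding the three paths, so the same script can be reused for any form.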

Frequently asked questions

Which Python libraries does this skill use?
Primary libraries are pypdf for basic operations, pdfplumber for text and table extraction, and reportlab for creating new PDFs.
What are the system requirements?
Requires Python 3.8+ with pypdf, pdfplumber, reportlab, pdf2image, and Pillow (imported as PIL) installed via pip. Poppler must also be installed for PDF-to-image conversion.
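
One way to confirm these requirements before running anything is a small environment check. The sketch below is an illustration using only the standard library, and check_environment is a hypothetical helper. It looks for pdftoppm on the PATH because that Poppler tool is what pdf2image calls to render pages.

```python
import importlib.util
import shutil
import sys

# Import names for the packages listed above; Pillow is imported as PIL.
REQUIRED_MODULES = ["pypdf", "pdfplumber", "reportlab", "pdf2image", "PIL"]


def check_environment(modules=REQUIRED_MODULES):
    """Report which requirements appear to be satisfied on this machine."""
    report = {"python>=3.8": sys.version_info >= (3, 8)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    # pdftoppm ships with Poppler and must be discoverable on the PATH.
    report["poppler (pdftoppm)"] = shutil.which("pdftoppm") is not None
    return report
```

Any entry that comes back False points to a package to install with pip, or to a missing Poppler installation.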
How do I fill a scanned PDF that is not fillable?
Use the non-fillable form workflow: convert PDF to images, manually determine text entry locations, create fields.json with bounding boxes, then use fill_pdf_form_with_annotations.py.
Is my data safe when processing PDFs?
Yes. All processing is local using Python libraries. No data is sent to external servers. Files are only read from and written to paths you specify.
Why does my filled PDF show annotations in the wrong position?
This usually indicates incorrect coordinate transformation. PDF coordinates start from bottom-left while image coordinates start from top-left. Verify your bounding box conversion logic.
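
The axis flip described above can be sketched as a single function. This is an illustration rather than the skill's own conversion code: image_box_to_pdf is a hypothetical name, boxes are assumed to be (left, top, right, bottom) in pixels, and the page is assumed to be unrotated with its origin at the bottom-left corner.

```python
def image_box_to_pdf(box, image_size, page_size):
    """Convert a pixel box with a top-left origin to PDF points.

    box is (left, top, right, bottom) in pixels; image_size and page_size
    are (width, height) pairs. Returns (x0, y0, x1, y1) in PDF points with
    a bottom-left origin, so y0 is the lower edge of the box.
    """
    left, top, right, bottom = box
    image_width, image_height = image_size
    page_width, page_height = page_size

    # Scale factors from pixels to points on each axis.
    scale_x = page_width / image_width
    scale_y = page_height / image_height

    x0 = left * scale_x
    x1 = right * scale_x
    # Flip the vertical axis: the image's bottom edge becomes the lower y.
    y0 = page_height - bottom * scale_y
    y1 = page_height - top * scale_y
    return (x0, y0, x1, y1)
```

For a 612 x 792 point page rendered at 1224 x 1584 pixels, the pixel box (100, 200, 300, 240) maps to (50.0, 672.0, 150.0, 692.0). If annotations land mirrored vertically, the flip step is the first place to look.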
How is this different from using pdf-lib in JavaScript?
The Python tools provide more mature text extraction and table parsing. pdf-lib is better suited for browser environments or Node.js projects that need to create or modify PDFs client-side.