
azure-ai-formrecognizer-java

Safe

Extract Data from Documents with Azure Form Recognizer

Automate document processing by extracting text, tables, and structured data from forms, invoices, and receipts. Build Java applications that leverage Azure Document Intelligence prebuilt models or create custom document analyzers.

์ง€์›: Claude Codex Code(CC)
๐Ÿฅ‰ 74 ๋ธŒ๋ก ์ฆˆ
1. Download the skill ZIP
2. Upload it in Claude: go to Settings → Capabilities → Skills → Upload skill
3. Turn on the toggle and start using it

Try it

Using "azure-ai-formrecognizer-java". Analyze receipt from restaurant image

Expected result:

Extracted fields: Merchant='The Grand Bistro', TransactionDate=2024-01-15, Total=87.42, Subtotal=78.50, Tax=8.92. Line items: Pasta Carbonara ($24.00), Caesar Salad ($18.00), Tiramisu ($12.00), Wine ($24.50). Confidence scores above 0.95 for all fields.

Using "azure-ai-formrecognizer-java". Extract table from quarterly report PDF

Expected result:

Table detected: 12 rows x 5 columns. Headers: Quarter, Revenue, Expenses, Profit, Growth. Data extracted with cell positions and confidence scores. Table structure preserved with row/column indices for spreadsheet export.

Using "azure-ai-formrecognizer-java". Process batch of mixed document types

Expected result:

Document 1: Classified as Invoice (98% confidence) - Vendor: ABC Supply, Amount: $1,250.00. Document 2: Classified as Receipt (96% confidence) - Merchant: Office Depot, Total: $89.99. Document 3: Classified as Contract (94% confidence) - 8 pages detected.

๋ณด์•ˆ ๊ฐ์‚ฌ

์•ˆ์ „
v1 โ€ข 2/24/2026

All static analysis findings are false positives. The file is Markdown documentation containing Java SDK code examples. No executable code, command injection, or malicious patterns detected. External command detections confused markdown code fences with shell syntax. URL references are documentation examples for Azure endpoint configuration.

Files scanned: 1
Lines analyzed: 347
Findings: 0
Total audits: 1
No security issues found
Audited by: claude

ํ’ˆ์งˆ ์ ์ˆ˜

38
์•„ํ‚คํ…์ฒ˜
100
์œ ์ง€๋ณด์ˆ˜์„ฑ
87
์ฝ˜ํ…์ธ 
50
์ปค๋ฎค๋‹ˆํ‹ฐ
100
๋ณด์•ˆ
91
์‚ฌ์–‘ ์ค€์ˆ˜

What you can build

Automated Invoice Processing

Extract vendor names, invoice dates, line items, and totals from supplier invoices. Integrate with accounts payable systems to automate data entry and reduce manual processing time.

Receipt Data Capture

Build expense reporting applications that extract merchant names, transaction dates, and itemized purchases from receipt images. Enable mobile receipt scanning for employee reimbursements.

Custom Form Digitization

Create custom models for organization-specific forms like purchase orders, contracts, or survey responses. Train the model on sample documents and deploy automated extraction workflows.

Try these prompts

Quick Start: Analyze a Document
Create a Java program using Azure Document Intelligence SDK to extract text and tables from a PDF file. Use the prebuilt-layout model and print the results to console.
Receipt Analysis with Field Extraction
Write Java code to analyze a receipt image using the prebuilt-receipt model. Extract merchant name, transaction date, and total amount with confidence scores. Handle cases where fields may be missing.
Custom Model for Invoice Processing
Generate Java code to build a custom document model for invoice extraction. Include training from Azure Blob Storage, model composition from multiple invoice templates, and error handling for training failures.
Document Classification Pipeline
Create a complete Java solution that classifies incoming documents as invoices, receipts, or contracts using a custom classifier, then routes each type to the appropriate prebuilt model for detailed extraction. Include async polling and retry logic.

Best practices

  • Use DefaultAzureCredential for production deployments to leverage managed identities and avoid hardcoding credentials
  • Implement polling with appropriate timeouts for long-running analysis operations on large documents
  • Validate document format and size before submission to avoid unnecessary API calls and costs
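
The validation advice above can be sketched with nothing but the JDK. The class below is a hypothetical helper, not part of the Azure SDK: the name `DocumentPreflight`, its methods, and the extension list are assumptions for illustration. The extensions mirror the formats this listing names, and the size limit is passed in by the caller because the real limit depends on your service tier.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;

// Hypothetical pre-submission check; not an Azure SDK class.
final class DocumentPreflight {
    // Assumption: extensions for the formats named in this listing.
    private static final Set<String> SUPPORTED_EXTENSIONS =
            Set.of("pdf", "jpg", "jpeg", "png", "bmp", "tif", "tiff");

    private DocumentPreflight() {}

    /** Returns the lower-cased extension, or "" when the name has none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return (dot < 0 || dot == fileName.length() - 1)
                ? ""
                : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Pure check: non-empty, within the caller's size limit, supported extension. */
    static boolean isAcceptable(String fileName, long sizeInBytes, long maxBytes) {
        return sizeInBytes > 0
                && sizeInBytes <= maxBytes
                && SUPPORTED_EXTENSIONS.contains(extensionOf(fileName));
    }

    /** Convenience overload that reads the size from disk. */
    static boolean isAcceptable(Path file, long maxBytes) throws IOException {
        return Files.isRegularFile(file)
                && isAcceptable(file.getFileName().toString(), Files.size(file), maxBytes);
    }
}
```

Checking the extension only catches obvious mistakes; a renamed file can still fail analysis, so the service error path needs handling either way.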

Avoid

  • Do not hardcode API keys or endpoints in source code; use environment variables or Azure Key Vault
  • Avoid synchronous blocking calls in high-throughput scenarios; use async patterns for scalability
  • Do not process documents without validating file types; unsupported formats will fail analysis
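
As a minimal sketch of keeping keys and endpoints out of source code, the class below loads both from environment variables and fails fast when one is missing. Everything here is an assumption for illustration: the class name `ServiceConfig` and the variable names `DOCUMENT_INTELLIGENCE_ENDPOINT` and `DOCUMENT_INTELLIGENCE_KEY` are hypothetical, not names the SDK requires.

```java
import java.util.Map;

// Hypothetical configuration holder; variable names are illustrative only.
final class ServiceConfig {
    final String endpoint;
    final String apiKey;

    private ServiceConfig(String endpoint, String apiKey) {
        this.endpoint = endpoint;
        this.apiKey = apiKey;
    }

    /** Reads settings from the process environment. */
    static ServiceConfig fromEnvironment() {
        return fromEnvironment(System.getenv());
    }

    /** Reads settings from a supplied map, which keeps the logic testable. */
    static ServiceConfig fromEnvironment(Map<String, String> env) {
        return new ServiceConfig(
                require(env, "DOCUMENT_INTELLIGENCE_ENDPOINT"),
                require(env, "DOCUMENT_INTELLIGENCE_KEY"));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value.trim();
    }

    /** Never prints the key, so the object is safe to log. */
    @Override
    public String toString() {
        return "ServiceConfig{endpoint=" + endpoint + ", apiKey=****}";
    }
}
```

With managed identities and `DefaultAzureCredential`, as the best practices recommend, only the endpoint would need to be configured this way.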

์ž์ฃผ ๋ฌป๋Š” ์งˆ๋ฌธ

What document formats does Azure Form Recognizer support?
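Azure Document Intelligence (formerly Form Recognizer) supports PDF, JPEG, PNG, BMP, and TIFF input. Size and page limits depend on the API version and pricing tier: the service documentation lists up to 500 MB per file and 2,000 pages for PDF and TIFF on the paid (S0) tier, and 4 MB per file on the free (F0) tier. Check the limits for your own resource before relying on a specific number.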
How accurate are the prebuilt models?
Prebuilt models typically achieve 95%+ accuracy on standard documents. Accuracy varies by document quality, language, and field type. Custom models trained on your specific documents can achieve higher accuracy for specialized forms.
What is the difference between prebuilt and custom models?
Prebuilt models work out-of-box for common document types like invoices and receipts. Custom models are trained on your specific document templates for fields not covered by prebuilt models, such as proprietary forms or specialized layouts.
How do I handle documents in multiple languages?
Azure Document Intelligence supports over 70 languages automatically. Prebuilt models detect language from document content. For custom models, include training samples in all target languages for best results.
What happens if the analysis operation fails?
Failed operations return an error with status code and message. Common errors include invalid document format, unsupported language, or service throttling. Implement retry logic with exponential backoff for transient failures.
Can I process documents stored in Azure Blob Storage directly?
Yes, you can pass a SAS URL to beginAnalyzeDocumentFromUrl instead of uploading file bytes. This is more efficient for large documents and reduces bandwidth usage. Ensure the SAS token has read permissions and adequate validity period.
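
The retry advice in the answer on failed operations can be sketched as a small JDK-only helper. `Retry.withBackoff` is a hypothetical name, not an SDK method. The predicate that decides what counts as transient is left to the caller, since that decision depends on the exception types and status codes your client surfaces. Azure SDK clients also apply their own HTTP retry policy, so a wrapper like this is mainly useful around higher-level steps.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical helper illustrating retry with exponential backoff.
final class Retry {
    private Retry() {}

    /**
     * Runs the operation, retrying transient failures up to maxAttempts times.
     * The delay doubles after every failed attempt.
     */
    static <T> T withBackoff(Callable<T> operation,
                             int maxAttempts,
                             Duration initialDelay,
                             Predicate<Exception> isTransient) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delayMillis = initialDelay.toMillis();
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                // Give up on the last attempt or on errors that will not heal.
                if (attempt >= maxAttempts || !isTransient.test(e)) {
                    throw e;
                }
                Thread.sleep(delayMillis);
                delayMillis *= 2;
            }
        }
    }
}
```

A production version would usually add random jitter and an upper bound on the delay so that many clients do not retry in lockstep.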

Developer details

Author

sickn33

License

MIT

Ref

main

File structure

📄 SKILL.md