Document Extraction Fixture Pack

A pack of real PDF and TXT fixtures for extraction, layout analysis, OCR validation, protected documents, and corrupted files.

Download the Pack

document_extraction_fixture_pack.zip · 18.9 KB
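After downloading, it can be useful to confirm that the archive actually contains every fixture before wiring it into a test suite. The following is a minimal sketch, not part of the pack: the helper name `missing_fixtures` is hypothetical, the expected names are copied from the Included Fixtures list on this page, and the archive's internal folder layout is an assumption (only base names are compared, so any nesting is tolerated).

```python
import zipfile
from pathlib import PurePosixPath

# File names as listed under "Included Fixtures" on this page.
EXPECTED_FIXTURES = {
    "pdf_invoice_layout_sample.pdf",
    "pdf_form_like_sample.pdf",
    "pdf_scan_like_image_sample.pdf",
    "pdf_ocr_noise_sample.pdf",
    "pdf_multi_column_report_sample.pdf",
    "pdf_password_protected_sample.pdf",
    "pdf_truncated_edge_case_sample.pdf",
    "txt_utf8_multilingual_sample.txt",
    "txt_utf16le_sample.txt",
    "txt_minimal_readme_sample.txt",
}


def missing_fixtures(zip_source, expected=EXPECTED_FIXTURES):
    """Return the sorted expected file names that are absent from the archive.

    zip_source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        # Compare base names only, since the folder layout inside the zip is unknown.
        present = {PurePosixPath(name).name for name in archive.namelist()}
    return sorted(set(expected) - present)
```

Calling `missing_fixtures("document_extraction_fixture_pack.zip")` should return an empty list if the download is complete and the names match the list on this page.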

Best For

  • Field extraction and layout analysis on clean PDFs, scans, and protected documents.
  • Text extraction and encoding validation with UTF-8, UTF-16, and minimal TXT files.
  • Repeatable setup for OCR, parsers, and document QA.
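For the encoding-validation use case, a test typically needs to decide how to decode a TXT fixture before asserting on its text. The sketch below is one possible heuristic, not something shipped with the pack: the function name `decode_text_fixture` is hypothetical, and whether the UTF-16 fixture carries a byte order mark is an assumption, so both cases are handled.

```python
import codecs


def decode_text_fixture(data: bytes) -> tuple[str, str]:
    """Return (encoding_label, text) for the raw bytes of a TXT fixture."""
    if data.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips the UTF-8 byte order mark.
        return "utf-8-sig", data.decode("utf-8-sig")
    if data.startswith(codecs.BOM_UTF16_LE):
        # "utf-16" reads the BOM to pick the byte order and strips it.
        return "utf-16", data.decode("utf-16")
    if b"\x00" in data and len(data) % 2 == 0:
        # Heuristic: NUL bytes in an even-length payload suggest BOM-less UTF-16LE.
        return "utf-16-le", data.decode("utf-16-le")
    # Strict decode: raises UnicodeDecodeError on invalid UTF-8.
    return "utf-8", data.decode("utf-8")
```

The NUL-byte check comes before the plain UTF-8 attempt because BOM-less UTF-16LE text made of ASCII characters is also valid UTF-8 and would otherwise decode without error into the wrong string.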

Included Fixtures

File name                           Format  Size    Download
pdf_invoice_layout_sample.pdf       PDF     774 B   Download
pdf_form_like_sample.pdf            PDF     773 B   Download
pdf_scan_like_image_sample.pdf      PDF     3.7 KB  Download
pdf_ocr_noise_sample.pdf            PDF     7.9 KB  Download
pdf_multi_column_report_sample.pdf  PDF     3.3 KB  Download
pdf_password_protected_sample.pdf   PDF     3.2 KB  Download
pdf_truncated_edge_case_sample.pdf  PDF     701 B   Download
txt_utf8_multilingual_sample.txt    TXT     94 B    Download
txt_utf16le_sample.txt              TXT     176 B   Download
txt_minimal_readme_sample.txt       TXT     100 B   Download
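The password-protected and truncated PDFs are meant to exercise error paths, so a pipeline may want to triage a file before handing it to a full parser. The following is a rough byte-level sketch using only the standard library; the function name `triage_pdf_bytes` and its return labels are hypothetical, and the checks are heuristics (header marker, `%%EOF` near the end, an `/Encrypt` key) rather than real PDF parsing. How the fixtures in this pack are actually truncated or protected is not stated on this page, so the results should be verified against the files themselves.

```python
def triage_pdf_bytes(data: bytes) -> str:
    """Classify raw PDF bytes as 'not-pdf', 'truncated', 'encrypted', or 'ok'."""
    if not data.startswith(b"%PDF-"):
        # A well-formed PDF begins with a "%PDF-<version>" header line.
        return "not-pdf"
    if b"%%EOF" not in data[-1024:]:
        # The end-of-file marker is expected near the end of the file.
        return "truncated"
    if b"/Encrypt" in data:
        # Encrypted documents reference an encryption dictionary in the trailer.
        return "encrypted"
    return "ok"
```

For example, `triage_pdf_bytes(open("pdf_truncated_edge_case_sample.pdf", "rb").read())` gives a quick first signal; a dedicated PDF library is still needed to confirm the diagnosis.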

Fixture Matrix

Use the curated PDF matrix to move from this pack into the exact single-fixture variants behind it.

Open Primary Library

This pack is anchored to the PDF sample library and works best when paired with individual fixture downloads.

Open PDF Library