Document Extraction Fixture Pack

Bundle of real PDF and TXT fixtures for extraction, layout parsing, OCR-style validation, protected-document handling, and damaged-file workflows.

Pack contents: 10 fixtures, 3 use cases
Download: document_extraction_fixture_pack.zip (18.9 KB)

Use Cases

  • Field extraction and fixed-layout parsing across clean, scan-style, and protected PDFs.
  • Text extraction and encoding validation using UTF-8, UTF-16, and minimal TXT fixtures.
  • Repeatable setup for OCR, parser, and document-extraction QA workflows.
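As one way to approach the encoding-validation use case, the sketch below decodes raw TXT fixture bytes by checking for a byte-order mark first and then falling back to strict UTF-8. The function name `decode_text_fixture` and the fallback order are assumptions made for illustration; the pack does not state whether its UTF-16 fixture carries a BOM, so both cases are handled. UTF-32 is deliberately out of scope.

```python
import codecs


def decode_text_fixture(data: bytes) -> tuple[str, str]:
    """Return (text, codec_name) for raw TXT fixture bytes.

    Hypothetical helper: the detection order is an assumption,
    not something the fixture pack prescribes.
    """
    # A BOM, when present, is the most reliable signal.
    if data.startswith(codecs.BOM_UTF8):
        return data.decode("utf-8-sig"), "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM, picks the byte order, and strips it.
        return data.decode("utf-16"), "utf-16"

    # No BOM: try strict UTF-8 first.
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        text = None

    # ASCII-range UTF-16 text is also valid UTF-8 but full of NUL characters,
    # so embedded NULs are treated as a sign of BOM-less UTF-16LE.
    if text is not None and "\x00" not in text:
        return text, "utf-8"

    # Assumption: BOM-less non-UTF-8 input is UTF-16LE, matching the
    # fixture name txt_utf16le_sample.txt.
    return data.decode("utf-16-le"), "utf-16-le"
```

A test harness could read each TXT fixture in binary mode, pass the bytes to this helper, and assert on the returned codec name as well as the decoded text.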

Included Files

Filename                             Format  Size    SHA256 (truncated)
pdf_invoice_layout_sample.pdf        PDF     774 B   45c10f35ba18...
pdf_form_like_sample.pdf             PDF     773 B   6b5c49113a70...
pdf_scan_like_image_sample.pdf       PDF     3.7 KB  22a2cb26d64c...
pdf_ocr_noise_sample.pdf             PDF     7.9 KB  19097c94fe1a...
pdf_multi_column_report_sample.pdf   PDF     3.3 KB  6c5d36e07e3d...
pdf_password_protected_sample.pdf    PDF     3.2 KB  37f22291ff8b...
pdf_truncated_edge_case_sample.pdf   PDF     701 B   537de4efe227...
txt_utf8_multilingual_sample.txt     TXT     94 B    1e219cd0bddf...
txt_utf16le_sample.txt               TXT     176 B   9033cba7c418...
txt_minimal_readme_sample.txt        TXT     100 B   1988d57016b2...
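
The SHA256 values in the listing above are truncated, so a downloaded pack can only be checked against the visible 12-character prefixes. The following is a minimal sketch of such a check using the standard library. The function names `sha256_hex` and `check_pack` are hypothetical, and because the internal layout of the archive is not documented here, members are matched by base filename (an assumption).

```python
import hashlib
import zipfile
from pathlib import PurePosixPath

# Digest prefixes copied from the listing; the full hashes are not shown there,
# so a prefix match is weaker than a full-digest comparison.
EXPECTED_PREFIXES = {
    "pdf_invoice_layout_sample.pdf": "45c10f35ba18",
    "pdf_form_like_sample.pdf": "6b5c49113a70",
    "pdf_scan_like_image_sample.pdf": "22a2cb26d64c",
    "pdf_ocr_noise_sample.pdf": "19097c94fe1a",
    "pdf_multi_column_report_sample.pdf": "6c5d36e07e3d",
    "pdf_password_protected_sample.pdf": "37f22291ff8b",
    "pdf_truncated_edge_case_sample.pdf": "537de4efe227",
    "txt_utf8_multilingual_sample.txt": "1e219cd0bddf",
    "txt_utf16le_sample.txt": "9033cba7c418",
    "txt_minimal_readme_sample.txt": "1988d57016b2",
}


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def check_pack(zip_source, expected=EXPECTED_PREFIXES) -> dict[str, bool]:
    """Map each expected filename to True if its digest starts with the prefix.

    zip_source may be a path or a binary file-like object. Files missing
    from the archive are reported as False.
    """
    results = {}
    with zipfile.ZipFile(zip_source) as zf:
        # Assumption: match on base name, since the archive may nest files
        # inside a folder.
        members = {
            PurePosixPath(name).name: name
            for name in zf.namelist()
            if not name.endswith("/")
        }
        for filename, prefix in expected.items():
            member = members.get(filename)
            results[filename] = (
                member is not None
                and sha256_hex(zf.read(member)).startswith(prefix)
            )
    return results
```

Calling `check_pack("document_extraction_fixture_pack.zip")` on a local copy would return a per-file pass/fail map; any `False` entry suggests a missing, renamed, or altered fixture.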

Fixture Matrix

Use the curated PDF matrix to move from this pack into the exact single-fixture variants behind it.

Browse Library

This pack is anchored to the PDF sample library and works best when paired with individual fixture downloads.
