Office Document Parsing Fixtures

DOCX and companion document fixtures for office parsing, text extraction, table handling, and document-ingestion workflows.

Why This Workflow Matters

  • Covers multi-section DOCX files, table-bearing documents, and office-style narrative content.
  • Pairs with PDF companions so office parsing can be compared against fixed-layout extraction output.
  • Ships as a single download pack, so office-ingestion suites can start from one bundle.

Recommended Packs

Office Document Parsing Pack

Bundle of real DOCX and related document fixtures for office-document parsing, text extraction, and structured-content QA.

office_document_parsing_pack.zip · 12.1 KB
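As a starting point for a suite, the pack's contents can be inventoried by format before any parsing runs. The sketch below uses only the Python standard library; the function name `group_pack_entries` is hypothetical, and the archive's internal layout (flat or nested folders) is an assumption, so the code simply groups whatever entries it finds by file extension.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_pack_entries(source):
    """Group the file entries of a ZIP pack by lowercase extension.

    `source` may be a filesystem path or a binary file-like object.
    Directory entries are skipped; files without an extension are
    grouped under "(none)".
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(source) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            ext = PurePosixPath(info.filename).suffix.lower().lstrip(".")
            groups[ext or "(none)"].append(info.filename)
    return {ext: sorted(names) for ext, names in groups.items()}
```

Calling `group_pack_entries("office_document_parsing_pack.zip")` on a downloaded copy should return a mapping such as `{"docx": [...], "pdf": [...]}`, which a test suite can use to parametrize per-format cases.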

Fixture Matrices

DOCX Office Fixture Matrix

Choose DOCX fixtures for office-document parsing, section extraction, table handling, and office-ingestion workflows.
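For text and table checks, a DOCX file can be read without third-party libraries, because it is a ZIP container whose main body lives in `word/document.xml`. The following is a minimal sketch under that assumption, not a full parser: the helper names `extract_docx` and `_node_text` are hypothetical, and it ignores headers, footers, footnotes, and nested tables.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def _node_text(node):
    """Concatenate the text of every w:t run beneath a node."""
    return "".join(t.text or "" for t in node.iter(f"{{{W_NS}}}t"))


def extract_docx(source):
    """Return (paragraphs, tables) from a DOCX path or binary file object.

    paragraphs: non-empty top-level paragraph strings, in document order.
    tables: one list of rows per top-level table; each row is a list of
    cell strings.
    """
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    paragraphs, tables = [], []
    body = root.find("w:body", NS)
    if body is None:
        return paragraphs, tables

    for child in body:
        if child.tag == f"{{{W_NS}}}p":
            text = _node_text(child)
            if text:
                paragraphs.append(text)
        elif child.tag == f"{{{W_NS}}}tbl":
            rows = [
                [_node_text(tc) for tc in tr.findall("w:tc", NS)]
                for tr in child.findall("w:tr", NS)
            ]
            tables.append(rows)
    return paragraphs, tables
```

Keeping paragraphs and tables separate makes it straightforward to assert on section text and on table shape independently, for example when testing a table-bearing fixture.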

PDF Extraction Fixture Matrix

Use the PDF matrix to choose between text-heavy, layout-driven, form-like, and damaged fixtures for preview and extraction pipelines.

Suggested Fixtures

Filename                            Format  Size
docx_project_brief_sample.docx      DOCX    2.6 KB
docx_meeting_notes_sample.docx      DOCX    2.7 KB
docx_table_report_sample.docx       DOCX    2.7 KB
docx_policy_manual_sample.docx      DOCX    2.6 KB
pdf_multi_column_report_sample.pdf  PDF     3.3 KB
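Before routing a fixture to a parser, an ingestion suite may want to confirm that its bytes match its extension. The sketch below checks well-known signatures: PDF files begin with `%PDF-`, and a DOCX is a ZIP archive (local header `PK\x03\x04`) that contains `word/document.xml`. The function name `sniff_fixture` and its return labels are hypothetical.

```python
import io
import zipfile


def sniff_fixture(data: bytes) -> str:
    """Classify raw fixture bytes as "pdf", "docx", "zip", or "unknown"."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"PK\x03\x04"):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                names = set(zf.namelist())
        except zipfile.BadZipFile:
            # ZIP signature present but the archive is truncated or corrupt.
            return "unknown"
        return "docx" if "word/document.xml" in names else "zip"
    return "unknown"
```

A test can then read each suggested fixture with `Path(name).read_bytes()` and assert that the sniffed label agrees with the Format column.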