PDF extraction fixture matrix
Use the PDF matrix to choose among text-rich, fixed-layout, form-like, or damaged fixtures for preview and extraction pipelines.
10 fixture rows
3 usage notes
Use the matrix when the validation target is a set of variants rather than one canonical sample.
How to use this matrix
Coverage
- Covers single-page, multi-page, complex-layout, and damaged PDFs.
- Built for preview, text extraction, field mapping, and parser error paths.
- Useful for invoices, reports, and document workflows where layout matters.
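One way to put the matrix to work is to encode it as data and let each test pick the rows it needs. The sketch below is only an illustration: the file names and profiles come from the table on this page, but the `Fixture` class, the `expect` labels (`ok`, `needs_password`, `error`), and the `select` helper are hypothetical names chosen for this example, and the expected outcome assigned to each row is an assumption about how a typical parser would behave.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fixture:
    filename: str
    profile: str
    expect: str  # hypothetical label: "ok", "needs_password", or "error"


# File names and profiles are taken from the matrix; the expected
# outcomes are assumptions made for this sketch.
FIXTURES = [
    Fixture("pdf_single_page_text_sample.pdf", "Valid baseline", "ok"),
    Fixture("pdf_multi_page_report_sample.pdf", "Valid document", "ok"),
    Fixture("pdf_invoice_layout_sample.pdf", "Layout-driven fixture", "ok"),
    Fixture("pdf_scan_like_image_sample.pdf", "Image-heavy fixture", "ok"),
    Fixture("pdf_ocr_noise_sample.pdf", "Image-heavy edge", "ok"),
    Fixture("pdf_form_like_sample.pdf", "Structured layout", "ok"),
    Fixture("pdf_landscape_report_sample.pdf", "Orientation variant", "ok"),
    Fixture("pdf_multi_column_report_sample.pdf", "Layout complexity", "ok"),
    Fixture("pdf_password_protected_sample.pdf", "Protected document", "needs_password"),
    Fixture("pdf_truncated_edge_case_sample.pdf", "Broken fixture", "error"),
]


def select(expect: str) -> List[str]:
    """Return the file names whose assumed outcome matches `expect`."""
    return [f.filename for f in FIXTURES if f.expect == expect]
```

A happy-path test would iterate over `select("ok")`, while error-path tests would target `select("error")` and `select("needs_password")` separately, so a failure points at one variant rather than at the whole set.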
Fixture rows
Available Variants
| Variant | Profile | Test focus | File | Size |
|---|---|---|---|---|
| Single-Page Text: Best default sanity check for renderers and PDF text extraction. | Valid baseline | Simple rendering and extraction | pdf_single_page_text_sample.pdf | 725 B |
| Multi-Page Report: Useful for multi-page previews, extraction batching, and document splitting. | Valid document | Pagination and page count | pdf_multi_page_report_sample.pdf | 1.3 KB |
| Invoice Layout: Targets invoice parsers and structured extraction pipelines. | Layout-driven fixture | Field extraction from fixed layouts | pdf_invoice_layout_sample.pdf | 774 B |
| Scan-Style PDF: Useful for pipelines that distinguish text PDFs from scan-like pages. | Image-heavy fixture | OCR-style extraction | pdf_scan_like_image_sample.pdf | 3.7 KB |
| OCR-Noise PDF: Targets extraction robustness when scan quality or contrast is poor. | Image-heavy edge | Noisy OCR fallback | pdf_ocr_noise_sample.pdf | 7.9 KB |
| Form-Like PDF: Useful for OCR-adjacent field mapping and fixed-position extraction logic. | Structured layout | Form field and box detection | pdf_form_like_sample.pdf | 773 B |
| Landscape Report: Targets preview rotation, table extraction, and page-fit UI handling. | Orientation variant | Wide-table rendering | pdf_landscape_report_sample.pdf | 743 B |
| Multi-Column Report: Useful for column segmentation and reading-order extraction tests. | Layout complexity | Column-aware reading order | pdf_multi_column_report_sample.pdf | 3.3 KB |
| Password-Protected PDF: Use password `samplefile` for protected-document handling and UX checks. | Protected document | Unlock flow and restricted parsing | pdf_password_protected_sample.pdf | 3.2 KB |
| Truncated PDF: Good for parser failures, preview fallback, and corrupt-download handling. | Broken fixture | Damaged file recovery | pdf_truncated_edge_case_sample.pdf | 701 B |
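The broken and protected rows are easiest to exercise with a cheap pre-check that runs before a real parser. The following is a minimal sketch using only the Python standard library; `classify_pdf_bytes` and its return labels are hypothetical names, and the checks are heuristics rather than a full validation: it looks for the `%PDF-` header near the start of the file, the `%%EOF` marker near the end, and an `/Encrypt` key anywhere in the bytes. Whether a given fixture trips a given check is not stated on this page, so treat the result as a hint and confirm with the parser under test.

```python
def classify_pdf_bytes(data: bytes) -> str:
    """Heuristically label raw bytes as a PDF candidate.

    Returns one of "not_pdf", "truncated", "encrypted", or "ok".
    These labels are illustrative, not part of any library.
    """
    # Many readers accept the header anywhere in the first 1024 bytes.
    if b"%PDF-" not in data[:1024]:
        return "not_pdf"
    # A complete file normally ends with an %%EOF marker; a download
    # that was cut short will often be missing it.
    if b"%%EOF" not in data[-1024:]:
        return "truncated"
    # Encrypted documents reference an encryption dictionary via /Encrypt.
    if b"/Encrypt" in data:
        return "encrypted"
    return "ok"


def classify_pdf_file(path: str) -> str:
    """Read a file from disk and classify it with the same heuristic."""
    with open(path, "rb") as handle:
        return classify_pdf_bytes(handle.read())
```

In an upload or preview flow, a `truncated` result could route the file to the fallback UI without invoking the renderer, and an `encrypted` result could trigger the unlock prompt.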
Related strategy pages
Related Packs and Workflows
Related packs
Document extraction fixture pack
Related workflows
Upload validation fixtures
Parser regression fixtures
Document extraction fixtures
Related strategy pages
Related Pages
Best-format guides
Use-case guides
Best format for long-term document archiving
Best format for collaborative document editing
Conversion guides
How to Convert DOCX to PDF
How to Convert EPUB to PDF
How to Convert PDF to DOCX
How to Convert PDF to EPUB