DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 181
Awesome Document AI Collection A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11, 2024 • 76
UDOP Collection UDOP is a general multimodal model for document AI • 4 items • Updated 5 days ago • 23
PDF Document / OCR Datasets Collection Document datasets with .pdf files that are usable with pixparse libraries and tools. • 2 items • Updated Mar 30, 2024 • 47
Datasets - Documents - OCR - Image with Text from Textract Collection 1 item • Updated Apr 5, 2024 • 1
Table Transformer Collection The Table Transformer (TATR) is a series of object detection models useful for table extraction from PDF images. • 5 items • Updated 5 days ago • 20
LayoutLM Collection The LayoutLM series are Transformer encoders useful for document AI tasks such as invoice parsing, document image classification and DocVQA. • 5 items • Updated 5 days ago • 14