- LMDX: Language Model-based Document Information Extraction and Localization
Paper • 2309.10952 • Published • 67
- Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration
Paper • 2309.01131 • Published • 2
- On the Hidden Mystery of OCR in Large Multimodal Models
Paper • 2305.07895 • Published • 1
- DocLLM: A layout-aware generative language model for multimodal document understanding
Paper • 2401.00908 • Published • 192
Onur Savas
onursavas
AI & ML interests
None yet
Recent Activity
liked a Space about 18 hours ago: nvidia/LocateAnything
liked a Space 6 months ago: microsoft/TRELLIS.2
liked a Space about 1 year ago: MohamedRashad/Nanonets-OCR