olmOCR • Collection • olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 3 items • Updated 3 days ago • 96
Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 8 items • Updated 20 days ago • 398
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding • Paper • 2312.04461 • Published Dec 7, 2023 • 62
Document Processing • Collection • Any model or dataset dealing with documentary-type objects (layout detection, VQA, OCR, etc.) • 9 items • Updated Nov 14, 2024 • 3
DataGemma Release • Collection • A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 4 days ago • 85
Evaluation Datasets • Collection • Collection of Romanian datasets used for evaluation • 8 items • Updated Oct 11, 2024 • 1
SFT Datasets • Collection • Collection of Romanian datasets used for supervised finetuning • 9 items • Updated Oct 11, 2024 • 1
MultiLegalPile Models • Collection • A 689GB Multilingual Legal Corpus • 33 items • Updated Oct 23, 2023 • 1