Page to MD Collection

A dataset of image-text pairs sourced from research papers on arXiv. Each image is derived from a PDF page and paired with its corresponding OCR.

7 items • Updated about 1 month ago