
This is a GenerativeImage2Text (GIT) model fine-tuned on non-text images extracted from documents (e.g., PDFs). It analyzes the content of an image and produces a descriptive caption. It is part of a project to build a software solution for processing offline documents (PDF, Word, PowerPoint, etc.) and detecting WCAG accessibility issues.

Example: for a non-text image extracted from a document, the model generates the caption "Indication of correct signature".
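
Below is a minimal usage sketch using the standard Hugging Face GIT captioning pattern. It assumes the model is hosted as `Caraaaaa/text_image_captioning`; the local image path is a placeholder.

```python
from transformers import AutoProcessor, AutoModelForCausalLM
from PIL import Image

# Load the processor and the fine-tuned GIT model (assumed repo id).
processor = AutoProcessor.from_pretrained("Caraaaaa/text_image_captioning")
model = AutoModelForCausalLM.from_pretrained("Caraaaaa/text_image_captioning")

# Open an image previously extracted from a document (placeholder path).
image = Image.open("extracted_image.png").convert("RGB")

# Preprocess the image and generate a descriptive caption.
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values=pixel_values, max_length=50)
caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(caption)
```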

Model size: 177M parameters (Safetensors, F32 tensors).
