---
library_name: transformers
base_model: microsoft/Florence-2-base-ft
tags:
- finetune
- image-to-text
- VQA
- VLM
language:
- en
---

# Visual Question Answering Model

This model is a fine-tuned version of `microsoft/Florence-2-base-ft` for Visual Question Answering (VQA). It is optimized for tasks in which the model interprets an image and answers questions about its visual content.

---

### Model Details

- **Finetuned by:** prithivMLmods
- **Model type:** Visual Question Answering (VQA)
- **Language(s):** English (NLP component)
- **License:** None specified
- **Finetuned from model:** [microsoft/Florence-2-base-ft](https://huggingface.co/microsoft/Florence-2-base-ft)

### Usage

The model takes an image and a question about that image as input and returns an answer based on the visual content.
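
The image-plus-question flow described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the author's reference code: `MODEL_ID` is a hypothetical placeholder (the card does not give this checkpoint's repository ID), and the `<VQA>` task prefix is an assumption about the prompt format used during fine-tuning. The loading pattern (`AutoModelForCausalLM` and `AutoProcessor` with `trust_remote_code=True`) follows the usual pattern for the base model `microsoft/Florence-2-base-ft`.

```python
# Hypothetical placeholder: the card does not state this checkpoint's repo ID.
MODEL_ID = "your-namespace/your-florence2-vqa-checkpoint"

# Assumption: the fine-tune expects a task token in front of the question.
TASK_PREFIX = "<VQA>"


def build_prompt(question: str, task_prefix: str = TASK_PREFIX) -> str:
    """Join the (assumed) task token and the question into one prompt string."""
    return f"{task_prefix}{question.strip()}"


def answer_question(image_path: str, question: str, model_id: str = MODEL_ID) -> str:
    """Load the model, run generation on one image/question pair, return the text."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Florence-2 checkpoints ship custom modeling code, hence trust_remote_code=True.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model = model.to(device).eval()
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=build_prompt(question), images=image, return_tensors="pt"
    ).to(device)

    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=128,
            num_beams=3,
        )

    # Decode the generated token IDs and drop special tokens to get plain text.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
```

With a real repository ID and a local image, a call such as `answer_question("photo.jpg", "What color is the car?", model_id="...")` would return the model's answer as a string. If the checkpoint was trained with a different prompt format, adjust `TASK_PREFIX` accordingly.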