---
language:
- it
base_model:
- openbmb/MiniCPM-V-2_6
library_name: transformers
tags:
- vision
- vqa-italian
- visual-question-answering-italian
---

Fine-tuned version of MiniCPM-V 2.6 on GQA-it

This is a fine-tuned version of MiniCPM-V 2.6 on GQA-it, designed for Italian Visual Question Answering.

# Usage

Check out the GitHub repository for more insights and code: https://github.com/crux82/XXXXXX.

You can also visit the repository of the original base model for advanced usage: https://github.com/OpenBMB/MiniCPM-V.

For more details about the dataset, visit: https://github.com/crux82/gqa-it

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

# Load the fine-tuned model in bfloat16 and move it to the GPU.
model = AutoModel.from_pretrained(
    'sag-uniroma2/MiniCPM-V-2_6-gqa-it-finetuned',
    trust_remote_code=True,
    attn_implementation='sdpa',
    torch_dtype=torch.bfloat16,
)
model = model.eval().cuda()

# The tokenizer is loaded from the base model repository.
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-V-2_6', trust_remote_code=True)

# Path to a local image file.
img = "n346247.jpg"
image = Image.open(img).convert('RGB')

# Italian question: "Is there a fire hydrant on the grass?"
question = "C'è un idrante sull'erba?"

# The image is passed inside the message content, so `image` is None below.
msgs = [{'role': 'user', 'content': [image, question]}]

answer = model.chat(
    image=None,
    msgs=msgs,
    tokenizer=tokenizer
)
print(answer)
```

# Citation

```
TODO
```