---
language:
- it
base_model:
- openbmb/MiniCPM-V-2_6
library_name: transformers
tags:
- vision
- vqa-italian
- visual-question-answering-italian
---

<h1>A finetuned version of MiniCPM-V 2.6 on GQA-it: Italian Question Answering on Image Scene Graph</h1>

# Usage

Check out the GitHub repository for more insights and code: https://github.com/crux82/XXXXXX. For advanced usage, see the base model's repository: https://github.com/OpenBMB/MiniCPM-V.

For more details about the dataset, see https://github.com/crux82/gqa-it.

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

# Load the fine-tuned model; trust_remote_code is needed for MiniCPM-V's custom modeling code.
model = AutoModel.from_pretrained(
    'sag-uniroma2/MiniCPM-V-2_6-gqa-it-finetuned',
    trust_remote_code=True,
    attn_implementation='sdpa',
    torch_dtype=torch.bfloat16,
)
model = model.eval().cuda()

# The tokenizer is loaded from the base model.
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-V-2_6', trust_remote_code=True)

# Placeholder path: replace with your own image.
img = "xx.jpg"
image = Image.open(img).convert('RGB')

# Question in Italian: "Is there a fire hydrant on the grass?"
question = "C'è un idrante sull'erba?"
msgs = [{'role': 'user', 'content': [image, question]}]

# The image travels inside msgs, so the image argument stays None.
answer = model.chat(
    image=None,
    msgs=msgs,
    tokenizer=tokenizer
)
print(answer)
```
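
To ask several questions about the same image, the call above can be wrapped in a small helper. This is a minimal sketch, not part of the released code: `build_msgs` and `ask_all` are hypothetical names introduced here for illustration, and the sketch assumes `model.chat` accepts the same `image`, `msgs`, and `tokenizer` arguments shown in the snippet above.

```python
from typing import Any, Dict, List


def build_msgs(image: Any, question: str) -> List[Dict[str, Any]]:
    """Build a single-turn message list in the format used by the snippet above."""
    return [{'role': 'user', 'content': [image, question]}]


def ask_all(model: Any, tokenizer: Any, image: Any, questions: List[str]) -> Dict[str, str]:
    """Ask each question independently about the same image (hypothetical helper).

    Every question starts a fresh single-turn conversation, so earlier
    answers do not influence later ones.
    """
    answers: Dict[str, str] = {}
    for question in questions:
        answers[question] = model.chat(
            image=None,  # the image is carried inside msgs
            msgs=build_msgs(image, question),
            tokenizer=tokenizer,
        )
    return answers
```

With `model`, `tokenizer`, and `image` loaded as above, `ask_all(model, tokenizer, image, ["C'è un idrante sull'erba?", "Di che colore è l'idrante?"])` would return a dictionary mapping each question to its answer.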

# Citation

```
TODO
```