merve HF staff committed on
Commit bf00898
1 Parent(s): 7539d95

Update README.md

  1. README.md +24 -2
README.md CHANGED
@@ -15,9 +15,31 @@ should probably proofread and complete it, then remove this comment. -->
15
 
16
  # paligemma_vqav2
17
 
18
- This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on the vq_av2 dataset.
19
 
20
- ## Training procedure
21
 
22
  ### Training hyperparameters
23
 
 
15
 
16
  # paligemma_vqav2
17
 
18
+ This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on a small subset of the vq_av2 dataset.
19
 
20
+ ## How to Use
21
+
22
+ The code below runs inference with this model. See also the [inference notebook](https://colab.research.google.com/drive/100IQcvMvGm9y--oelbLfI__eHCoz5Ser?usp=sharing).
23
+
24
+ ```python
25
+ from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
26
+ from PIL import Image
27
+ import requests
28
+
29
+ model_id = "merve/paligemma_vqav2"
30
+ model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
31
+ processor = AutoProcessor.from_pretrained("google/paligemma-3b-pt-224")
32
+
33
+ prompt = "What is behind the cat?"
34
+ image_file = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cat.png?download=true"
35
+ raw_image = Image.open(requests.get(image_file, stream=True).raw)
36
+
37
+ inputs = processor(text=prompt, images=raw_image.convert("RGB"), return_tensors="pt")
38
+ output = model.generate(**inputs, max_new_tokens=20)
39
+
40
+ print(processor.decode(output[0], skip_special_tokens=True)[len(prompt):])
41
+ # gramophone
42
+ ```
43
 
44
  ### Training hyperparameters
45
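
The README snippet in this diff recovers the answer by slicing the decoded string with `len(prompt)`. A possible alternative is to drop the prompt at the token level before decoding. This is a minimal sketch, assuming the sequence returned by `generate` begins with the input tokens. The helper name `generated_only` and the toy token IDs are hypothetical, and plain Python lists stand in for tensors so the example runs without downloading the model.

```python
from typing import List, Sequence


def generated_only(output_ids: Sequence[int], input_len: int) -> List[int]:
    """Return only the tokens that follow the first `input_len` prompt tokens."""
    if input_len < 0 or input_len > len(output_ids):
        raise ValueError("input_len must be between 0 and len(output_ids)")
    return list(output_ids[input_len:])


# Toy stand-ins (hypothetical values): IDs 1-4 play the role of the encoded
# prompt, and 7 and 8 play the role of the newly generated answer tokens.
input_ids = [1, 2, 3, 4]
output_ids = input_ids + [7, 8]

answer_ids = generated_only(output_ids, len(input_ids))
print(answer_ids)  # [7, 8]
```

With the real model, the corresponding slice would likely be `output[0][inputs["input_ids"].shape[-1]:]`, passed to `processor.decode(..., skip_special_tokens=True)`. This avoids depending on the decoded text reproducing the prompt character for character.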