introduction updates
introduction.md +4 -4
@@ -21,23 +21,23 @@ Thank you for this amazing opportunity, we hope you will like the results! :hear
21 |
|
22 |
In this demo, we present two tasks:
|
23 |
|
24 |
+
+ **Text to Image**: This task is essentially an image retrieval task. The user is asked to input a string of text and CLIP is going to
|
25 |
compute the similarity between this string of text with respect to a set of images. The webapp is going to display the images that
|
26 |
have the highest similarity with the text query.
|
27 |
|
28 |
<img src="https://huggingface.co/spaces/clip-italian/clip-italian-demo/raw/main/static/img/text_to_image.png" alt="drawing" width="95%"/>
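
The retrieval step described above can be sketched as a ranking by similarity. This is only an illustration, not the demo's actual code: it assumes cosine similarity as the scoring function, and the 3-dimensional vectors, the file names, and the helper `retrieve_top_k` are all made up here; in a real system the embeddings would come from CLIP's text and image encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_top_k(text_embedding, image_embeddings, k=2):
    """Return the names of the k images most similar to the text query."""
    scored = [
        (cosine_similarity(text_embedding, emb), name)
        for name, emb in image_embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy vectors standing in for CLIP embeddings (assumption, not real model output).
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "beach.jpg": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]  # hypothetical embedding of the query "un gatto"

top = retrieve_top_k(query, image_embeddings, k=2)
print(top)
```

Since the image embeddings do not depend on the query, they could be computed once and reused; only the text embedding would need to be computed for each new query.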

+
+ **Image to Text**: This task is essentially a zero-shot image classification task. The user is asked for an image and for a set of captions/labels and CLIP
is going to compute the similarity between the image and each label. The webapp is going to display a probability distribution over the captions.

<img src="https://huggingface.co/spaces/clip-italian/clip-italian-demo/raw/main/static/img/image_to_text.png" alt="drawing" width="95%"/>
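
One common way to turn per-caption similarities into a probability distribution is a softmax over scaled scores. The sketch below assumes that approach; the similarity values, the captions, the helper `caption_probabilities`, and the `scale` of 100 are illustrative assumptions, not values taken from the demo.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def caption_probabilities(similarities, scale=100.0):
    """Map each caption to a probability based on its similarity to the image.

    `scale` sharpens the distribution; its value here is an assumption.
    """
    labels = list(similarities)
    probs = softmax([scale * similarities[label] for label in labels])
    return dict(zip(labels, probs))


# Made-up image-caption similarities (a real system would compute these with CLIP).
similarities = {"un gatto": 0.31, "un cane": 0.27, "una spiaggia": 0.12}

probs = caption_probabilities(similarities)
best = max(probs, key=probs.get)
print(best, probs)
```

Because the captions are supplied at query time, no classifier has to be trained on them, which is what makes the task zero-shot.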

+
+ **Localization**: This is one of our **very cool** features and, to the best of our knowledge, it is a novel contribution. We can use CLIP
to find where "something" (like a "cat") is in an image. The location of the object is computed by masking different areas of the image and looking at how the similarity to the image description changes.

<img src="https://huggingface.co/spaces/clip-italian/clip-italian-demo/raw/main/static/img/gatto_cane.png" alt="drawing" width="95%"/>
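
The masking idea can be illustrated with a small occlusion loop. This is a sketch under strong assumptions, not the demo's implementation: the image is a toy 4x4 grid, `toy_score` is a stand-in for CLIP's image-text similarity, and the patch size, the fill value, and the function `occlusion_heatmap` are all hypothetical.

```python
def occlusion_heatmap(image, score_fn, patch=2, fill=0.0):
    """Mask each patch in turn and record how much the score drops."""
    height, width = len(image), len(image[0])
    baseline = score_fn(image)
    heatmap = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            masked = [row[:] for row in image]  # copy so the input is untouched
            for r in rows:
                for c in cols:
                    masked[r][c] = fill
            drop = baseline - score_fn(masked)
            for r in rows:
                for c in cols:
                    heatmap[r][c] = drop
    return heatmap


def toy_score(img):
    """Stand-in for the image-text similarity: total brightness of the image."""
    return sum(sum(row) for row in img)


# Toy image: the bright 2x2 block in the bottom-right stands in for the "cat".
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

heatmap = occlusion_heatmap(image, toy_score)
for row in heatmap:
    print(row)
```

Areas whose masking causes the largest drop in the score are the ones the score depends on most, so they mark the likely location of the object.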

+
+ **Examples & Applications**: This page showcases some interesting results we got from the model. We believe that there are
different applications that can start from here.

# Novel Contributions