gabrielmotablima committed: update readme
README.md
pipeline_tag: text-generation
---

# 🎉 Swin-DistilBERTimbau for Image Captioning

Swin-DistilBERTimbau is a model trained for image captioning on [Flickr30K Portuguese](https://huggingface.co/datasets/laicsiifes/flickr30k-pt-br) (translated version using the Google Translator API)
at resolution 224x224 and a maximum sequence length of 512 tokens.

The encoder checkpoints come from the Swin Transformer version pre-trained on ImageNet.

The code used for training and evaluation is available at: https://github.com/laicsiifes/ved-transformer-caption-ptbr. In this work, Swin-DistilBERTimbau
was trained together with its buddy [Swin-GPorTuguese](https://huggingface.co/laicsiifes/swin-gpt2-flickr30k-pt-br).

Other evaluated models did not achieve performance as high as Swin-DistilBERTimbau and Swin-GPorTuguese, namely: DeiT-BERTimbau,
DeiT-DistilBERTimbau, DeiT-GPorTuguese, Swin-BERTimbau, ViT-BERTimbau, ViT-DistilBERTimbau and ViT-GPorTuguese.

## 🧑‍💻 How to Get Started with the Model

Use the code below to get started with the model.
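The getting-started snippet itself is not included in this excerpt. Below is a minimal sketch of loading a vision-encoder–decoder captioning model with 🤗 Transformers; the repository id `laicsiifes/swin-distilbert-flickr30k-pt-br` is an assumption inferred from the companion model's naming, so check the model page for the actual identifier.

```python
from PIL import Image
from transformers import AutoImageProcessor, AutoTokenizer, VisionEncoderDecoderModel

# Hypothetical model id, inferred from the Swin-GPorTuguese naming scheme.
model_id = "laicsiifes/swin-distilbert-flickr30k-pt-br"

model = VisionEncoderDecoderModel.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
image_processor = AutoImageProcessor.from_pretrained(model_id)

# Placeholder image; in practice use Image.open("your_photo.jpg").
image = Image.new("RGB", (224, 224), color="white")

# Preprocess to the 224x224 resolution the encoder expects, then generate.
pixel_values = image_processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values, max_new_tokens=50)
caption = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(caption)
```

Captions are generated in Portuguese, since the decoder was fine-tuned on the translated Flickr30K dataset.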