---
license: apache-2.0
language:
- fa
library_name: hezar
tags:
- hezar
---
A vision encoder-decoder model whose encoder is initialized from `google/vit-base-patch16-224` weights and whose decoder is initialized from `hezarai/roberta-base-fa` weights.

**This model cannot perform image-to-text inference out of the box; it must be finetuned first.**
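
The sketch below illustrates one likely reason finetuning is required. It is not this model's actual construction code: it assumes the Hugging Face `transformers` library as a stand-in for Hezar, and it uses tiny, randomly initialized configurations (all sizes are hypothetical) instead of the real checkpoints so that nothing is downloaded. When a ViT encoder is paired with a RoBERTa decoder, the decoder gains cross-attention layers that exist in neither source checkpoint, so those layers start untrained.

```python
from transformers import (
    RobertaConfig,
    ViTConfig,
    VisionEncoderDecoderConfig,
    VisionEncoderDecoderModel,
)

# Tiny stand-in configs (hypothetical sizes), not the real
# google/vit-base-patch16-224 or hezarai/roberta-base-fa settings.
encoder_config = ViTConfig(
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    image_size=32,
    patch_size=16,
)
decoder_config = RobertaConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)

# Combining the two configs marks the text model as a decoder
# and enables cross-attention over the image encoder's outputs.
config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(
    encoder_config, decoder_config
)
model = VisionEncoderDecoderModel(config=config)

# Cross-attention parameters live only in the combined model; a
# standalone RoBERTa encoder checkpoint has no weights for them.
cross_attention_params = [
    name
    for name, _ in model.decoder.named_parameters()
    if "crossattention" in name
]
print(f"{len(cross_attention_params)} cross-attention tensors need training")
```

With real checkpoints, the encoder and the decoder's self-attention layers would start from pretrained weights, while the parameters listed above would still be freshly initialized and would have to be learned on paired image-text data.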