---
license: cc-by-4.0
language:
- en
- tr
tags:
- VLM
- image2text
- lm
---

# TeLVE: Turkish efficient Language Vision Engine 🧿

[![License: CC BY 4.0](https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)
[![Models: v1.0](https://img.shields.io/badge/Models-v1.0%2c%20v1.0dep-blue)](https://huggingface.co/outsu/TeLVE)

## First Turkish VLM ever!

TeLVE is the first vision-language model (VLM) designed specifically for Turkish language understanding and image description generation. Built on pre-trained Vision Transformer (ViT) and BERT encoders, it fills a gap in Turkish visual-linguistic processing.

## Model Description

TeLVE combines:

- 🖼️ Vision Transformer (ViT-base-patch16-224)
- 📝 Turkish BERT (dbmdz/bert-base-turkish-cased)
- 🔄 Cross-attention mechanism for vision-language fusion

### Version Logs

- **TeLVE v1.0**: Trained on the Unsplash Lite dataset.
- **TeLVE v1.0dep**: Dataset extended with selected images from Pexels; the encoder problem with the letter "ü" was fixed. *(Deprecated: performance decreased because of a dataset addressing problem. Not recommended for use.)*

## Usage

The model can be used in two ways:

### Inference (imagine.py)

```bash
# Generate captions for images
python imagine.py
```

This script:

- Loads a trained TeLVE model
- Takes images from the `images` directory
- Generates a Turkish caption for each image
- Prints the results to the console

### Training (main.py)

Users can train their own models with ViT and BERT encoders.

```bash
# Train a new model
python main.py
```

This script:

- Loads and preprocesses image-caption pairs
- Initializes the ViT and BERT encoders
- Trains the combined model
- Saves the model and tokenizer

## Performance

Performance has not been evaluated yet; scores will be reported once available.
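
As a rough illustration of the fusion step listed under Model Description, the sketch below shows single-head cross-attention in which caption tokens attend to image patches. It is a framework-free NumPy example, not TeLVE's actual implementation: the function names, the random weights, the single head, and the caption length of 12 tokens are all assumptions made for illustration. Only the shapes follow the named encoders (hidden size 768, and 197 tokens for a 224 x 224 image split into 16 x 16 patches).

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(text_states, image_states, w_q, w_k, w_v):
    """Single-head cross-attention: text tokens query the image patches."""
    q = text_states @ w_q    # (num_text_tokens, hidden)
    k = image_states @ w_k   # (num_patches, hidden)
    v = image_states @ w_v   # (num_patches, hidden)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # one distribution over patches per text token
    return weights @ v, weights


rng = np.random.default_rng(0)
hidden = 768          # hidden size of both ViT-base and BERT-base
num_patches = 197     # 14 x 14 patches plus the [CLS] token
num_text_tokens = 12  # hypothetical caption length

# Random stand-ins for encoder outputs and learned projections (assumptions).
image_states = rng.standard_normal((num_patches, hidden))
text_states = rng.standard_normal((num_text_tokens, hidden))
w_q, w_k, w_v = (rng.standard_normal((hidden, hidden)) * 0.02 for _ in range(3))

fused, weights = cross_attention(text_states, image_states, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Each row of `weights` sums to 1 and indicates how strongly one caption token draws on each image patch; `fused` has one image-informed vector per caption token.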
## Citation

```bibtex
@software{telve2024,
  author = {Öğüt Su Karagün},
  title = {TeLVE: Turkish efficient Language Vision Engine},
  year = {2024},
  url = {https://huggingface.co/outsu/TeLVE}
}
```

## License

TeLVE © 2024 by Öğüt Su Karagün is licensed under Creative Commons Attribution 4.0 International