---
base_model: unsloth/qwen2-vl-7b-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2_vl
- trl
- qlora
license: apache-2.0
language:
- en
datasets:
- unsloth/LaTeX_OCR
---

# Uploaded model

- **Developed by:** MMoshtaghi
- **License:** apache-2.0
- **Finetuned from model:** unsloth/qwen2-vl-7b-instruct-unsloth-bnb-4bit
- **Finetuned on dataset:** [unsloth/LaTeX_OCR](https://huggingface.co/datasets/unsloth/LaTeX_OCR)
- **PEFT method:** [Quantized LoRA](https://huggingface.co/papers/2305.14314)

## Quick start

```python
from datasets import load_dataset
from unsloth import FastVisionModel
from transformers import TextStreamer

# Load the fine-tuned model in 4-bit.
# For vision models, the returned "tokenizer" also handles image preprocessing.
model, tokenizer = FastVisionModel.from_pretrained(
    model_name = "MMoshtaghi/Qwen2-VL-7B-Instruct-LoRAAdpt-MathOCR",
    load_in_4bit = True,
)
FastVisionModel.for_inference(model) # Enable for inference!

# Use the first training sample as a test image.
dataset = load_dataset("unsloth/LaTeX_OCR", split = "train")
image = dataset[0]["image"]
instruction = "Write the LaTeX representation for this image."

# Single-turn chat: one image placeholder followed by the text instruction.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens = False,
    return_tensors = "pt",
).to("cuda")

# Stream the generated LaTeX to stdout as it is produced.
text_streamer = TextStreamer(tokenizer, skip_prompt = True)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128,
                   use_cache = True, temperature = 1.5, min_p = 0.1)
```
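
The quick start streams the result to stdout. If you would rather keep the LaTeX as a string, something like the following may help. It is a minimal sketch, not part of the original card: `build_messages` and `clean_latex` are hypothetical helper names, and the code assumes the decoded output ends with Qwen2-VL's `<|im_end|>` end-of-turn marker.

```python
# Hypothetical helpers (not part of Unsloth or Transformers).
END_OF_TURN = "<|im_end|>"  # assumed end-of-turn marker in the decoded text


def build_messages(instruction):
    """Build a single-turn chat: one image placeholder followed by the instruction."""
    return [
        {"role": "user", "content": [
            {"type": "image"},
            {"type": "text", "text": instruction},
        ]}
    ]


def clean_latex(decoded, end_token=END_OF_TURN):
    """Drop everything from the end-of-turn marker onwards and trim whitespace."""
    return decoded.split(end_token, 1)[0].strip()


# Usage with the objects from the quick start (not run here):
#   output_ids = model.generate(**inputs, max_new_tokens = 128, use_cache = True)
#   prompt_len = inputs["input_ids"].shape[1]          # skip the echoed prompt
#   decoded = tokenizer.batch_decode(output_ids[:, prompt_len:])[0]
#   latex = clean_latex(decoded)
```

Slicing off the prompt tokens before decoding keeps the instruction text out of the returned string. The marker handling is only a convenience and can be dropped if you decode with `skip_special_tokens = True`.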

### Framework versions

- TRL: 0.13.0
- Transformers: 4.47.1
- PyTorch: 2.5.1+cu121
- Datasets: 3.2.0
- Tokenizers: 0.21.0
- Unsloth: 2025.1.5
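
To check a local environment against the versions listed above, a small script along these lines could be used. It is a sketch under assumptions: the PyPI distribution names in `EXPECTED` are inferred from the library names, and `installed_version` and `compare_versions` are hypothetical helpers.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this model card, keyed by PyPI distribution name
# (the distribution names are an assumption).
EXPECTED = {
    "trl": "0.13.0",
    "transformers": "4.47.1",
    "torch": "2.5.1+cu121",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
    "unsloth": "2025.1.5",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def compare_versions(expected):
    """Return {package: (expected, installed)} for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in compare_versions(EXPECTED).items():
        print(f"{package}: card lists {wanted}, installed {found or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; the list records the environment used for training.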

## Training procedure

The training run is logged on Weights & Biases (login required):

[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/open_ai/huggingface/runs/8juqyo5h)

## Citations

This VLM was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.