---
base_model: unsloth/qwen2-vl-7b-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2_vl
- trl
- qlora
license: apache-2.0
language:
- en
datasets:
- unsloth/LaTeX_OCR
---

# Uploaded model

- **Developed by:** MMoshtaghi
- **License:** apache-2.0
- **Finetuned from model:** unsloth/qwen2-vl-7b-instruct-unsloth-bnb-4bit
- **Finetuned on dataset:** [unsloth/LaTeX_OCR](https://huggingface.co/datasets/unsloth/LaTeX_OCR)
- **PEFT method:** [Quantized LoRA](https://huggingface.co/papers/2305.14314)

## Quick start

```python
from datasets import load_dataset
from unsloth import FastVisionModel
from transformers import TextStreamer

# Load the fine-tuned model with 4-bit quantized weights.
model, tokenizer = FastVisionModel.from_pretrained(
    model_name = "MMoshtaghi/Qwen2-VL-7B-Instruct-LoRAAdpt-MathOCR",
    load_in_4bit = True,
)
FastVisionModel.for_inference(model)  # Enable for inference!

# Take one sample image from the dataset used for fine-tuning.
dataset = load_dataset("unsloth/LaTeX_OCR", split = "train")
image = dataset[0]["image"]
instruction = "Write the LaTeX representation for this image."

# Build a single-turn chat message with an image placeholder and the instruction.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)

# The chat template already inserts special tokens, so do not add them again.
inputs = tokenizer(
    image,
    input_text,
    add_special_tokens = False,
    return_tensors = "pt",
).to("cuda")

# Stream the generated LaTeX to stdout as it is produced, skipping the prompt.
text_streamer = TextStreamer(tokenizer, skip_prompt = True)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128,
                   use_cache = True, temperature = 1.5, min_p = 0.1)
```

### Framework versions

- TRL: 0.13.0
- Transformers: 4.47.1
- Pytorch: 2.5.1+cu121
- Datasets: 3.2.0
- Tokenizers: 0.21.0
- Unsloth: 2025.1.5

## Training procedure

[Visualize in Weights & Biases](https://wandb.ai/open_ai/huggingface/runs/8juqyo5h) (log-in required)

## Citations

This VLM was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.