# llava-1.6-7b-hf-final

This model is a fine-tuned version of [llava-hf/llava-v1.6-mistral-7b-hf](https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf) on the [derek-thomas/ScienceQA](https://huggingface.co/datasets/derek-thomas/ScienceQA) dataset.
## Model description

## Training procedure

## Chat Template

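The model was fine-tuned with the custom Jinja chat template below. Instead of the usual list of messages, it expects a dict keyed by tag (here `real_question`), and it renders each ScienceQA example (image, question, hint, and candidate choices) as a Mistral-style `[INST] ... [/INST]` turn behind a fixed system prompt:
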
```python
CHAT_TEMPLATE = '''<<SYS>>
A chat between an user and an artificial intelligence assistant about Science Question Answering. The assistant gives helpful, detailed, and polite answers to the user's questions.
Based on the image, question and hint, please choose one of the given choices that answer the question.
Give yourself room to think by extracting the image, question and hint before choosing the choice.
Don't return the thinking, only return the highest accuracy choice.
Make sure your answers are as correct as possible.
<</SYS>>
{% for tag, content in messages.items() %}
{% if tag == 'real_question' %}
Now use the following image and question to choose the choice:
{% for message in content %}
{% if message['role'] == 'user' %}[INST] USER: {% else %}ASSISTANT: {% endif %}
{% for item in message['content'] %}
{% if item['type'] == 'text_question' %}
Question: {{ item['question'] }}
{% elif item['type'] == 'text_hint' %}
Hint: {{ item['hint'] }}
{% elif item['type'] == 'text_choice' %}
Choices: {{ item['choice'] }} [/INST]
{% elif item['type'] == 'text_solution' %}
Solution: {{ item['solution'] }}
{% elif item['type'] == 'text_answer' %}
Answer: {{ item['answer'] }}{% elif item['type'] == 'image' %}<image>
{% endif %}
{% endfor %}
{% if message['role'] == 'user' %}
{% else %}
{{eos_token}}
{% endif %}{% endfor %}{% endif %}
{% endfor %}'''
```
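
If the processor you load does not already carry this template (it may be saved in the repo's tokenizer config), it can be attached manually before calling `apply_chat_template`. A minimal sketch, reusing the `processor` loaded in *How to use* below:

```python
# Attach the custom template so apply_chat_template renders with it
# (only needed if it is not already saved in the tokenizer config)
processor.tokenizer.chat_template = CHAT_TEMPLATE
```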

## How to use

```python
import torch
from datasets import load_dataset
from transformers import (
    BitsAndBytesConfig,
    LlavaNextForConditionalGeneration,
    LlavaNextProcessor,
)

model_id = "Louisnguyen/llava-1.6-7b-hf-final"

# Load the model in 4-bit. bitsandbytes places the quantized weights on the
# GPU during loading, so the original `model.to("cuda:0")` call is unnecessary
# (and unsupported for 4-bit models); `device_map` handles placement instead.
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    torch_dtype=torch.float16,
    device_map="cuda:0",
)
processor = LlavaNextProcessor.from_pretrained(model_id)

# Take a ScienceQA example with a non-null image (some questions are text-only)
dataset = load_dataset("derek-thomas/ScienceQA", split="validation")
example = next(ex for ex in dataset if ex["image"] is not None)
image = example["image"]
question = example["question"]
choices = example["choices"]
hint = example["hint"]

messages_answer = {
    "real_question": [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text_question", "question": question},
                {"type": "text_hint", "hint": hint},
                {"type": "text_choice", "choice": " or ".join(choices)},
            ],
        }
    ]
}
# Apply the chat template to format the messages for answer generation
text_answer = processor.tokenizer.apply_chat_template(messages_answer, tokenize=False, add_generation_prompt=True)
# Prepare the inputs for the model to generate the answer
inputs_answer = processor(text=[text_answer.strip()], images=image, return_tensors="pt", padding=True).to("cuda")
# Generate the answer
generated_ids_answer = model.generate(**inputs_answer, max_new_tokens=1024, pad_token_id=processor.tokenizer.eos_token_id)
# Decode only the newly generated tokens
generated_texts_answer = processor.batch_decode(generated_ids_answer[:, inputs_answer["input_ids"].size(1):], skip_special_tokens=True)
print(generated_texts_answer)
```
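
The decoded `generated_texts_answer[0]` holds the model's reply; with the template above it should name one of the supplied choices, so it can be checked against the dataset's gold answer.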

### Training hyperparameters

The following hyperparameters were used during training:

### Training results

Accuracy: ~80% on ScienceQA.
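
For context, the rough shape of an evaluation loop that could produce such a number is sketched below. This is an assumption-laden sketch, not the author's script: `generate_answer` is a hypothetical helper wrapping the generation code from *How to use*, and scoring by substring match against the gold choice (ScienceQA's `answer` field is the index of the correct entry in `choices`) is a guess at the metric.

```python
from datasets import load_dataset

dataset = load_dataset("derek-thomas/ScienceQA", split="test")
examples = [ex for ex in dataset if ex["image"] is not None]  # image questions only

correct = 0
for ex in examples:
    reply = generate_answer(ex)  # hypothetical wrapper around the "How to use" code
    gold = ex["choices"][ex["answer"]]  # `answer` indexes the correct choice
    correct += int(gold.lower() in reply.lower())

print(f"accuracy: {correct / len(examples):.1%}")
```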

### Framework versions

- Transformers 4.43.3
- Pytorch 2.2.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
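
To reproduce this environment, the pinned libraries above can be installed with `pip install transformers==4.43.3 datasets==2.20.0 tokenizers==0.19.1`; the 4-bit example additionally needs `bitsandbytes` and `accelerate`, whose versions are not pinned in this card.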