# llava-1.6-7b-hf-final

This model is a fine-tuned version of [llava-hf/llava-v1.6-mistral-7b-hf](https://huggingface.co/llava-hf/llava-v1.6-mistral-7b-hf) on the [derek-thomas/ScienceQA](https://huggingface.co/datasets/derek-thomas/ScienceQA) dataset.
## Model description

## Training procedure

## Chat Template

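The model was fine-tuned with the custom Jinja chat template below. Instead of the usual list of messages, it expects a dict keyed by tag (here `real_question`), and it renders each ScienceQA example (image, question, hint, and candidate choices) as a Mistral-style `[INST] ... [/INST]` turn behind a fixed system prompt:
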
```python
CHAT_TEMPLATE = '''<<SYS>>
A chat between an user and an artificial intelligence assistant about Science Question Answering. The assistant gives helpful, detailed, and polite answers to the user's questions.
Based on the image, question and hint, please choose one of the given choices that answer the question.
Give yourself room to think by extracting the image, question and hint before choosing the choice.
Don't return the thinking, only return the highest accuracy choice.
Make sure your answers are as correct as possible.
<</SYS>>
{% for tag, content in messages.items() %}
{% if tag == 'real_question' %}
Now use the following image and question to choose the choice:
{% for message in content %}
{% if message['role'] == 'user' %}[INST] USER: {% else %}ASSISTANT: {% endif %}
{% for item in message['content'] %}
{% if item['type'] == 'text_question' %}
Question: {{ item['question'] }}
{% elif item['type'] == 'text_hint' %}
Hint: {{ item['hint'] }}
{% elif item['type'] == 'text_choice' %}
Choices: {{ item['choice'] }} [/INST]
{% elif item['type'] == 'text_solution' %}
Solution: {{ item['solution'] }}
{% elif item['type'] == 'text_answer' %}
Answer: {{ item['answer'] }}{% elif item['type'] == 'image' %}<image>
{% endif %}
{% endfor %}
{% if message['role'] == 'user' %}
{% else %}
{{eos_token}}
{% endif %}{% endfor %}{% endif %}
{% endfor %}'''
```
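
If the processor you load does not already carry this template (it may be saved in the repo's tokenizer config), it can be attached manually before calling `apply_chat_template`. A minimal sketch, reusing the `processor` loaded in *How to use* below:

```python
# Attach the custom template so apply_chat_template renders with it
# (only needed if it is not already saved in the tokenizer config)
processor.tokenizer.chat_template = CHAT_TEMPLATE
```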

## How to use

```python
import torch
from datasets import load_dataset
from transformers import (
    BitsAndBytesConfig,
    LlavaNextForConditionalGeneration,
    LlavaNextProcessor,
)

model_id = "Louisnguyen/llava-1.6-7b-hf-final"

# Load the model in 4-bit. bitsandbytes places the quantized weights on the
# GPU during loading, so the original `model.to("cuda:0")` call is unnecessary
# (and unsupported for 4-bit models); `device_map` handles placement instead.
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    torch_dtype=torch.float16,
    device_map="cuda:0",
)
processor = LlavaNextProcessor.from_pretrained(model_id)

# Take a ScienceQA example with a non-null image (some questions are text-only)
dataset = load_dataset("derek-thomas/ScienceQA", split="validation")
example = next(ex for ex in dataset if ex["image"] is not None)
image = example["image"]
question = example["question"]
choices = example["choices"]
hint = example["hint"]

messages_answer = {
    "real_question": [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text_question", "question": question},
                {"type": "text_hint", "hint": hint},
                {"type": "text_choice", "choice": " or ".join(choices)},
            ],
        }
    ]
}
# Apply the chat template to format the messages for answer generation
text_answer = processor.tokenizer.apply_chat_template(messages_answer, tokenize=False, add_generation_prompt=True)
# Prepare the inputs for the model to generate the answer
inputs_answer = processor(text=[text_answer.strip()], images=image, return_tensors="pt", padding=True).to("cuda")
# Generate the answer
generated_ids_answer = model.generate(**inputs_answer, max_new_tokens=1024, pad_token_id=processor.tokenizer.eos_token_id)
# Decode only the newly generated tokens
generated_texts_answer = processor.batch_decode(generated_ids_answer[:, inputs_answer["input_ids"].size(1):], skip_special_tokens=True)
print(generated_texts_answer)
```
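
The decoded `generated_texts_answer[0]` holds the model's reply; with the template above it should name one of the supplied choices, so it can be checked against the dataset's gold answer.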

### Training hyperparameters

The following hyperparameters were used during training:

### Training results

Accuracy: ~80% on ScienceQA.
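
For context, the rough shape of an evaluation loop that could produce such a number is sketched below. This is an assumption-laden sketch, not the author's script: `generate_answer` is a hypothetical helper wrapping the generation code from *How to use*, and scoring by substring match against the gold choice (ScienceQA's `answer` field is the index of the correct entry in `choices`) is a guess at the metric.

```python
from datasets import load_dataset

dataset = load_dataset("derek-thomas/ScienceQA", split="test")
examples = [ex for ex in dataset if ex["image"] is not None]  # image questions only

correct = 0
for ex in examples:
    reply = generate_answer(ex)  # hypothetical wrapper around the "How to use" code
    gold = ex["choices"][ex["answer"]]  # `answer` indexes the correct choice
    correct += int(gold.lower() in reply.lower())

print(f"accuracy: {correct / len(examples):.1%}")
```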

### Framework versions

- Transformers 4.43.3
- Pytorch 2.2.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
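
To reproduce this environment, the pinned libraries above can be installed with `pip install transformers==4.43.3 datasets==2.20.0 tokenizers==0.19.1`; the 4-bit example additionally needs `bitsandbytes` and `accelerate`, whose versions are not pinned in this card.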