|
---
datasets:
- HuggingFaceH4/ultrachat_200k
language:
- en
license: other
base_model: Qwen/Qwen-1_8B
---
|
# Model description |
|
This model was fine-tuned from Qwen/Qwen-1_8B on about 6,560 examples from the HuggingFaceH4/ultrachat_200k dataset; additional checkpoints are planned for later release.
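For reference, the source data can be loaded with the `datasets` library. The snippet below is only a sketch: the split name follows the dataset's published layout, and taking 6,560 shuffled rows is an illustrative assumption, not the exact subset used for training.

```python
from datasets import load_dataset

# HuggingFaceH4/ultrachat_200k publishes train_sft / test_sft (and *_gen) splits.
dataset = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")

# Illustrative only: the exact ~6,560-example subset used for this fine-tune
# is not documented here.
subset = dataset.shuffle(seed=42).select(range(6560))
print(subset)
```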
|
|
|
This model has not been aligned with DPO. DPO-aligned versions of this model, trained on various datasets, will be released in separate repositories in the future.
|
|
|
# Evaluation |
|
In informal personal testing, the model performs well on mathematics, history, and coding tasks.
|
|
|
However, because the model requires `trust_remote_code=True`, it cannot be submitted to the Open LLM Leaderboard. As a result, I will release a llama-fied version of this model that can be submitted to the leaderboard.
|
|
|
# Recommended prompt format |
|
```
<|im_start|>system
message<|im_end|>
<|im_start|>user
message<|im_end|>
<|im_start|>assistant
message<|im_end|>
```
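If it helps, the format can also be assembled in code. `build_prompt` below is a hypothetical helper for single-turn prompts, not something shipped with this repository:

```python
def build_prompt(system: str, user: str) -> str:
    # Assemble a single-turn prompt in the format shown above,
    # leaving the assistant turn open for the model to complete.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_prompt("You are a helpful assistant.", "What is the capital of France?")
```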
|
|
|
I recommend passing `eos_token="<|im_end|>"` when initializing the tokenizer.
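As a minimal sketch using the standard transformers loading path (the repository id below is a placeholder; substitute this model's actual id):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/this-model"  # placeholder for this repository's id

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,   # required by the Qwen remote code (see above)
    eos_token="<|im_end|>",   # stop generation at the end of an assistant turn
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
)
```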
|
|
|
# Recommended inference parameters |
|
|
|
```
temperature=0.2, top_p=0.14, top_k=12, repetition_penalty=1.1
```
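As a sketch of how these settings fit together (reusing the `tokenizer`, `model`, and `prompt` from the snippets above, and assuming the tokenizer encodes the `<|im_start|>`/`<|im_end|>` control tokens directly):

```python
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,            # sampling must be on for temperature/top_p/top_k to apply
    temperature=0.2,
    top_p=0.14,
    top_k=12,
    repetition_penalty=1.1,
    eos_token_id=tokenizer.eos_token_id,  # <|im_end|>, as configured above
)

# Decode only the newly generated tokens.
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response)
```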
|
|
|
# License |
|
|
|
Please make sure to read the Qwen licensing agreement before using this model. |