---
datasets:
- M4-ai/Rhino
language:
- en
license: other
base_model: Qwen/Qwen-1_8B
---
|
# Model description
|
The model was trained on approximately 200,000 examples from the M4-ai/Rhino dataset, with no examples omitted. Eight checkpoints were saved during training; the second, reached after roughly 50,000 examples, performed best.
|
|
|
This model has not been aligned with DPO. DPO-aligned versions of this model, trained on various datasets, will be released in separate repositories in the future.
|
|
|
# Evaluation
|
In personal testing, the model demonstrates excellent performance on mathematics, history, and coding tasks.
|
|
|
However, because the model requires `trust_remote_code=True`, it cannot be submitted to the Open LLM Leaderboard. I will therefore conduct my own evaluation of its capabilities.
|
|
|
# Recommended prompt format
|
```
<|im_start|>system
message<|im_end|>
<|im_start|>user
message<|im_end|>
<|im_start|>assistant
message<|im_end|>
```
|
|
|
I recommend passing `eos_token="<|im_end|>"` when initializing the tokenizer.
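As a minimal sketch, the prompt format above can be assembled in plain Python before tokenization (the helper name and example messages here are illustrative, not part of the model's API):

```python
def build_chatml_prompt(messages):
    """Assemble a ChatML-style prompt from (role, content) pairs,
    ending with an open assistant turn for the model to complete."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in messages]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "What is the capital of France?"),
])
```

With the tokenizer initialized as recommended (passing `eos_token="<|im_end|>"`, plus `trust_remote_code=True` as noted above), generation should stop cleanly at the end of the assistant turn.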
|
|
|
# Recommended inference parameters
|
|
|
`temperature=0.2, top_p=0.14, top_k=12, repetition_penalty=1.1`
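A sketch of how these settings might be collected for a Hugging Face-style `generate()` call (the `do_sample` and `max_new_tokens` values are my assumptions, not from the model card):

```python
# Recommended sampling settings gathered into one dict, so they can be
# passed to a causal LM as model.generate(**inputs, **gen_kwargs).
gen_kwargs = dict(
    do_sample=True,          # assumed: temperature/top_p/top_k only apply when sampling
    temperature=0.2,
    top_p=0.14,
    top_k=12,
    repetition_penalty=1.1,
    max_new_tokens=512,      # illustrative limit, adjust to your use case
)
```

The low temperature and aggressive top_p/top_k truncation keep outputs close to greedy decoding, while the mild repetition penalty discourages loops.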
|
|
|
# License
|
|
|
Please make sure to read the Qwen licensing agreement before using this model. |