piuzha committed
Commit 5f40679 · verified · 1 Parent(s): 1604958

Update README.md

Files changed (1): README.md (+10 -2)
README.md CHANGED

@@ -65,7 +65,7 @@ print(sequences[0]['generated_text'])
 
 ## Evaluation
 
-We test the performance of our model with [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). The evaluation results on common datasets are shown below. We test on AI2 Reasoning Challenge (25-shot), HellaSwag (10-shot), MMLU (5-shot), and Winogrande (5-shot).
+We test the performance of our model with [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). The evaluation results on common datasets are shown below. We test on AI2 Reasoning Challenge (25-shot), HellaSwag (10-shot), MMLU (5-shot), and Winogrande (5-shot). We release the Moxin-7B-finetuned as our base model. We further finetune our base model on Tulu v2 to obtain our chat model.
 
 | Models                 | ARC-C | Hellaswag | MMLU  | WinoGrade | Ave   |
 |:----------------------:|:-----:|:---------:|:-----:|:---------:|:-----:|
@@ -100,7 +100,15 @@ We also test the zero shot performance on AI2 Reasoning Challenge (0-shot), AI2
 | Moxin-7B-finetune | 80.03 | 75.17 | 82.24 | 81.12 | 58.64 | 75.44 |
 
 
-
+## Citation
+```
+@article{zhao2024fully,
+  title={Fully Open Source Moxin-7B Technical Report},
+  author={Zhao, Pu and Shen, Xuan and Kong, Zhenglun and Shen, Yixin and Chang, Sung-En and Rupprecht, Timothy and Lu, Lei and Nan, Enfu and Yang, Changdi and He, Yumei and others},
+  journal={arXiv preprint arXiv:2412.06845},
+  year={2024}
+}
+```
 
 
 
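The paragraph added in the diff above attributes all reported scores to lm-evaluation-harness. For readers who want to rerun those benchmarks, a minimal sketch against the harness's Python API (v0.4+) is below; `MODEL_PATH`, the `hf` backend choice, and the batch size are illustrative assumptions, not values taken from this commit.

```python
# A minimal reproduction sketch, assuming lm-evaluation-harness v0.4+ is
# installed (pip install lm-eval). MODEL_PATH is a placeholder, not an
# official checkpoint name from this repo.
import lm_eval

MODEL_PATH = "path/to/Moxin-7B"  # placeholder: local dir or Hugging Face hub ID

# The README uses a different shot count per benchmark, so each task gets
# its own run with its own num_fewshot value.
SHOTS = {"arc_challenge": 25, "hellaswag": 10, "mmlu": 5, "winogrande": 5}

for task, n_shot in SHOTS.items():
    out = lm_eval.simple_evaluate(
        model="hf",                            # Hugging Face transformers backend
        model_args=f"pretrained={MODEL_PATH}",
        tasks=[task],
        num_fewshot=n_shot,
        batch_size=8,                          # assumption; tune to your GPU
    )
    # out["results"] maps task names (and, for groups like mmlu, subtask
    # names) to their metric dicts.
    print(task, out["results"])
```

The same runs can be launched from the command line with the `lm_eval` entry point and `--num_fewshot`, one invocation per shot count.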