XuYipei committed on
Commit
82c44b6
1 Parent(s): 2228338

Upload README.md

Files changed (1)
  1. README.md +84 -157
README.md CHANGED
@@ -1,199 +1,126 @@
- ---
- library_name: transformers
- tags: []
- ---

- # Model Card for Model ID

- <!-- Provide a quick summary of what the model is/does. -->



- ## Model Details

- ### Model Description

- <!-- Provide a longer summary of what this model is. -->

- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]

- ### Model Sources [optional]

- <!-- Provide the basic links for the model. -->

- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]

- ## Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

- ### Direct Use

- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

- [More Information Needed]

- ### Downstream Use [optional]

- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

- [More Information Needed]

- ### Out-of-Scope Use

- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

- [More Information Needed]

- ## Bias, Risks, and Limitations

- <!-- This section is meant to convey both technical and sociotechnical limitations. -->

- [More Information Needed]

- ### Recommendations

- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

- ## How to Get Started with the Model

- Use the code below to get started with the model.

- [More Information Needed]

- ## Training Details

- ### Training Data

- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [More Information Needed]

- ### Training Procedure

- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

- #### Preprocessing [optional]

- [More Information Needed]


- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
-
- ## Evaluation
-
- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]
-
- ### Results
-
- [More Information Needed]
-
- #### Summary
-
-
-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- [More Information Needed]
-
- ### Compute Infrastructure
-
- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
- #### Software
-
- [More Information Needed]
-
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]
-
- ## Model Card Contact
-
- [More Information Needed]

+ ## Model Description

+ A Mixtral-4x7B-MoE model trained with the tokenizer from [hfl/chinese-llama-2-7b · Hugging Face](https://huggingface.co/hfl/chinese-llama-2-7b) as its Chinese tokenizer.

+ Inference runs on a single A100 GPU; full-parameter fine-tuning runs on 8×A100.

+ ## Selected Evaluation Results

+ | MMLU | CMMLU | C-Eval | GSM8K | MBPP |
+ | ----- | ----- | ------ | ----- | ---- |
+ | 55.13 | 51.10 | 52.0 | 67.17 | 40.2 |

+ ## Usage

+ ```python
+ import torch
+ import transformers


+ def apply_multi_turn_template(history, user_input):
+     # history is a list of (user, assistant) pairs from earlier rounds.
+     multi_turn_template = "[ROUND {} USER]{}[ROUND {} ASSISTANT]{}"
+     prefix = ""
+     for i in range(len(history)):
+         prefix = prefix + multi_turn_template.format(i, history[i][0], i, history[i][1])
+         # Each completed round is closed with the end-of-turn marker.
+         prefix = prefix + "<|end_of_turn|>"
+     curr_turn_id = len(history)
+     curr_turn = "[ROUND {} USER]{}[ROUND {} ASSISTANT]".format(curr_turn_id, user_input, curr_turn_id)
+     return prefix + curr_turn
+

+ model = transformers.AutoModelForCausalLM.from_pretrained("XuYipei/Mixtral-4x7B-Chat-Chinese")
+ tokenizer = transformers.AutoTokenizer.from_pretrained("XuYipei/Mixtral-4x7B-Chat-Chinese")

+ model = model.to(torch.bfloat16).to("cuda")
+ # Prompt (Chinese): "List 10 classical poems by Du Fu and write out the full text of each."
+ input_text = apply_multi_turn_template([], "列举出10首杜甫的古诗,同时默写该诗篇的完整内容。")
+ # Keep the input tensor on the same device as the model.
+ input_ids = torch.LongTensor([tokenizer.encode(input_text)]).to(model.device)
+ output = model.generate(input_ids=input_ids, eos_token_id=55296, max_length=4096)
+ output_text = tokenizer.decode(output[0])

+ '''
+ <s> [ROUND 0 USER]列举出10首杜甫的古诗,同时默写该诗篇的完整内容。[ROUND 0 ASSISTANT] 1. 《登高》

+ 唐·杜甫

+ 风急天高猿啸哀,渚清沙白鸟飞回。
+ 无边落木萧萧下,不尽长江滚滚来。
+ 万里悲秋常作客,百年多病独登台。
+ 艰难苦恨繁霜鬓,潦倒新停浊酒杯。

+ 2. 《春夜喜雨》

+ 唐·杜甫

+ 好雨知时节,当春乃发生。
+ 随风潜入夜,润物细无声。
+ 野径云俱黑,江船火独明。
+ 晓看红湿处,花重锦官城。

+ 3. 《绝句》

+ 唐·杜甫

+ 迟日江山丽,春风花草香。
+ 泥融飞燕子,沙暖睡鸳鸯。

+ 4. 《登楼》

+ 唐·杜甫

+ 花近高楼伤客心,万方多难此登临。
+ 锦江春色来天地,玉垒浮云变古今。
+ 北极朝廷终不改,西山寇盗莫相侵。
+ 可怜后主还祠庙,日暮聊为梁甫吟。

+ 5. 《茅屋为秋风所破歌》

+ 唐·杜甫

+ 八月秋高风怒号,卷我屋上三重茅。
+ 茅飞渡江洒江郊,高者挂罥长林梢,
+ 下者飘转沉塘坳。
+ 南村群童欺我老无力,忍能对面为盗贼。
+ 公然抱茅入竹去,唇焦口燥呼不得,
+ 归来倚杖自叹息。

+ 6. 《江南逢李龟年》

+ 唐·杜甫

+ 岐王宅里寻常见,崔九堂前几度闻。
+ 正是江南好风景,落花时节又逢君。

+ 7. 《闻官军收河南河北》

+ 唐·杜甫

+ 剑外忽传收蓟北,初闻涕泪满衣裳。
+ 却看妻子愁何在,漫卷诗书喜欲狂。
+ 白日放歌须纵酒,青春作伴好还乡。
+ 即从巴峡穿巫峡,便下襄阳向洛阳。

+ 8. 《旅夜书怀》

+ 唐·杜甫

+ 细草微风岸,危樯独夜舟。
+ 星垂平野阔,月涌大江流。
+ 名岂文章著,官应老病休。
+ 飘飘何所似,天地一沙鸥。

+ 9. 《江畔独步寻花》

+ 唐·杜甫

+ 黄四娘家花满蹊,千朵万朵压枝低。
+ 留恋戏蝶时时舞,自在娇莺恰恰啼。

+ 10. 《赠花卿》

+ 唐·杜甫

+ 锦城丝管日纷纷,半入江风半入云。
+ 此曲只应天上有,人间能得几回闻。<|end_of_turn|>

+ '''
+ ```
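
The prompt format this commit adds to the README can be checked without loading the model. The sketch below is not part of the commit: `build_prompt` and `extract_reply` are hypothetical helper names, and it assumes that `<|end_of_turn|>` closes each completed round, which is what the sample output in the README suggests. It shows how a non-empty history would be formatted and how the assistant's reply might be pulled out of the decoded text.

```python
END_OF_TURN = "<|end_of_turn|>"


def build_prompt(history, user_input):
    """Format earlier (user, assistant) rounds plus the new user message."""
    parts = []
    for i, (user_msg, assistant_msg) in enumerate(history):
        # Assumption: every completed round ends with the end-of-turn marker.
        parts.append(f"[ROUND {i} USER]{user_msg}[ROUND {i} ASSISTANT]{assistant_msg}{END_OF_TURN}")
    round_id = len(history)
    # The current round is left open so the model writes the assistant part.
    parts.append(f"[ROUND {round_id} USER]{user_input}[ROUND {round_id} ASSISTANT]")
    return "".join(parts)


def extract_reply(decoded, round_id):
    """Return the assistant text for round_id from a decoded generation."""
    marker = f"[ROUND {round_id} ASSISTANT]"
    # Keep only what follows the last occurrence of the round marker.
    reply = decoded.rsplit(marker, 1)[-1]
    # Drop the end-of-turn marker and anything after it.
    return reply.split(END_OF_TURN, 1)[0].strip()


history = [("Hello", "Hi, how can I help?")]
prompt = build_prompt(history, "Name one poem by Du Fu.")
print(prompt)
```

In a chat loop, the string returned by `extract_reply` would be appended to `history` together with the user message before building the next prompt.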