BAAI
/

Committed by Anhforth
Commit 4388bdd · 1 Parent(s): 0100465

Update README.md

Files changed (1)
  1. README.md +10 -11
README.md CHANGED
@@ -31,6 +31,13 @@ We also support [Huggingface](hflink)
 
 
  ## 模型细节/Model details
+ | Model | License | Commercial use? | GPU | Model link
+ | :---------------- | :------- | :-- |:-- | :-- |
+ |Aquila-7B | Apache 2.0 | ✅ | Nvidia-A100 | https://model.baai.ac.cn/model-detail/100098
+ | AquilaCode-7B-nv | Apache 2.0 | ✅ | Nvidia-A100 | https://model.baai.ac.cn/model-detail/100102
+ | AquilaCode-7B-ts | Apache 2.0 | ✅ | Tianshu-BI-V100 | https://model.baai.ac.cn/model-detail/100099
+ | AquilaChat-7B | Apache 2.0 | ✅ | Nvidia-A100 | https://model.baai.ac.cn/model-detail/100101
+
 
  我们使用了一系列更高效的底层算子来辅助模型训练,其中包括参考[flash-attention](https://github.com/HazyResearch/flash-attention)的方法并替换了一些中间计算,同时还使用了RMSNorm。在此基础上,我们应用了[BMtrain](https://github.com/OpenBMB/BMTrain)技术进行轻量化的并行训练,该技术采用了数据并行、ZeRO(零冗余优化器)、优化器卸载、检查点和操作融合、通信-计算重叠等方法来优化模型训练过程。
 
@@ -51,20 +58,12 @@ We used different tokenizers to extract ten thousand data samples from English,
  | llama | 32000 | sp(bpe)|1805| 1257|1970 |
  | gpt2_new_100k | 100000 | bpe|1575 | 477|1679 |
 
-
-
- 模型在一台8卡Nvidia A100上训练8小时,总共对15万条数据训练了3个epoch。
-
- The model was trained on an 8-card Nvidia A100 for 8 hours, and a total of 150,000 lines of data were trained for 3 epochs.
-
  ## 训练数据集/Training data
 
  我们采用了一系列高质量中英文数据集来训练和微调我们的对话语言模型,并且在不断更新迭代
 
  We used a series of high-quality Chinese and English datasets to train and fine-tune our conversational language model, and continuously updated it through iterations.
 
- ![Screenshot](../img/data.jpg)
-
 
  ## 使用方式/How to use
 
@@ -204,7 +203,7 @@ Create a new directory named `aquila-7b` inside `./checkpoints_in`. Place the fi
 
  #### Step 3: 启动可监督微调/Start SFT
  ```
- bash dist_trigger_docker.sh hostfile aquila-sft.yaml aquila-7b [实验名]
+ bash dist_trigger_docker.sh hostfile aquila-sft.yaml aquilachat-7b [实验名]
  ```
  接下来会输出下列信息,注意`NODES_NUM`应该与节点数相等,`LOGFILE`是模型运行的日志文件;The following information will be output. Note that `NODES_NUM` should be equal to the number of nodes, and `LOGFILE` is the log file for the model run.
 
@@ -217,7 +216,7 @@ bash dist_trigger_docker.sh hostfile aquila-sft.yaml aquila-7b [实验名]
 
  ## 证书/License
 
- Aquila-7B开源模型使用 [智源Aquila系列模型许可协议](linkhere), 原始代码基于[Apache Licence 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+ Aquila-7B开源模型使用 [智源Aquila系列模型许可协议](https://huggingface.co/BAAI/AquilaCode-7B-NV/resolve/main/BAAI%20Aquila%20Model%20License%20Agreement.pdf), 原始代码基于[Apache Licence 2.0](https://www.apache.org/licenses/LICENSE-2.0)
 
 
- Aquila-7B open-source model is licensed under [ BAAI Aquila Model Licence Agreement](linkhere). The source code is under [Apache Licence 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+ Aquila-7B open-source model is licensed under [ BAAI Aquila Model Licence Agreement](https://huggingface.co/BAAI/AquilaCode-7B-NV/resolve/main/BAAI%20Aquila%20Model%20License%20Agreement.pdf). The source code is under [Apache Licence 2.0](https://www.apache.org/licenses/LICENSE-2.0)
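
Note on the Step 3 hunk: the README text kept as context there says `NODES_NUM` should equal the number of nodes. A minimal Python sketch of that sanity check follows. It assumes a hostfile with one host per non-empty line, optionally followed by fields such as `slots=8`; neither the README nor this commit confirms that format. The helper names `count_nodes` and `check_nodes_num` are hypothetical and not part of `dist_trigger_docker.sh`.

```python
def count_nodes(hostfile_text: str) -> int:
    """Count distinct hosts in hostfile text.

    Assumed format (not confirmed by the README): one host per line,
    optionally followed by extra fields such as `slots=8`. Blank lines
    and lines starting with `#` are ignored.
    """
    hosts = set()
    for line in hostfile_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The first whitespace-separated field is taken as the host name or IP.
        hosts.add(line.split()[0])
    return len(hosts)


def check_nodes_num(hostfile_text: str, nodes_num: int) -> bool:
    """Return True when the reported NODES_NUM matches the hostfile."""
    return count_nodes(hostfile_text) == nodes_num


if __name__ == "__main__":
    # Hypothetical two-node hostfile used only for illustration.
    example = "# two nodes\n192.168.0.1 slots=8\n192.168.0.2 slots=8\n"
    print(count_nodes(example))
```

If the value printed at launch disagrees with this count, the hostfile passed as the first argument of the Step 3 command is the first thing to inspect.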