Qwen

yangapku committed — Commit 5a5440a (Parent: 5e2d26c)

update readme

Files changed (2):
  1. README.md (+5 −5)
  2. modeling_qwen.py (+5 −0)
README.md CHANGED
@@ -25,13 +25,13 @@ inference: false
 
 ## 介绍(Introduction)
 
-**通义千问-7B(Qwen-7B)**是阿里云研发的通义千问大模型系列的70亿参数规模的模型。Qwen-7B是基于Transformer的大语言模型, 在超大规模的预训练数据上进行训练得到。预训练数据类型多样,覆盖广泛,包括大量网络文本、专业书籍、代码等。同时,在Qwen-7B的基础上,我们使用对齐机制打造了基于大语言模型的AI助手Qwen-7B-Chat。相较于最初开源的Qwen-7B模型,我们现已将预训练模型和Chat模型更新到效果更优的Qwen-7B v1.1版本(除表格中特殊注明的结果外,以下正文中Qwen-7B-Chat均代指Qwen-7B-Chat v1.1)。本仓库为Qwen-7B-Chat v1.1的仓库。
+**通义千问-7B(Qwen-7B)**是阿里云研发的通义千问大模型系列的70亿参数规模的模型。Qwen-7B是基于Transformer的大语言模型, 在超大规模的预训练数据上进行训练得到。预训练数据类型多样,覆盖广泛,包括大量网络文本、专业书籍、代码等。同时,在Qwen-7B的基础上,我们使用对齐机制打造了基于大语言模型的AI助手Qwen-7B-Chat。相较于最初开源的Qwen-7B模型,我们现已将预训练模型和Chat模型更新到效果更优的版本。本仓库为Qwen-7B-Chat的仓库。
 
 如果您想了解更多关于通义千问-7B开源模型的细节,我们建议您参阅[Github代码库](https://github.com/QwenLM/Qwen)。
 
-**Qwen-7B** is the 7B-parameter version of the large language model series, Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-7B is a Transformer-based large language model, which is pretrained on a large volume of data, including web texts, books, codes, etc. Additionally, based on the pretrained Qwen-7B, we release Qwen-7B-Chat, a large-model-based AI assistant, which is trained with alignment techniques. Now we have updated both our pretrained and chat model to Qwen-7B v1.1 version with better performances. This repository is the one for Qwen-7B-Chat v1.1.
+**Qwen-7B** is the 7B-parameter version of the large language model series, Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-7B is a Transformer-based large language model, which is pretrained on a large volume of data, including web texts, books, codes, etc. Additionally, based on the pretrained Qwen-7B, we release Qwen-7B-Chat, a large-model-based AI assistant, which is trained with alignment techniques. Now we have updated both our pretrained and chat models with better performances. This repository is the one for Qwen-7B-Chat.
 
-For more details about the open-source model of Qwen-7B, please refer to the [Github](https://github.com/QwenLM/Qwen) code repository.
+For more details about Qwen, please refer to the [Github](https://github.com/QwenLM/Qwen) code repository.
 <br>
 
 ## 要求(Requirements)
@@ -207,9 +207,9 @@ The above speed and memory profiling are conducted using [this script](https://q
 
 ## 模型细节(Model)
 
-与Qwen-7B预训练模型相同,Qwen-7B-Chat模型规模基本情况如下所示
+与Qwen-7B预训练模型相同,Qwen-7B-Chat模型规模基本情况如下所示:
 
-The details of the model architecture of Qwen-7B-Chat (v1.1 version) are listed as follows:
+The details of the model architecture of Qwen-7B-Chat are listed as follows:
 
 | Hyperparameter | Value |
 | :------------- | :----: |
modeling_qwen.py CHANGED
@@ -861,6 +861,11 @@ class QWenLMHeadModel(QWenPreTrainedModel):
         assert (
             config.bf16 + config.fp16 + config.fp32 <= 1
         ), "Only one of \"bf16\", \"fp16\", \"fp32\" can be true"
+        logger.warn(
+            "Warning: please make sure that you are using the latest codes and checkpoints, "
+            "especially if you used Qwen-7B before 09.25.2023."
+            "请使用最新模型和代码,尤其如果你在9月25日前已经开始使用Qwen-7B,千万注意不要使用错误代码和模型。"
+        )
 
         autoset_precision = config.bf16 + config.fp16 + config.fp32 == 0
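The `assert` in the diff above relies on Python booleans behaving as integers: `True` sums as `1`, so adding the three flags and requiring the sum to be at most 1 enforces that at most one precision is selected, and a sum of 0 (no flag set) triggers the `autoset_precision` path. A minimal sketch of the same pattern, with an illustrative function name that is not part of modeling_qwen.py:

```python
def resolve_precision(bf16: bool, fp16: bool, fp32: bool) -> str:
    """Pick a precision mode from three mutually exclusive flags."""
    # Booleans are ints in Python, so the sum counts how many flags are set.
    assert bf16 + fp16 + fp32 <= 1, 'Only one of "bf16", "fp16", "fp32" can be true'
    if bf16:
        return "bf16"
    if fp16:
        return "fp16"
    if fp32:
        return "fp32"
    # No flag set: fall through to auto-selection,
    # mirroring autoset_precision in the diff above.
    return "auto"
```

Separately, note that the two message literals inside the new `logger.warn` call are adjacent strings, which Python concatenates with no separator, so the English and Chinese sentences run together without a space; `logger.warn` is also a deprecated alias for `logger.warning` in the standard `logging` module.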