xverse/XVERSE-65B


ChloeAuYeung committed on Nov 24, 2023

Commit 89c10b1 (1 parent: 30040fc)

Update README.md

Files changed (1)

README.md +2 -2


@@ -26,7 +26,7 @@ inference: false
 在预训练阶段，**XVERSE-65B** 主要使用了 7 类不同的数据类型。以下表格展示了 XVERSE-65B 与其他一些知名模型在预训练数据集方面的比较：
-| 数据类别 | [GPT3](https://arxiv.org/abs/2005.14165) | [Llama](https://arxiv.org/abs/2302.13971) | [BLOOM](https://arxiv.org/abs/2211.05100) | [PaLM](https://arxiv.org/abs/2204.02311) | [Chinchilla](https://arxiv.org/pdf/2203.15556) | [Gopher](https://arxiv.org/abs/2112.11446) | [MT-NLG](https://arxiv.org/abs/2201.11990) | XVERSE-65B |
+| 数据类别 | [GPT3](https://arxiv.org/abs/2005.14165) | [Llama](https://arxiv.org/abs/2302.13971) | [BLOOM](https://arxiv.org/abs/2211.05100) | [PaLM](https://arxiv.org/abs/2204.02311) | [Chinchilla](https://arxiv.org/abs/2203.15556) | [Gopher](https://arxiv.org/abs/2112.11446) | [MT-NLG](https://arxiv.org/abs/2201.11990) | XVERSE-65B |
 |:-------:|:--------:|:---------:|:---------:|:--------:|:--------------:|:----------:|:----------:|:----------:|
 | 网页类   | Y        | Y         | Y         | Y        | Y              | Y          | Y          | Y          |
 | 代码类   |          | Y         | Y         | Y        | Y              | Y          | Y          | Y          |
@@ -53,7 +53,7 @@ inference: false
 During the pre-training phase, **XVERSE-65B** primarily utilized 7 different types of data. The following table shows a comparison of the pre-training datasets of XVERSE-65B with some other well-known models:
-| Data Type | [GPT3](https://arxiv.org/abs/2005.14165) | [Llama](https://arxiv.org/abs/2302.13971) | [BLOOM](https://arxiv.org/abs/2211.05100) | [PaLM](https://arxiv.org/abs/2204.02311) | [Chinchilla](https://arxiv.org/pdf/2203.15556) | [Gopher](https://arxiv.org/abs/2112.11446) | [MT-NLG](https://arxiv.org/abs/2201.11990) | XVERSE-65B |
+| Data Type | [GPT3](https://arxiv.org/abs/2005.14165) | [Llama](https://arxiv.org/abs/2302.13971) | [BLOOM](https://arxiv.org/abs/2211.05100) | [PaLM](https://arxiv.org/abs/2204.02311) | [Chinchilla](https://arxiv.org/abs/2203.15556) | [Gopher](https://arxiv.org/abs/2112.11446) | [MT-NLG](https://arxiv.org/abs/2201.11990) | XVERSE-65B |
 |:---------------:|:--------:|:---------:|:---------:|:--------:|:--------------:|:----------:|:----------:|:----------:|
 | Web Pages       | Y        | Y         | Y         | Y        | Y              | Y          | Y          | Y          |
 | Code            |          | Y         | Y         | Y        | Y              | Y          | Y          | Y          |