Update README.md
README.md CHANGED
@@ -19,12 +19,12 @@ AquilaChat模型主要为了验证基础模型能力,您可以根据自己需
The AquilaChat model was primarily developed to verify the capabilities of the foundational model. You can use, modify, and commercialize the model according to your needs, but you must comply with all applicable laws and regulations in your country. Additionally, you must provide the source of the Aquila series models and a copy of the Aquila series model license to any third-party users.

## 模型细节/Model details
-| Model | License | Commercial use? | GPU
-| :---------------- | :------- | :-- |:-- |
-|Aquila-7B
-| AquilaCode-7B-
-| AquilaCode-7B-
-| AquilaChat-7B | Apache 2.0 | ✅ | Nvidia-A100 |
+| Model | License | Commercial use? | GPU
+| :---------------- | :------- | :-- |:-- |
+| Aquila-7B | Apache 2.0 | ✅ | Nvidia-A100 |
+| AquilaCode-7B-NV | Apache 2.0 | ✅ | Nvidia-A100 |
+| AquilaCode-7B-TS | Apache 2.0 | ✅ | Tianshu-BI-V100 |
+| AquilaChat-7B | Apache 2.0 | ✅ | Nvidia-A100 |

We used a series of more efficient low-level operators to assist model training, including methods that reference [flash-attention](https://github.com/HazyResearch/flash-attention) and replace some of its intermediate computations, as well as RMSNorm. On top of this, we applied [BMtrain](https://github.com/OpenBMB/BMTrain) for lightweight parallel training, which uses data parallelism, ZeRO (Zero Redundancy Optimizer), optimizer offloading, checkpointing and operator fusion, and communication-computation overlap to optimize the model training process.
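For reference, below is a minimal PyTorch sketch of the RMSNorm operator mentioned in the paragraph above. It is an illustrative, unfused implementation and is not taken from the Aquila codebase; the training stack described in the README may instead use a fused kernel alongside the flash-attention-style operators.

```python
# Minimal RMSNorm sketch (illustrative only; not the Aquila implementation).
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescale by the RMS of the features,
    with a learned gain but no mean subtraction and no bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize by rms(x) = sqrt(mean(x^2) + eps), then apply the gain.
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * inv_rms)

# Usage: normalize a batch of hidden states of width 4096.
x = torch.randn(2, 16, 4096)
y = RMSNorm(4096)(x)
```

Compared with standard LayerNorm, RMSNorm drops the mean-subtraction and bias terms, reducing per-token computation while remaining stable in training, which is one reason it appears in many recent LLaMA-style models.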