Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ tags:
 </div>
 
 <p align="center">
-  <a href="
+  <a href="https://shengdinghu.notion.site/MiniCPM-c805a17c5c8046398914e47f0542095a?pvs=4" target="_blank">MiniCPM 技术报告</a><a href="https://shengdinghu.notion.site/MiniCPM-Unveiling-the-Potential-of-End-side-Large-Language-Models-d4d3a8c426424654a4e80e42a711cb20?pvs=4" target="_blank"> Technical Report</a> |
   <a href="https://github.com/OpenBMB/OmniLMM/" target="_blank">OmniLMM 多模态模型 Multi-modal Model</a> |
   <a href="https://luca.cn/" target="_blank">CPM-C 千亿模型试用 ~100B Model Trial </a>
 </p>
@@ -51,15 +51,24 @@ We release all model parameters for research and limited commercial use. We also
 - The INT4 quantized version **MiniCPM-2B-SFT/DPO-Int4** based on MiniCPM-2B-SFT/DPO
 - Mobile phone application based on MLC-LLM and LLMFarm. Both the language model and the multimodal model can run inference on smartphones.
 
+### 评测结果 Evaluation Results
 
+详细的评测结果位于[github仓库](https://github.com/OpenBMB/MiniCPM?tab=readme-ov-file#%E8%AF%84%E6%B5%8B%E7%BB%93%E6%9E%9C)
 
-
+Detailed evaluation results are in the [github repo](https://github.com/OpenBMB/MiniCPM/blob/main/README-en.md#evaluation-results)
+
+注意:我们发现使用Huggingface生成质量略差于vLLM,因此推荐使用vLLM进行测试。我们正在排查原因。
+
+Note: We found that generation quality with Huggingface is slightly lower than with vLLM, so we recommend benchmarking with vLLM.
+We are investigating the cause.
+
+### 局限性 Limitations
 
 - 受限于模型规模,模型可能出现幻觉性问题。其中由于DPO模型生成的回复内容更长,更容易出现幻觉。我们也将持续进行MiniCPM模型的迭代改进;
 - 为了保证在学术研究用途上模型的通用性,我们未对模型进行任何身份认同训练。同时由于我们用ShareGPT开源语料作为部分训练数据,模型可能会输出类似GPT系列模型的身份认同信息;
 - 受限于模型规模,模型的输出受到提示词(prompt)的影响较大,可能多次尝试产生不一致的结果;
 - 受限于模型容量,模型的知识记忆较不准确,后续我们将结合RAG方法来增强模型的知识记忆能力。
-
+
 - Due to limitations in model size, the model may experience hallucination issues. As the DPO model tends to generate longer responses, hallucinations are more likely to occur. We will continue to iterate on and improve the MiniCPM model.
 - To ensure the universality of the model for academic research purposes, we did not conduct any identity training on the model. Meanwhile, as we use the ShareGPT open-source corpus as part of the training data, the model may output identity information similar to the GPT series models.
 - Due to the limitation of model size, the output of the model is strongly influenced by the prompt, so multiple attempts may produce inconsistent results.
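A minimal sketch of loading the INT4 build named in this hunk via `transformers`. The repo id `openbmb/MiniCPM-2B-dpo-int4`, the `device_map` setting, and the sampling values are assumptions; the `chat()` call mirrors the `print(responds)` usage visible in the next hunk's context.

```python
# Sketch only: repo id is an assumption; the INT4 weights may require
# extra quantization dependencies, and device_map="auto" needs accelerate.
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "openbmb/MiniCPM-2B-dpo-int4"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True, device_map="auto")

# MiniCPM's remote code provides a chat() helper; `responds` matches the
# variable name this README uses elsewhere (see print(responds) below).
responds, history = model.chat(tokenizer, "山东省最高的山是哪座山?", temperature=0.5, top_p=0.8)
print(responds)
```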
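Since the note added above recommends benchmarking with vLLM rather than Huggingface generation, a minimal vLLM sketch along the same lines; the repo id `openbmb/MiniCPM-2B-dpo-fp16` and the sampling values are assumptions, not settings from this README.

```python
# Sketch only: substitute your MiniCPM checkpoint for the assumed repo id.
# trust_remote_code is needed because MiniCPM ships custom model code.
from vllm import LLM, SamplingParams

llm = LLM(model="openbmb/MiniCPM-2B-dpo-fp16", trust_remote_code=True)
params = SamplingParams(temperature=0.8, top_p=0.8, max_tokens=256)

# generate() takes a list of prompts and returns one RequestOutput per prompt.
outputs = llm.generate(["山东省最高的山是哪座山?"], params)
print(outputs[0].outputs[0].text)
```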
@@ -130,8 +139,8 @@ print(responds)
 
 ## 工作引用 Citation
 
-* 如果觉得MiniCPM有助于您的工作,请考虑引用下列[技术报告](
-* Please cite our [techinical report]() if you find our work valuable.
+* 如果觉得MiniCPM有助于您的工作,请考虑引用下列[技术报告](https://shengdinghu.notion.site/MiniCPM-c805a17c5c8046398914e47f0542095a?pvs=4)
+* Please cite our [technical report](https://shengdinghu.notion.site/MiniCPM-Unveiling-the-Potential-of-End-side-Large-Language-Models-d4d3a8c426424654a4e80e42a711cb20?pvs=4) if you find our work valuable.
 
 ```
 @inproceedings{minicpm2024,