Commit 3ce9833 by IvanHU · verified · 1 Parent(s): 5dda598

Create README.md

Files changed (1)
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
+ ---
+ license: mit
+ language:
+ - en
+ pipeline_tag: text-classification
+ library_name: fasttext
+ tags:
+ - code
+ ---
+
+
+ ## Code classifier
+
+ We use `code-classifier` to retrieve code-related content from `fineweb-edu`, `dclm`, ... so that it can be upsampled.
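The retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual pipeline: the label name `__label__code`, the 0.5 threshold, and the helper names `load_fasttext_predictor` and `select_code_docs` are all assumptions introduced here, and the labels the released model emits should be checked before relying on them.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed label name; check which labels the released model actually emits.
CODE_LABEL = "__label__code"

Predictor = Callable[[str], Tuple[str, float]]


def load_fasttext_predictor(model_path: str) -> Predictor:
    """Wrap a fastText model file as a function: text -> (top label, probability)."""
    import fasttext  # imported lazily so the filtering logic runs without it

    model = fasttext.load_model(model_path)

    def predict(text: str) -> Tuple[str, float]:
        # fastText predicts one line at a time, so newlines are flattened first.
        labels, probs = model.predict(text.replace("\n", " "), k=1)
        return labels[0], float(probs[0])

    return predict


def select_code_docs(
    docs: Iterable[str], predict: Predictor, threshold: float = 0.5
) -> List[str]:
    """Keep documents whose top label is the code label with enough confidence."""
    kept = []
    for doc in docs:
        label, prob = predict(doc)
        if label == CODE_LABEL and prob >= threshold:
            kept.append(doc)
    return kept
```

With a locally downloaded model file (the path is hypothetical), usage would look like `select_code_docs(docs, load_fasttext_predictor("path/to/model.bin"))`. Passing the predictor as a plain function keeps the filtering logic testable without the model.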
+
+
+ ## Related resources
+
+ - [Math classifier](https://huggingface.co/yulan-team/math-classifier)
+ - [Code classifier](https://huggingface.co/yulan-team/code-classifier)
+ - [Reasoning classifier](https://huggingface.co/yulan-team/reasoning-classifier)
+
+
+ ---
+
+ ## Contributing
+
+ We welcome contributions of any kind, including reports of model failure cases, feature suggestions, and example contributions. To contribute, please open an [issue](https://github.com/RUC-GSAI/YuLan-Mini/issues).
+
+ ## The Team
+
+ YuLan-Mini is developed and maintained by [AI Box, Renmin University of China](http://aibox.ruc.edu.cn/).
+
+ ## License
+
+ - The code in this repository, the model weights, and optimizer states are released under the [MIT License](./LICENSE).
+ - Policies regarding the use of model weights, intermediate optimizer states, and training data will be announced in future updates.
+ - Limitations: Despite our efforts to mitigate safety concerns and encourage the generation of ethical and lawful text, the probabilistic nature of language models may still lead to unexpected outputs. For instance, responses might contain bias, discrimination, or other harmful content. Please refrain from disseminating such content. We are not liable for any consequences arising from the spread of harmful information.
+
+ ## Citation
+
+ If you find YuLan-Mini helpful for your research or development, please cite [our technical report](https://arxiv.org/abs/2412.17743):
+
+
+ ```
+ @article{hu2024yulan,
+   title={YuLan-Mini: An Open Data-efficient Language Model},
+   author={Hu, Yiwen and Song, Huatong and Deng, Jia and Wang, Jiapeng and Chen, Jie and Zhou, Kun and Zhu, Yutao and Jiang, Jinhao and Dong, Zican and Zhao, Wayne Xin and others},
+   journal={arXiv preprint arXiv:2412.17743},
+   year={2024}
+ }
+ ```