Update README.md
README.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-license:
+license: mit
 datasets:
 - irds/codesearchnet
 - giganticode/java-cmpx-v1
@@ -84,6 +84,10 @@ datasets:
 - seyyedaliayati/solidity-dataset
 - jkhedri/psychology-dataset
 - KonradSzafer/stackoverflow_linux
+- vikp/textbook_quality_programming
+- rombodawg/LosslessMegaCodeTrainingV3_MINI
+- BelleGroup/multiturn_chat_0.8M
+- smangrul/code-chat-assistant-v1
 language:
 - en
 - it
@@ -93,6 +97,9 @@ language:
 - ru
 - ro
 - el
+- ja
+- ch
+- zh
 metrics:
 - accuracy
 - bertscore
@@ -100,6 +107,8 @@ metrics:
 - code_eval
 - character
 - brier_score
+- cer
+- chrf
 tags:
 - code
 - text-generation-inference
@@ -126,7 +135,7 @@ Aiden is a factual language model from Hugging Face, trained on a massive datase
 
 ### Model Specifications
 
-Aiden is a Transformer-based LLM with
+Aiden is a Transformer-based LLM with 175B parameters. It is trained on a massive dataset of text and code, including the following:
 
 * Books
 * Code