victormiller committed on
Commit 998f2fd · verified · 1 parent: e928084

Update README.md

Files changed (1)
  1. README.md +30 -3
README.md CHANGED
@@ -1,16 +1,31 @@
  ---
  license: apache-2.0
  ---
- # K2-Chat: a fully-reproducible large language model outperforming Llama 2 70B using 35% less compute

- blurb

  <center><img src="k2_chat_eval_table.png" alt="k2 eval table" /></center>



  <center><img src="k2_chat_table_of_tables.png" alt="k2 big eval table"/></center>

  ## Loading K2-Chat
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -25,4 +40,16 @@ gen_tokens = model.generate(input_ids, do_sample=True, max_new_tokens=128)

  print("-"*20 + "Output for model" + 20 * '-')
  print(tokenizer.batch_decode(gen_tokens)[0])
- ```
  ---
  license: apache-2.0
  ---
+ # K2-Chat: a fully-reproducible large language model outperforming Llama 2 70B Chat using 35% less compute

+ K2-Chat is finetuned from [K2-65B](https://huggingface.co/LLM360/K2). It outperforms Llama 2-70B-Chat on all evaluations conducted and also outperforms Llama 3-70B-Instruct on coding tasks.

  <center><img src="k2_chat_eval_table.png" alt="k2 eval table" /></center>

+ ## LLM360 Model Performance and Evaluation Collection
+ The LLM360 Performance and Evaluation Collection is a robust evaluation set consisting of general and domain-specific evaluations that assess model knowledge and function.

+ Evaluations include standard best-practice benchmarks as well as medical, math, and coding knowledge. More about the evaluations can be found here.

  <center><img src="k2_chat_table_of_tables.png" alt="k2 big eval table"/></center>

+ ## Datasets and Mix
+
+ | Subset | # Tokens | Avg. # Queries | Avg. Query Len | Avg. # Replies | Avg. Reply Len |
+ | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
+ | [MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct) | 66,639,699 | 1.00 | 81.53 | 1.00 | 172.78 |
+ | [OpenHermes-2](https://huggingface.co/datasets/teknium/OpenHermes-2.5) | 404,820,694 | 1.01 | 152.38 | 1.01 | 249.12 |
+ | FLAN_3M | 2,346,961,387 | 1.00 | 727.49 | 1.00 | 54.83 |
+ | [Stanford Encyclopedia of Philosophy](https://huggingface.co/datasets/AiresPucrs/stanford-encyclopedia-philosophy) | 786,928 | 1.00 | 219.09 | 1.00 | 166.28 |
+ | [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) | 1,448,898 | 1.00 | 260.82 | 1.00 | 207.47 |
+ | Safety & Alignment Data | 99,976,621 | 1.00 | 126.71 | 1.00 | 373.79 |
+ | Total | 2,920,634,227 | | | | |
+
  ## Loading K2-Chat
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -25,4 +40,16 @@ gen_tokens = model.generate(input_ids, do_sample=True, max_new_tokens=128)

  print("-"*20 + "Output for model" + 20 * '-')
  print(tokenizer.batch_decode(gen_tokens)[0])
+ ```
+
+ ## LLM360 Developer Suite
+ We provide step-by-step finetuning tutorials for tech enthusiasts, AI practitioners, and academic or industry researchers [here](https://llm360.ai/development).
+
+ ## About LLM360
+ LLM360 is an open research lab enabling community-owned AGI through open-source large model research and development.
+
+ It does so by creating standards and tools that advance the bleeding edge of LLM capability and empower knowledge transfer, research, and development.
+
+ We believe in a future where artificial general intelligence (AGI) is created by the community, for the community. Through an open ecosystem of equitable computational resources, high-quality data, and flowing technical knowledge, we can ensure ethical AGI development and universal access for all innovators.
+
+ [Visit us](https://llm360.ai)
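
The token counts in the "Datasets and Mix" table added by this commit can be sanity-checked and turned into mix proportions with a few lines of Python. This is only an illustrative sketch: the names `subset_tokens` and `mix_shares` are hypothetical and not part of any LLM360 tooling, and the numbers are copied by hand from the table.

```python
# Token counts copied from the "Datasets and Mix" table in the README diff.
# `subset_tokens` and `mix_shares` are hypothetical names used for illustration.
subset_tokens = {
    "MathInstruct": 66_639_699,
    "OpenHermes-2": 404_820_694,
    "FLAN_3M": 2_346_961_387,
    "Stanford Encyclopedia of Philosophy": 786_928,
    "TinyStories": 1_448_898,
    "Safety & Alignment Data": 99_976_621,
}


def mix_shares(tokens):
    """Return the total token count and each subset's fraction of that total."""
    total = sum(tokens.values())
    shares = {name: count / total for name, count in tokens.items()}
    return total, shares


total, shares = mix_shares(subset_tokens)
print(f"Total tokens: {total:,}")

# List subsets from largest to smallest share of the finetuning mix.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<40s} {share:7.2%}")
```

The per-subset counts sum to the 2,920,634,227 reported in the table's Total row, and FLAN_3M accounts for roughly 80% of the tokens.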