Commit d336471 (verified) by tomasmcm · 1 Parent(s): 53e4634

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -10,12 +10,14 @@ tags:
  - qwen2
  license: apache-2.0
  ---
- # merge
+ # tomasmcm/sky-t1-coder-32b-flash

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

  I wanted to see if it would be possible to improve on [FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview](https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview) and [CoderO1-DeepSeekR1-Coder-32B-Preview](https://huggingface.co/RDson/CoderO1-DeepSeekR1-Coder-32B-Preview) by using [Sky-T1-32B-Flash](https://huggingface.co/NovaSky-AI/Sky-T1-32B-Flash) as the reasoning model that is merged with [Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) instead of DeepSeek-R1-Distill-Qwen-32B. The idea is to have a strong coder model that can reason but without very long reasoning chains (hence using the Flash model).

+ GGUF files available at [mradermacher/sky-t1-coder-32b-flash-GGUF](https://huggingface.co/mradermacher/sky-t1-coder-32b-flash-GGUF) (thank you!)
+
  ## Merge Details
  ### Merge Method