prithivMLmods committed on
Commit fc7cfd9 · verified · 1 Parent(s): 67fab0a

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -14,6 +14,14 @@ tags:
  ---
  # **FastThink-1.5B-Tiny**
 
+ FastThink-0.5B-Tiny is a reasoning-focused model based on Qwen2.5. We have released a range of base language models and instruction-tuned language models, spanning from 0.5 billion to 72 billion parameters. Qwen2.5 introduces the following improvements over Qwen2:
+
+ - Significantly enhanced knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains.
+ - Major improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON. It is more resilient to diverse system prompts, enhancing role-play implementation and condition-setting for chatbots.
+ - Long-context support for up to 128K tokens and the ability to generate outputs up to 8K tokens.
+ - Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
+
+ **Architecture**: Transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias, and tied word embeddings.
 
  # **Dataset Preparation**
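
Two of the components named in the Architecture line, RMSNorm and SwiGLU, can be illustrated with a minimal pure-Python sketch. This is not taken from the model's code: the function names, the list-based vectors and matrices, and the `eps` default are assumptions chosen for illustration, and a real implementation would use a tensor library.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then scale per feature."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def swiglu_mlp(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down(SiLU(gate(x)) * up(x)), without biases."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)
```

With unit weights, `rms_norm` returns a vector whose mean square is approximately 1, which is the property the normalization is meant to provide.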