prithivMLmods/FastThink-0.5B-Tiny

prithivMLmods committed on Jan 24

Commit fc7cfd9 · verified · 1 Parent(s): 67fab0a

Update README.md

Files changed (1)

README.md +8 -0

@@ -14,6 +14,14 @@ tags:
 ---
 # **FastThink-1.5B-Tiny**
+FastThink-0.5B-Tiny is a reasoning-focused model based on Qwen2.5. The Qwen2.5 series includes base and instruction-tuned language models spanning from 0.5 billion to 72 billion parameters. Qwen2.5 introduces the following improvements over Qwen2:
+- Significantly enhanced knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains.
+- Major improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON. Qwen2.5 is also more resilient to diverse system prompts, which improves role-play implementation and condition-setting for chatbots.
+- Long-context support for up to 128K tokens and the ability to generate outputs up to 8K tokens.
+- Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
+**Architecture**: Transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias, and tied word embeddings.
 # **Dataset Preparation**
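
The architecture line added in this commit names RMSNorm and SwiGLU. As a rough illustration of what those two components compute, here is a minimal pure-Python sketch of their standard formulations. It is not taken from the model's code: the function names, the `eps` default, and the toy dimensions are all assumptions chosen for readability.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a per-channel weight.

    eps is an assumed small constant for numerical stability.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (Swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward block: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


# Toy example with hypothetical 2-dimensional inputs and weights.
normed = rms_norm([3.0, 4.0], [1.0, 1.0])
out = swiglu_ffn(normed, [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]])
```

After `rms_norm` with unit weights, the vector's mean square is approximately 1, which is the property the normalization is meant to enforce before each sublayer.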