prithivMLmods/Bellatrix-Tiny-3B-R1

prithivMLmods committed 1 day ago

Commit 7813dfb · verified · 1 Parent(s): f539049

Update README.md

Files changed (1)

README.md +2 -1

@@ -9,6 +9,7 @@ tags:
 - trl
 - llama3.2
 - Reinforcement learning
+- SFT
 ---
 # **Bellatrix-Tiny-3B-R1**
@@ -65,4 +66,4 @@ Despite its capabilities, Bellatrix has some limitations:
 2. **Dependence on Training Data**: It is only as good as the quality and diversity of its training data, which may lead to biases or inaccuracies.
 3. **Computational Resources**: The model’s optimized transformer architecture can be resource-intensive, requiring significant computational power for fine-tuning and inference.
 4. **Language Coverage**: While multilingual, some languages or dialects may have limited support or lower performance compared to widely used ones.
-5. **Real-World Contexts**: It may struggle with understanding nuanced or ambiguous real-world scenarios not covered during training.
+5. **Real-World Contexts**: It may struggle with understanding nuanced or ambiguous real-world scenarios not covered during training.