---
language:
- tr
pipeline_tag: text-generation
tags:
- llama
- smollm
- turkish
- text-generation-inference
---
|
|
|
# smollm-turkish-base

A Turkish base language model whose training was stopped early.
|
|
|
## Model Description

- **Model Type:** LLaMA architecture
- **Training Framework:** Nanotron
- **Base Tokenizer:** bonur/gpt2-turkish-tokenizer
- **Context Length:** 4096
- **Vocab Size:** 52000
- **Hidden Size:** 576
- **Number of Layers:** 30
- **Number of Attention Heads:** 9
- **Number of Key/Value Heads:** 3
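The hyperparameters above can be expressed as a `transformers` `LlamaConfig`, which may be useful for inspecting or re-instantiating the architecture. This is a sketch based only on the values listed in this card; fields not stated here (e.g. `intermediate_size`) are left at library defaults and may differ from the actual checkpoint.

```python
from transformers import LlamaConfig

# Sketch of a config matching the card's listed hyperparameters.
# Unlisted fields (intermediate_size, rope settings, etc.) are assumptions
# left at LlamaConfig defaults and may not match the real checkpoint.
config = LlamaConfig(
    vocab_size=52000,
    hidden_size=576,
    num_hidden_layers=30,
    num_attention_heads=9,
    num_key_value_heads=3,  # grouped-query attention: 3 query heads share each KV head
    max_position_embeddings=4096,
)

# Derived head dimension: 576 / 9 = 64
print(config.hidden_size // config.num_attention_heads)
```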
|
|
|
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bonur/smollm-turkish-base")
tokenizer = AutoTokenizer.from_pretrained("bonur/smollm-turkish-base")

# GPT-2-style tokenizers often ship without a pad token; fall back to EOS
# so that `padding=True` does not raise an error.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

text = "Your prompt here"
inputs = tokenizer(text, return_tensors="pt", padding=True)
outputs = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
```
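For context on the sampling arguments used above: `top_p=0.9` restricts sampling to the smallest set of tokens whose cumulative probability exceeds 0.9, after the logits are divided by `temperature`. A minimal, self-contained sketch of this nucleus filtering (illustrative only; not the code `generate` runs internally):

```python
import math

def top_p_filter(logits, top_p=0.9, temperature=0.7):
    """Return indices of the smallest set of tokens whose cumulative
    probability exceeds top_p, after temperature scaling (illustration only)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Walk token indices in order of decreasing probability.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    return kept

print(top_p_filter([2.0, 1.0, 0.5, -1.0]))  # → [0, 1]
```

With these toy logits, the two most probable tokens already exceed 90% cumulative mass, so the low-probability tail is never sampled.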