# Model Card for Llama-3.3-Argunaut-1-70B-SFT
This model is a fine-tuned version of meta-llama/Llama-3.3-70B-Instruct. It has been trained using TRL.
## Quick start
```python
from transformers import pipeline

# Load the model as a chat-style text-generation pipeline
question = "Are you familiar with Argdown syntax? What's its purpose?"
generator = pipeline("text-generation", model="DebateLabKIT/Llama-3.3-Argunaut-1-70B-SFT", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
## SFT dataset mixture
| Dataset | Weight (examples) | Weight (tokens) |
|---|---|---|
| DebateLabKIT/deepa2-conversations | 25% | 49% |
| DebateLabKIT/deep-argmap-conversations | 25% | 18% |
| allenai/tulu-3-sft-mixture | 50% | 33% |
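The mixing code is not part of this card; below is a minimal sketch of how such an example-level mixture could be assembled with Hugging Face `datasets`' `interleave_datasets`, assuming the weights in the table above and a `train` split for each source (split names and seed are illustrative, not the actual training setup).

```python
from datasets import load_dataset, interleave_datasets

# Illustrative only: sample examples from the three sources with the 25/25/50 example weights above.
deepa2 = load_dataset("DebateLabKIT/deepa2-conversations", split="train")
argmap = load_dataset("DebateLabKIT/deep-argmap-conversations", split="train")
tulu = load_dataset("allenai/tulu-3-sft-mixture", split="train")

mixture = interleave_datasets(
    [deepa2, argmap, tulu],
    probabilities=[0.25, 0.25, 0.50],  # example-level weights from the table
    seed=42,                           # illustrative seed
    stopping_strategy="all_exhausted",
)
print(mixture)
```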
## Training procedure
Trained with SFT on 1M examples for 1 epoch, with:

- context length 8196
- packing (TRL implementation)
- Spectrum (top 30 percent)

A code sketch of this setup follows the training parameters below.
```yaml
# Training parameters
num_train_epochs: 1
per_device_train_batch_size: 2
gradient_accumulation_steps: 8
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
learning_rate: 2.0e-6  # following _Tülu 3_ recipe
lr_scheduler_type: cosine
warmup_ratio: 0.1
```
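These settings map onto TRL's `SFTTrainer` roughly as sketched below. This is a hedged reconstruction, not the actual training script: `mixture` refers to the dataset sketch above, `spectrum_params` stands for the parameter-name prefixes selected by the external Spectrum tool (top 30 percent), and `output_dir`/`bf16` are assumptions.

```python
from trl import SFTConfig, SFTTrainer

# Hypothetical inputs: `mixture` (see dataset sketch above) and `spectrum_params`,
# the list of parameter-name prefixes selected by Spectrum (top 30 percent).
config = SFTConfig(
    output_dir="llama-3.3-argunaut-1-70b-sft",  # assumption
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
    learning_rate=2.0e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_seq_length=8196,
    packing=True,   # TRL's packing implementation
    bf16=True,      # assumption: bf16 on H100s
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.3-70B-Instruct",
    args=config,
    train_dataset=mixture,
)

# Spectrum-style selective fine-tuning: train only the selected parameters.
for name, param in trainer.model.named_parameters():
    param.requires_grad = any(name.startswith(prefix) for prefix in spectrum_params)

trainer.train()
```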
Hardware: 4 x H100 GPUs.
This work was performed on the HoreKa supercomputer funded by the Ministry of Science, Research and the Arts Baden-Württemberg and by the Federal Ministry of Education and Research.
## Framework versions
- TRL: 0.12.1
- Transformers: 4.46.3
- PyTorch: 2.4.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3
## Credits
This work wouldn't be possible without all the great contributions from the open LLM community. Thank you! Special kudos go to:
- @philschmid for his latest fine-tuning boilerplate
- @lvwerra, @lewtun et al. for building and maintaining trl
- @cognitivecomputations for sharing spectrum
- @allenai for the Tülu recipe and artifacts