Trained with Hugging Face Accelerate.
GPU hours: ~3 hours on an NVIDIA A100
GPTR was trained using the MyLLM framework (by Attention Signs):
--==MyLLM==--
Full SFT fine-tuning configuration:
[model]
model_name_or_path = "yandex/YandexGPT-5-Lite-8B-pretrain"
[datasets]
dataset = "attn-signs/gromov-0"
conversation_field = "conversation"
generate_eval_examples = false
evaluation_strategy = "steps"
eval_steps = 100
dataloader_num_workers = 2
remove_unused_columns = true
test_size = 0.05
[run]
save_strategy = "steps"
save_steps = 300
save_total_limit = 3
run_name = "sft-gptr-8-run2"
report_to = "wandb"
logging_first_step = true
logging_steps = 1
output_dir = "models/attn-signs-gptr-8-run2"
project_name = "sft-gptr"
[training]
train_only_on_completions = true
per_device_train_batch_size = 1
per_device_eval_batch_size = 1
num_train_epochs = 3
learning_rate = 0.000009
max_seq_length = 8192
gradient_accumulation_steps = 8
gradient_checkpointing = true
warmup_steps = 10
bf16 = true
seed = 42
use_peft = false
[fusion]
attn_implementation = "flash_attention_2"
[tokenizer]
assistant_message_template = "<s>assistant\n"
eos_token = "</s>"
pad_token = "<unk>"
chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<s>' + message['role'] + '\n' + message['content'] + '</s>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<s>assistant\n' }}{% endif %}"
force_chat_template = true
added_special_tokens = [
    "<think>",
    "</think>"
]
system_prompt = """
[MODE: Reflection]
"""
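To make the prompt layout concrete, here is a minimal plain-Python sketch that mirrors the Jinja chat_template from the config above. The helper name render_chat and the sample user message are hypothetical and exist only for illustration; in practice the tokenizer's apply_chat_template renders the template, and the tokenizer may add further tokens (such as a BOS token) when the text is encoded.

```python
def render_chat(messages, add_generation_prompt=False):
    """Plain-Python mirror of the chat_template string in the [tokenizer] section."""
    parts = []
    for message in messages:
        # Each turn is rendered as: <s>{role}\n{content}</s>\n
        parts.append("<s>" + message["role"] + "\n" + message["content"] + "</s>" + "\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        parts.append("<s>assistant\n")
    return "".join(parts)


# Hypothetical sample conversation (not taken from the training data)
messages = [
    {"role": "system", "content": "[MODE: Reflection]"},
    {"role": "user", "content": "2 + 2 = ?"},
]
prompt = render_chat(messages, add_generation_prompt=True)
print(prompt)
```

The trailing assistant header matches assistant_message_template in the config, which is presumably how the framework locates completions when train_only_on_completions is enabled.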
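import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
repo = 'attn-signs/GPTR-8-base'
model = AutoModelForCausalLM.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained(repo)
# Use the GPU when one is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model.to(device)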
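# English translation of the prompt: "The equations x**2 + 2019ax + b = 0 and
# x**2 + 2019bx + a = 0 share one common root. What can this root be, given that a != b?"
user_prompt = '''
У уравнений x**2 + 2019ax + b = 0 и x**2 + 2019bx + a = 0 есть один общий корень. Чему может быть равен этот корень, если известно, что a != b?
'''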
system_prompt = "[MODE: Reflection]"
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]
# Render the conversation with the chat template and open the assistant turn
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096
)
# Drop the prompt tokens so only the newly generated tokens are decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
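Since the config registers <think> and </think> as added tokens, it can be useful to separate the reasoning trace from the final answer. The sketch below assumes the decoded text still contains those tags; if they are registered as special tokens, decoding with skip_special_tokens=True may strip them, in which case skip_special_tokens=False would be needed. The helper split_reasoning and the sample string are hypothetical, and the sample is hand-written, not model output.

```python
import re

# Non-greedy match of the first <think>...</think> span, across newlines
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer); reasoning is None when no <think> block is found."""
    match = THINK_RE.search(text)
    if match is None:
        return None, text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched span is treated as the answer
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


# Hand-written sample string for illustration only
sample = "<think>step one, then step two</think>\nfinal answer"
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```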
Base model
yandex/YandexGPT-5-Lite-8B-pretrain