Multiple LoRA Serving
[17:55:49] INFO Starting PEFT model merge script push_2.py:12
INFO Base model: meta-llama/Meta-Llama-3.1-8B-Instruct push_2.py:19
INFO PEFT model: monsterapi/Llama-3_1-8B-Instruct-orca-ORPO push_2.py:20
Loading checkpoint shards: 100%|████████████████████████████████████████| 7/7 [00:00<00:00, 8.55it/s]
[17:55:50] INFO Total parameters in base model: 8030261248 push_2.py:32
INFO PEFT configuration: LoraConfig(peft_type=<PeftType.LORA: 'LORA'>, auto_mapping=None, base_model_name_or_path='meta-llama/Meta-Llama-3.1-8B-Instruct', revision=None, push_2.py:36
task_type='CAUSAL_LM', inference_mode=True, r=32, target_modules={'q_proj', 'v_proj', 'o_proj', 'k_proj'}, exclude_modules=None, lora_alpha=64, lora_dropout=0.0,
fan_in_fan_out=False, bias='none', use_rslora=False, modules_to_save=None, init_lora_weights=True, layers_to_transform=None, layers_pattern=None, rank_pattern={},
alpha_pattern={}, megatron_config=None, megatron_core='megatron.core', loftq_config={}, use_dora=False, layer_replication=None,
runtime_config=LoraRuntimeConfig(ephemeral_gpu_offload=False))
INFO Reduction factor (r): 32 push_2.py:37
INFO Target modules: {'q_proj', 'v_proj', 'o_proj', 'k_proj'} push_2.py:38
[17:55:52] INFO Total parameters in combined model (PEFT + base): 8057524224 push_2.py:42
[17:56:02] INFO PEFT model merged into base model. push_2.py:46
INFO Parameters after merging: 8030261248 (should match base model: 8030261248) push_2.py:50
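The parameter counts in the log are consistent with the adapter configuration. With r=32 on q_proj, k_proj, v_proj, and o_proj, each LoRA pair adds r x (in_features + out_features) parameters. Assuming the standard Llama-3.1-8B shapes (32 decoder layers, hidden size 4096, 1024-dimensional k/v projections under grouped-query attention), each layer adds 32 x (4096 + 4096) + 2 x 32 x (4096 + 1024) + 32 x (4096 + 4096) = 851,968 parameters, and 32 layers x 851,968 = 27,262,976, which is exactly the gap between 8,057,524,224 (base + adapter) and 8,030,261,248 (base). After merging, the low-rank matrices are folded into the base weights, so the count drops back to the base model's.

The following is a minimal sketch of a merge script that would produce logs like these. The model IDs and the sequence of steps are taken from the log above; the function names, logging setup, and the commented push_to_hub target are assumptions for illustration, not the author's actual push_2.py.

# merge_lora.py -- sketch of a LoRA merge script (assumed structure, not the original push_2.py)
import logging

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger(__name__)

BASE_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"
PEFT_ID = "monsterapi/Llama-3_1-8B-Instruct-orca-ORPO"


def n_params(model) -> int:
    # Count every parameter tensor, frozen or trainable.
    return sum(p.numel() for p in model.parameters())


log.info("Starting PEFT model merge script")
log.info("Base model: %s", BASE_ID)
log.info("PEFT model: %s", PEFT_ID)

base = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype=torch.bfloat16)
log.info("Total parameters in base model: %d", n_params(base))

# Attach the LoRA adapter on top of the frozen base weights.
peft_model = PeftModel.from_pretrained(base, PEFT_ID)
cfg = peft_model.peft_config["default"]
log.info("PEFT configuration: %s", cfg)
log.info("Reduction factor (r): %d", cfg.r)
log.info("Target modules: %s", cfg.target_modules)
log.info("Total parameters in combined model (PEFT + base): %d", n_params(peft_model))

# Fold W + (alpha/r) * B @ A back into the base Linear layers; the LoRA
# matrices are removed, so the parameter count returns to the base model's.
merged = peft_model.merge_and_unload()
log.info("PEFT model merged into base model.")
log.info(
    "Parameters after merging: %d (should match base model: %d)",
    n_params(merged),
    8030261248,
)

# merged.push_to_hub("your-username/your-merged-model")  # hypothetical destination repo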