Multiple LoRA Serving
[17:56:13] INFO Starting PEFT model merge script push_3.py:12
INFO Base model: meta-llama/Meta-Llama-3.1-8B-Instruct push_3.py:19
INFO PEFT model: yuriachermann/Not-so-bright-AGI-Llama3.1-8B-UC200k-v2 push_3.py:20
Loading checkpoint shards: 100%|████████████████████████████████████████| 7/7 [00:00<00:00, 11.33it/s]
[17:56:15] INFO Total parameters in base model: 8030261248 push_3.py:32
INFO PEFT configuration: LoraConfig(peft_type=<PeftType.LORA: 'LORA'>, auto_mapping=None, base_model_name_or_path='VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct', push_3.py:36
revision=None, task_type='CAUSAL_LM', inference_mode=True, r=128, target_modules={'v_proj', 'q_proj'}, exclude_modules=None, lora_alpha=32, lora_dropout=0.05,
fan_in_fan_out=False, bias='none', use_rslora=False, modules_to_save=None, init_lora_weights=True, layers_to_transform=None, layers_pattern=None, rank_pattern={},
alpha_pattern={}, megatron_config=None, megatron_core='megatron.core', loftq_config={}, use_dora=False, layer_replication=None,
runtime_config=LoraRuntimeConfig(ephemeral_gpu_offload=False))
INFO Reduction factor (r): 128 push_3.py:37
INFO Target modules: {'v_proj', 'q_proj'} push_3.py:38
[17:56:17] INFO Total parameters in combined model (PEFT + base): 8084787200 push_3.py:42
[17:56:18] INFO PEFT model merged into base model. push_3.py:46
INFO Parameters after merging: 8030261248 (should match base model: 8030261248) push_3.py:50
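For reference, below is a minimal sketch of what a merge script like push_3.py could look like, using the standard transformers and peft APIs. The repo IDs come from the log above; the script structure, the count_params helper, and the commented-out Hub upload are assumptions, not the author's actual code.

```python
# Sketch of a LoRA merge script (assumed structure; repo IDs taken from the log above).
import logging

import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("push_3")

BASE_MODEL = "meta-llama/Meta-Llama-3.1-8B-Instruct"
PEFT_MODEL = "yuriachermann/Not-so-bright-AGI-Llama3.1-8B-UC200k-v2"


def count_params(model) -> int:
    # Hypothetical helper: total parameter count of a module.
    return sum(p.numel() for p in model.parameters())


log.info("Base model: %s", BASE_MODEL)
log.info("PEFT model: %s", PEFT_MODEL)

# Load the base model (this is where the checkpoint shards are loaded).
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16)
log.info("Total parameters in base model: %d", count_params(base))

# Inspect the adapter's LoraConfig (r, lora_alpha, target_modules, ...).
peft_config = PeftConfig.from_pretrained(PEFT_MODEL)
log.info("PEFT configuration: %s", peft_config)
log.info("Reduction factor (r): %d", peft_config.r)
log.info("Target modules: %s", peft_config.target_modules)

# Attach the LoRA adapter; this adds the low-rank A/B matrices on top of the base weights.
model = PeftModel.from_pretrained(base, PEFT_MODEL)
log.info("Total parameters in combined model (PEFT + base): %d", count_params(model))

# Fold W + (lora_alpha / r) * B @ A into the wrapped Linear layers and drop the
# adapter modules, so the merged model returns to the base parameter count.
merged = model.merge_and_unload()
log.info("PEFT model merged into base model.")
log.info("Parameters after merging: %d (should match base model: %d)",
         count_params(merged), count_params(base))

# merged.push_to_hub("your-namespace/your-merged-model")  # hypothetical target repo
```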
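The logged counts are consistent with the reported LoraConfig. With r=128 adapters on q_proj and v_proj in each of Llama-3.1-8B's 32 decoder layers (hidden size 4096, grouped-query key/value projection width 1024), the adapter adds 54,525,952 parameters, which is exactly the gap between 8,084,787,200 and 8,030,261,248; after merge_and_unload the update is folded back into the base weights, so the count returns to the base figure. A quick check, assuming the public Llama-3.1-8B layer shapes:

```python
# LoRA wraps each target Linear with two low-rank factors: A (r x in_features)
# and B (out_features x r). Shapes below are the public Llama-3.1-8B ones.
hidden = 4096    # hidden size (q_proj: 4096 -> 4096)
kv_dim = 1024    # 8 KV heads x head_dim 128 (v_proj: 4096 -> 1024)
layers = 32
r = 128

per_layer = r * (hidden + hidden) + r * (hidden + kv_dim)   # q_proj + v_proj
adapter_params = layers * per_layer

print(adapter_params)                   # 54525952
print(8_030_261_248 + adapter_params)   # 8084787200 -> "combined model" in the log
```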