aleynahukmet committed on
Commit
2716c8c
1 Parent(s): 3c7a0f6

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -25,7 +25,7 @@ from transformers import TextStreamer
  max_seq_length = 2048
  dtype = None
  load_in_4bit = False
- lora_path = "altaidevorg/gemma-women-health-checkpoint-1292"
+ lora_path = "altaidevorg/gemma-women-health-merged"
  use_streamer = False
 
  model, tokenizer = FastLanguageModel.from_pretrained(
@@ -74,7 +74,7 @@ The dataset will be made publicly available through its dedicated repository [al
 
  ## Evaluation Notes
 
- During testing, we observed that the LoRA checkpoint performed better in evaluations compared to the version where the LoRA checkpoint was [merged with the base model](https://huggingface.co/altaidevorg/gemma-women-health-merged). Interestingly, the standalone LoRA checkpoint consistently delivered superior results, though we currently lack a concrete explanation for this phenomenon.
+ During testing, we observed that the [LoRA checkpoint](https://huggingface.co/altaidevorg/gemma-women-health-checkpoint-1292) performed better in evaluations compared to the version where the LoRA checkpoint was merged with the base model. Interestingly, the standalone LoRA checkpoint consistently delivered superior results, though we currently lack a concrete explanation for this phenomenon.
  We are actively investigating the underlying cause, with our best hypothesis being that the merging process may introduce some form of precision loss. Further research is underway to validate this theory and optimize the performance.
 
  ### Disclaimer
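
The precision-loss hypothesis mentioned in the Evaluation Notes can be illustrated with a toy, single-weight sketch. It is not a reproduction of the authors' evaluation, and the README does not state which dtype the merged weights were saved in; half precision is assumed here purely for illustration. The helper names (`to_fp16`, `merged_output`, `adapter_output`) and the numeric values are hypothetical. The idea is that a small LoRA update added to a much larger base weight can be rounded away when the sum is stored in a low-precision format, whereas keeping the base weight and the update separate preserves it.

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", value))[0]


def merged_output(x: float, base_weight: float, lora_delta: float) -> float:
    # Assumption: the merged weight (base + LoRA update) is stored in fp16.
    merged_weight = to_fp16(base_weight + lora_delta)
    return x * merged_weight


def adapter_output(x: float, base_weight: float, lora_delta: float) -> float:
    # Assumption: base weight and LoRA update are each stored in fp16,
    # and their contributions are summed at higher precision.
    return x * to_fp16(base_weight) + x * to_fp16(lora_delta)


# Hypothetical values: a base weight near 1.0 and a much smaller LoRA update.
x, base_weight, lora_delta = 1.0, 1.0, 1e-4

exact = x * (base_weight + lora_delta)
merged_error = abs(merged_output(x, base_weight, lora_delta) - exact)
adapter_error = abs(adapter_output(x, base_weight, lora_delta) - exact)

print(f"merged error:  {merged_error:.3e}")
print(f"adapter error: {adapter_error:.3e}")
```

In this toy case the update is smaller than the spacing between adjacent half-precision values near 1.0, so the merged weight rounds back to the base weight and the update is lost. Whether this effect explains the gap reported above would need to be checked against the actual merge settings.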