Carlos Rosas committed on
Commit ce09740 · verified · 1 Parent(s): 9ceba9b

Update README.md

Files changed (1)
  1. README.md +24 -25
README.md CHANGED
@@ -1,42 +1,41 @@
 
 
  Cassandre-RAG is a fine-tuned llama-3.1-8b model, built for RAG on French administrative documents, with a focus on sources from school administration.

  ## Training

  The model was trained on a H100, using these parameters:

- Training Hyperparameters
-
- Max Steps: 3000
- Learning Rate: 3e-4
- Batch Size: 2 per device
- Gradient Accumulation Steps: 4
- Max Sequence Length: 8192
- Weight Decay: 0.001
- Warmup Ratio: 0.03
- LR Scheduler: Linear
- Optimizer: paged_adamw_32bit
-
- LoRA Configuration
-
- LoRA Alpha: 16
- LoRA Dropout: 0.1
- LoRA R: 64
- Target Modules: ["gate_proj", "down_proj", "up_proj", "q_proj", "v_proj", "k_proj", "o_proj"]
-
- Quantization
-
- Quantization: 4-bit
- Quantization Type: nf4
- Compute Dtype: float16

  ## Usage

  Cassandre-RAG uses a custom syntax for parsing sources and generating sourced output.
-
  Each source should be preceded by an ID encapsulated in double asterisks (e.g., **SOURCE_ID**).

  ### Example Usage

  import pandas as pd
  from vllm import LLM, SamplingParams
 
 
+ # Cassandre-RAG
+
  Cassandre-RAG is a fine-tuned llama-3.1-8b model, built for RAG on French administrative documents, with a focus on sources from school administration.

  ## Training

  The model was trained on a H100, using these parameters:

+ ### Training Hyperparameters
+ - Max Steps: 3000
+ - Learning Rate: 3e-4
+ - Batch Size: 2 per device
+ - Gradient Accumulation Steps: 4
+ - Max Sequence Length: 8192
+ - Weight Decay: 0.001
+ - Warmup Ratio: 0.03
+ - LR Scheduler: Linear
+ - Optimizer: paged_adamw_32bit
+
+ ### LoRA Configuration
+ - LoRA Alpha: 16
+ - LoRA Dropout: 0.1
+ - LoRA R: 64
+ - Target Modules: ["gate_proj", "down_proj", "up_proj", "q_proj", "v_proj", "k_proj", "o_proj"]
+
+ ### Quantization
+ - Quantization: 4-bit
+ - Quantization Type: nf4
+ - Compute Dtype: float16
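
Reviewer note on the parameters listed above (not part of the diff): the README does not say which training stack was used. The sketch below assumes Hugging Face-style keyword names (`transformers.TrainingArguments`, `peft.LoraConfig`, `transformers.BitsAndBytesConfig`) purely for illustration, stores the README values as plain dictionaries, and derives a few quantities that follow from them. The key names and the rounding of the warmup length are assumptions; the values come from the README.

```python
# Values copied from the README. Key names are an ASSUMPTION: they mirror
# Hugging Face keyword arguments, which the README does not confirm.
training_args = {
    "max_steps": 3000,
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "weight_decay": 0.001,
    "warmup_ratio": 0.03,
    "lr_scheduler_type": "linear",
    "optim": "paged_adamw_32bit",
}
max_seq_length = 8192  # "Max Sequence Length" in the README

lora_config = {
    "r": 64,
    "lora_alpha": 16,
    "lora_dropout": 0.1,
    "target_modules": [
        "gate_proj", "down_proj", "up_proj",
        "q_proj", "v_proj", "k_proj", "o_proj",
    ],
}

quantization_config = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_compute_dtype": "float16",
}

# Sequences consumed per optimizer step on one device:
# per-device batch size times gradient accumulation steps.
sequences_per_step = (
    training_args["per_device_train_batch_size"]
    * training_args["gradient_accumulation_steps"]
)

# Upper bound on tokens per optimizer step on one device
# (reached only if every sequence fills the maximum length).
max_tokens_per_step = sequences_per_step * max_seq_length

# Approximate warmup length; the exact rounding depends on the trainer.
warmup_steps = round(training_args["warmup_ratio"] * training_args["max_steps"])

# Standard LoRA scaling factor applied to the low-rank update: alpha / r.
lora_scaling = lora_config["lora_alpha"] / lora_config["r"]

print(sequences_per_step, max_tokens_per_step, warmup_steps, lora_scaling)
```

With a single device this gives 8 sequences per optimizer step. The LoRA scaling of 0.25 means the adapter update is down-weighted relative to a configuration where alpha equals the rank.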
 
 
 
  ## Usage

  Cassandre-RAG uses a custom syntax for parsing sources and generating sourced output.
  Each source should be preceded by an ID encapsulated in double asterisks (e.g., **SOURCE_ID**).
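
Reviewer note on the source syntax described above (not part of the diff): a minimal sketch of how a set of sources might be serialized with each ID wrapped in double asterisks. The helper name `format_sources`, the sample IDs and texts, and the newline layout between the ID and the source text are all hypothetical; this excerpt does not show the full prompt template the model expects.

```python
def format_sources(sources: dict[str, str]) -> str:
    """Prefix each source text with its ID wrapped in double asterisks.

    The layout (ID on its own line, blank line between sources) is an
    assumption made for illustration, not the model's documented template.
    """
    blocks = []
    for source_id, text in sources.items():
        blocks.append(f"**{source_id}**\n{text.strip()}")
    return "\n\n".join(blocks)


# Hypothetical sources, invented for this example.
example_sources = {
    "doc_1": "Les inscriptions au collège se font auprès du secrétariat.",
    "doc_2": "Le dossier doit être déposé avant la date limite.",
}

formatted = format_sources(example_sources)
print(formatted)
```

The resulting string could then be placed in the prompt ahead of the user's question, subject to whatever template the full README specifies.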

  ### Example Usage

+ ```python
  import pandas as pd
  from vllm import LLM, SamplingParams