Carlos Rosas committed
Commit 9ceba9b
Parent: 4dbd20e

Update README.md

Files changed (1): README.md (+75, -4)
Cassandre-RAG is a fine-tuned llama-3.1-8b model, built for RAG on French administrative documents, with a focus on sources from school administration.

## Training

The model was trained on an H100, using these parameters:

### Training Hyperparameters

- Max Steps: 3000
- Learning Rate: 3e-4
- Batch Size: 2 per device
- Gradient Accumulation Steps: 4
- Max Sequence Length: 8192
- Weight Decay: 0.001
- Warmup Ratio: 0.03
- LR Scheduler: Linear
- Optimizer: paged_adamw_32bit

### LoRA Configuration

- LoRA Alpha: 16
- LoRA Dropout: 0.1
- LoRA R: 64
- Target Modules: `["gate_proj", "down_proj", "up_proj", "q_proj", "v_proj", "k_proj", "o_proj"]`

### Quantization

- Quantization: 4-bit
- Quantization Type: nf4
- Compute Dtype: float16

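The settings above correspond to a standard QLoRA-style fine-tune. The following is a minimal sketch of an equivalent configuration, assuming the usual `transformers`/`peft`/`bitsandbytes` stack; it illustrates the listed values and is not the actual training script (`output_dir` is a hypothetical path).

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit nf4 quantization with float16 compute, per the Quantization list
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Adapter settings, per the LoRA Configuration list
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["gate_proj", "down_proj", "up_proj",
                    "q_proj", "v_proj", "k_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Optimizer and schedule, per the Training Hyperparameters list
# (the max sequence length of 8192 would be passed to the trainer,
# e.g. trl's SFTTrainer, rather than to TrainingArguments)
training_args = TrainingArguments(
    output_dir="./cassandre-rag",  # hypothetical output path
    max_steps=3000,
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    weight_decay=0.001,
    warmup_ratio=0.03,
    lr_scheduler_type="linear",
    optim="paged_adamw_32bit",
)
```
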
## Usage

Cassandre-RAG uses a custom syntax for parsing sources and generating sourced output.

Each source should be preceded by an ID encapsulated in double asterisks (e.g., `**SOURCE_ID**`), as in the sketch below.
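
Assembled this way, a full prompt has the following shape (illustrative; it mirrors the `prepare_prompt` helper in the example below):

```
### Query ###
Quelles sont les procédures pour inscrire un enfant à l'école primaire?

### Source ###
**SOURCE_001**
L'inscription à l'école primaire se fait généralement à la mairie...

**SOURCE_002**
Les documents nécessaires pour l'inscription scolaire incluent...

### Analysis ###
```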

### Example Usage

```python
from vllm import LLM, SamplingParams

# Load the model
model_name = "PleIAs/Cassandre-RAG"
llm = LLM(model_name, max_model_len=8128)

# Set sampling parameters; #END# is the model's stop marker
sampling_params = SamplingParams(
    temperature=0.7,
    top_p=0.95,
    max_tokens=3000,
    presence_penalty=1.2,
    stop=["#END#"],
)

# Build a prompt in the model's custom syntax: each source is
# prefixed with its **ID**, followed by an empty Analysis section
# for the model to fill in
def prepare_prompt(query, sources):
    sources_text = "\n\n".join([f"**{src_id}**\n{content}" for src_id, content in sources])
    return f"### Query ###\n{query}\n\n### Source ###\n{sources_text}\n\n### Analysis ###\n"

# Example query and sources (French: "What are the procedures for
# enrolling a child in primary school?")
query = "Quelles sont les procédures pour inscrire un enfant à l'école primaire?"
sources = [
    ("SOURCE_001", "L'inscription à l'école primaire se fait généralement à la mairie..."),
    ("SOURCE_002", "Les documents nécessaires pour l'inscription scolaire incluent..."),
]

# Prepare the prompt
prompt = prepare_prompt(query, sources)

# Generate the response
outputs = llm.generate([prompt], sampling_params)
generated_text = outputs[0].outputs[0].text

print("Query:", query)
print("\nGenerated Response:")
print(generated_text)
```
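
Because the model is trained to cite sources by their IDs, the generated analysis can be post-processed to recover which sources it relied on. A minimal sketch, assuming citations appear in the same `**SOURCE_ID**` form used in the prompt (the `extract_cited_sources` helper is hypothetical):

```python
import re

def extract_cited_sources(text):
    # Collect unique IDs cited as **SOURCE_ID** in the generated analysis
    return sorted(set(re.findall(r"\*\*([A-Za-z0-9_]+)\*\*", text)))

print("Cited sources:", extract_cited_sources(generated_text))
```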