PleIAs/Cassandre-RAG

Carlos Rosas committed on Sep 24, 2024

Commit ce09740 · verified · 1 Parent(s): 9ceba9b

Update README.md

Files changed (1)

README.md +24 -25

@@ -1,42 +1,41 @@
+# Cassandre-RAG
 Cassandre-RAG is a fine-tuned llama-3.1-8b model, built for RAG on French administrative documents, with a focus on sources from school administration.
 ## Training
 The model was trained on a H100, using these parameters:
-Training Hyperparameters
-Max Steps: 3000
-Learning Rate: 3e-4
-Batch Size: 2 per device
-Gradient Accumulation Steps: 4
-Max Sequence Length: 8192
-Weight Decay: 0.001
-Warmup Ratio: 0.03
-LR Scheduler: Linear
-Optimizer: paged_adamw_32bit
-LoRA Configuration
-LoRA Alpha: 16
-LoRA Dropout: 0.1
-LoRA R: 64
-Target Modules: ["gate_proj", "down_proj", "up_proj", "q_proj", "v_proj", "k_proj", "o_proj"]
-Quantization
-Quantization: 4-bit
-Quantization Type: nf4
-Compute Dtype: float16
+### Training Hyperparameters
+- Max Steps: 3000
+- Learning Rate: 3e-4
+- Batch Size: 2 per device
+- Gradient Accumulation Steps: 4
+- Max Sequence Length: 8192
+- Weight Decay: 0.001
+- Warmup Ratio: 0.03
+- LR Scheduler: Linear
+- Optimizer: paged_adamw_32bit
+### LoRA Configuration
+- LoRA Alpha: 16
+- LoRA Dropout: 0.1
+- LoRA R: 64
+- Target Modules: ["gate_proj", "down_proj", "up_proj", "q_proj", "v_proj", "k_proj", "o_proj"]
+### Quantization
+- Quantization: 4-bit
+- Quantization Type: nf4
+- Compute Dtype: float16
 ## Usage
 Cassandre-RAG uses a custom syntax for parsing sources and generating sourced output.
 Each source should be preceded by an ID encapsulated in double asterisks (e.g., **SOURCE_ID**).
 ### Example Usage
+```python
 import pandas as pd
 from vllm import LLM, SamplingParams