Update README.md

This is a model that applies LLM2Vec to Llama2. Only the PEFT Adapter is distributed.

- **Repository:** https://github.com/McGill-NLP/llm2vec
- **Paper:** https://arxiv.org/abs/2404.05961

# Usage

Please see the [original LLM2Vec repo](https://huggingface.co/McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp#usage).
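
As a quick orientation only (the linked usage section is authoritative), the sketch below shows how an LLM2Vec PEFT adapter like this one is typically loaded and used for encoding. The adapter repo id is a placeholder, and the exact loading steps for this model may differ.

```python
# Minimal sketch, not taken from this model card: load the bidirectional
# Llama-2 base used by LLM2Vec, attach a PEFT adapter, and encode text.
# "<this-adapter-repo-id>" is a placeholder for this repository's id.
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer
from peft import PeftModel
from llm2vec import LLM2Vec

base_id = "McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp"
adapter_id = "<this-adapter-repo-id>"

tokenizer = AutoTokenizer.from_pretrained(base_id)
config = AutoConfig.from_pretrained(base_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    base_id, trust_remote_code=True, config=config, torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(model, adapter_id)  # the distributed adapter

# Wrapper that handles bidirectional attention, pooling, and batching
l2v = LLM2Vec(model, tokenizer, pooling_mode="mean", max_length=512)
embeddings = l2v.encode(["LLM2Vec turns decoder-only LLMs into text encoders."])
```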

# Training Details

## Training Data

- [wikitext](https://huggingface.co/datasets/Salesforce/wikitext)

## Training Hyperparameters

- batch_size: 64
- gradient_accumulation_steps: 1
- max_seq_length: 512
- bf16: true
- gradient_checkpointing: true
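
For orientation only, the values above map roughly onto Hugging Face `TrainingArguments` as sketched below; the actual training config is not reproduced here, and treating batch_size as a per-device value is an assumption.

```python
# Hypothetical mapping of the hyperparameters listed above onto
# transformers.TrainingArguments; the adapter's real training config may differ.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mntp-llama2-wikitext",   # placeholder output path
    per_device_train_batch_size=64,      # "batch_size: 64" (assumed per device)
    gradient_accumulation_steps=1,
    bf16=True,
    gradient_checkpointing=True,
)
# max_seq_length: 512 would be applied when tokenizing the wikitext data,
# not via TrainingArguments.
```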

## Accelerator Settings

- deepspeed_config:
  - gradient_accumulation_steps: 1
  - gradient_clipping: 1.0
- use_cpu: false
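
These look like keys from an `accelerate`/DeepSpeed launch config. As a minimal sketch, and only as an assumption about how the settings were supplied, the same values can be expressed through Accelerate's Python API; it expects to run under `accelerate launch` with DeepSpeed installed.

```python
# Sketch only: the accelerator settings above expressed via Hugging Face
# Accelerate instead of a YAML config file; run under `accelerate launch`
# with the deepspeed package installed.
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

deepspeed_plugin = DeepSpeedPlugin(
    gradient_accumulation_steps=1,
    gradient_clipping=1.0,
)
accelerator = Accelerator(
    mixed_precision="bf16",            # matches bf16: true above
    deepspeed_plugin=deepspeed_plugin,
    cpu=False,                         # use_cpu: false
)
```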

## Framework versions

- Python: 3.12.3
- PEFT 0.11.1