Update README.md

This is a model that applies LLM2Vec to Llama2. Only the PEFT Adapter is distributed.

- **Repository:** https://github.com/McGill-NLP/llm2vec
- **Paper:** https://arxiv.org/abs/2404.05961

# Usage

Please see the [original LLM2Vec repo](https://huggingface.co/McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp#usage).
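
As a quick orientation only (the linked usage section is authoritative), the sketch below shows how an LLM2Vec PEFT adapter like this one is typically loaded and used for encoding. The adapter repo id is a placeholder, and the exact loading steps for this model may differ.

```python
# Minimal sketch, not taken from this model card: load the bidirectional
# Llama-2 base used by LLM2Vec, attach a PEFT adapter, and encode text.
# "<this-adapter-repo-id>" is a placeholder for this repository's id.
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer
from peft import PeftModel
from llm2vec import LLM2Vec

base_id = "McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp"
adapter_id = "<this-adapter-repo-id>"

tokenizer = AutoTokenizer.from_pretrained(base_id)
config = AutoConfig.from_pretrained(base_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    base_id, trust_remote_code=True, config=config, torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(model, adapter_id)  # the distributed adapter

# Wrapper that handles bidirectional attention, pooling, and batching
l2v = LLM2Vec(model, tokenizer, pooling_mode="mean", max_length=512)
embeddings = l2v.encode(["LLM2Vec turns decoder-only LLMs into text encoders."])
```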

# Training Details

## Training Data

- [wikitext](https://huggingface.co/datasets/Salesforce/wikitext)

## Training Hyperparameters

- batch_size: 64
- gradient_accumulation_steps: 1
- max_seq_length: 512
- bf16: true
- gradient_checkpointing: true
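
For orientation only, the values above map roughly onto Hugging Face `TrainingArguments` as sketched below; the actual training config is not reproduced here, and treating batch_size as a per-device value is an assumption.

```python
# Hypothetical mapping of the hyperparameters listed above onto
# transformers.TrainingArguments; the adapter's real training config may differ.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mntp-llama2-wikitext",   # placeholder output path
    per_device_train_batch_size=64,      # "batch_size: 64" (assumed per device)
    gradient_accumulation_steps=1,
    bf16=True,
    gradient_checkpointing=True,
)
# max_seq_length: 512 would be applied when tokenizing the wikitext data,
# not via TrainingArguments.
```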

## Accelerator Settings

- deepspeed_config:
  - gradient_accumulation_steps: 1
  - gradient_clipping: 1.0
- use_cpu: false
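
These look like keys from an `accelerate`/DeepSpeed launch config. As a minimal sketch, and only as an assumption about how the settings were supplied, the same values can be expressed through Accelerate's Python API; it expects to run under `accelerate launch` with DeepSpeed installed.

```python
# Sketch only: the accelerator settings above expressed via Hugging Face
# Accelerate instead of a YAML config file; run under `accelerate launch`
# with the deepspeed package installed.
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

deepspeed_plugin = DeepSpeedPlugin(
    gradient_accumulation_steps=1,
    gradient_clipping=1.0,
)
accelerator = Accelerator(
    mixed_precision="bf16",            # matches bf16: true above
    deepspeed_plugin=deepspeed_plugin,
    cpu=False,                         # use_cpu: false
)
```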

## Framework versions

- Python: 3.12.3
- PEFT 0.11.1