Update README.md
README.md
CHANGED
@@ -20,12 +20,18 @@ Parts:
 
 ## Training
 
+Trained using [`qlora.py`](https://github.com/scottlogic-alex/qlora/blob/stepwise/qlora.py) from our [`stepwise`](https://github.com/scottlogic-alex/qlora/tree/stepwise) branch of [qlora](https://github.com/artidoro/qlora).
+Known-good as of commit [`522d86b`](https://github.com/scottlogic-alex/qlora/blob/522d86b447d9fe85e99ece33141fb37c4e947cda/qlora.py).
+
+`python -m qlora --model_name_or_path huggyllama/llama-13b --lora_name_or_path chansung/alpaca-lora-13b --dataset prm800k-solutions --dataset_format prm800k-solutions --bf16 --max_memory_MB 24000 --use_bos_token_in_prompt --truncate_toward_center --source_max_len 184 --target_max_len 998 --gradient_accumulation_steps 4 --per_device_train_batch_size 4 --per_device_eval_batch_size 4 --learning_rate 0.0002 --run_name 13b_alpaca_special_tokens_long --report_to wandb --save_steps 64 --save_total_limit 3 --max_steps 1664 --evaluation_strategy steps --eval_steps 64 --generate_steps 16 --register_process_supervision_tokens`
+
 - [(Private) W&B run](https://wandb.ai/scottlogic/llm-stepwise/runs/nvdyo6aw?workspace=user-birchlabs)
 - [(Public) W&B report](https://api.wandb.ai/links/scottlogic/65wo5d2o)
 
 ## Usage
 
-You can load using [`evaluate.py`](https://github.com/scottlogic-alex/qlora/blob/stepwise/evaluate.py#L209-L278) from our [`stepwise`](https://github.com/scottlogic-alex/qlora/tree/stepwise) branch of [qlora](https://github.com/artidoro/qlora).
+You can load using [`evaluate.py`](https://github.com/scottlogic-alex/qlora/blob/stepwise/evaluate.py#L209-L278) from our [`stepwise`](https://github.com/scottlogic-alex/qlora/tree/stepwise) branch of [qlora](https://github.com/artidoro/qlora).
+Known-good as of commit [`522d86b`](https://github.com/scottlogic-alex/qlora/blob/522d86b447d9fe85e99ece33141fb37c4e947cda/evaluate.py).
 
 Download `embed_tokens.pt` and `lm_head.pt` from [`Birchlabs/llama-13b-stepwise-embeddings`](https://huggingface.co/Birchlabs/llama-13b-stepwise-embeddings/tree/main), then run evaluator like so:
 
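To reproduce the training run described in the diff above, the repository needs to be on the `stepwise` branch at the known-good commit before the command is invoked. A minimal setup sketch follows; the virtual-environment and `pip install -r requirements.txt` steps are assumptions about the environment, not part of this change:

```bash
# Sketch: check out the known-good revision of the stepwise branch.
git clone --branch stepwise https://github.com/scottlogic-alex/qlora.git
cd qlora
git checkout 522d86b447d9fe85e99ece33141fb37c4e947cda

# Assumed environment setup; the branch's own instructions take precedence.
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Then launch the `python -m qlora ...` training command shown in the diff above.
```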
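For the Usage section, the two tensors can be fetched from the Hugging Face Hub before launching `evaluate.py`. A minimal download sketch, assuming the `huggingface_hub` CLI is installed; the evaluator invocation itself is the one the README goes on to show:

```bash
# Assumption: huggingface_hub's CLI is available (pip install -U "huggingface_hub[cli]").
huggingface-cli download Birchlabs/llama-13b-stepwise-embeddings \
  embed_tokens.pt lm_head.pt --local-dir .

# The same files can also be fetched in Python via huggingface_hub.hf_hub_download.
```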