practicaldreamer committed
Commit: f14cfcf · 1 Parent(s): fba1cb8
README.md CHANGED
@@ -1,3 +1,97 @@
+ ---
+ datasets:
+ - practicaldreamer/RPGPT_PublicDomain-ShareGPT
+ ---
+
+ ## Introduction
+ This is my first attempt at training a model for long-form character interaction using the asterisk roleplay format.
+ There are plenty of general instruction/answer models, but most focus on single responses between an AI and a human.
+ My goal for this project is to align the training data more closely with CHARACTER interactions for roleplay.
+
+ This model is trained on a small synthetic dataset of characters interacting through a variety of scenarios.
+ The Characters, Scenarios, and interactions were all generated by GPT-4.
+
+ Intended for research, creative writing, entertainment, DnD campaigns... fun!
+
+ ## Train Summary
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
+ ```
+ duration: ~1.5hrs
+ gpu: 1xA100 80GB
+ epochs: 1.0
+ learning_rate: 3e-5
+ sequence_len: 2048
+ gradient_accumulation_steps: 32
+ wandb: https://wandb.ai/practicaldreamer/rpgpt/runs/b3sznjpz
+ ```
+ *Please see the documentation folder for more information.*
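+
+ As a quick sanity check, the per-update batch size implied by these settings (a sketch only, using values from documentation/hyperparameters.yml; the variable names are just for illustration):
+ ```python
+ # Effective batch size = per-device micro batch * gradient accumulation steps
+ micro_batch_size = 4                # from documentation/hyperparameters.yml
+ gradient_accumulation_steps = 32    # listed in the summary above
+ print(micro_batch_size * gradient_accumulation_steps)  # 128, matching batch_size in hyperparameters.yml
+ ```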
+
+ ## Usage
+ This LoRA was trained for use with **Neko-Institute-of-Science/LLaMA-13B-HF**.
+
+ Please follow the prompt format outlined below. *Hint: If you're not sure what to put for your character description (or you're lazy), just ask ChatGPT to generate it for you! Example:*
+ ```
+ Generate a short character description for Dr. Watson (The Adventures of Sherlock Holmes) that includes gender, age, MBTI and speech accent using 30 words or less.
+ ```
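+
+ A minimal loading sketch with Transformers + PEFT (the adapter path below is a placeholder, and 8-bit loading assumes bitsandbytes/accelerate are available):
+ ```python
+ # Sketch: load the LLaMA-13B base model, then apply this LoRA adapter with PEFT.
+ from transformers import LlamaForCausalLM, LlamaTokenizer
+ from peft import PeftModel
+
+ base = "Neko-Institute-of-Science/LLaMA-13B-HF"
+ adapter = "path/to/this/lora"  # placeholder: local path or Hub id of this adapter repo
+
+ tokenizer = LlamaTokenizer.from_pretrained(base)
+ model = LlamaForCausalLM.from_pretrained(base, load_in_8bit=True, device_map="auto")
+ model = PeftModel.from_pretrained(model, adapter)
+ model.eval()
+ ```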
+
+ ## Prompt Format
+ Context/Memory:
+ ```
+ A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+
+ USER: Write a character roleplay dialogue using asterisk roleplay format based on the following character descriptions and scenario. (Each line in your response must be from the perspective of one of these characters)
+
+ ## Characters
+ <User-Character Name> (<User-Character Universe>):
+ <User-Character Description>
+ <Bot-Character Name> (<Bot-Character Universe>):
+ <Bot-Character Description>
+
+ ## Scenario:
+ <Scenario Description>
+
+ ASSISTANT:
+ ```
+ Turn Template:
+ ```
+ <User-Character Name>: \*<1st person action/sensations/thoughts>\* <Spoken Word> \*<1st person action/sensations/thoughts>\*
+ <Bot-Character Name>: \*<1st person action/sensations/thoughts>\* <Spoken Word> \*<1st person action/sensations/thoughts>\*
+ <User-Character Name>: \*<1st person action/sensations/thoughts>\* <Spoken Word> \*<1st person action/sensations/thoughts>\*
+ <Bot-Character Name>: \*<1st person action/sensations/thoughts>\* <Spoken Word> \*<1st person action/sensations/thoughts>\*
+ ...
+ ```
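+
+ One way to assemble that context block and the turn lines in Python (a sketch only; the helper names and arguments are illustrative, while the strings follow the format above):
+ ```python
+ # Sketch: build the context/memory block and individual turns in the documented format.
+ SYSTEM = ("A chat between a curious user and an artificial intelligence assistant. "
+           "The assistant gives helpful, detailed, and polite answers to the user's questions.")
+
+ def build_context(user_char, user_universe, user_desc, bot_char, bot_universe, bot_desc, scenario):
+     return (
+         f"{SYSTEM}\n\n"
+         "USER: Write a character roleplay dialogue using asterisk roleplay format based on the "
+         "following character descriptions and scenario. (Each line in your response must be from "
+         "the perspective of one of these characters)\n\n"
+         "## Characters\n"
+         f"{user_char} ({user_universe}):\n{user_desc}\n"
+         f"{bot_char} ({bot_universe}):\n{bot_desc}\n\n"
+         "## Scenario:\n"
+         f"{scenario}\n\n"
+         "ASSISTANT:\n"
+     )
+
+ def format_turn(name, action_before, spoken, action_after):
+     # Asterisks are written literally in the prompt; the \* in the template above is markdown escaping.
+     return f"{name}: *{action_before}* {spoken} *{action_after}*"
+ ```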
+
+ ## Example
+ ```
+ A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+
+ USER: Write a character roleplay dialogue using asterisk roleplay format based on the following character descriptions and scenario. (Each line in your response must be from the perspective of one of these characters)
+
+ ## Characters
+ Baloo (The Jungle Book):
+ Male, middle-aged bear, ENFP, primarily American accent with slight Indian inflections. Wise, carefree, and friendly, he teaches Mowgli the ways of the jungle.
+ The Queen of Hearts (Alice's Adventures in Wonderland):
+ Female, middle-aged, ENTJ, British English accent; imperious, temperamental, commanding, but quick to change from anger to warmth - the quintessential capricious monarch.
+
+ ## Scenario:
+ In an enchanted glade deep within Wonderland, Baloo finds himself separated from Mowgli and in the Queen of Hearts' territory. Having heard tales of this mercurial ruler, Baloo is intrigued but cautious. To his surprise, the Queen of Hearts approaches him with a broken heart - she has been betrayed by one of her closest confidants.
+
+ The Queen initially sees Baloo as an opportunity to vent her frustrations and seek validation for her hurt feelings. Baloo, however, quickly recognizes that the Queen's explosive temperament is driving others away and contributing to her own isolation.
+
+ Using his wisdom and patience, Baloo tries to teach the Queen the value of empathy and understanding in forming meaningful relationships. Through their dialogue, they explore their contrasting attitudes toward power, authority, and friendship. The Queen learns that it is not through fear and command that she can find true loyalty, but rather by treating others with kindness.
+
+ In this somber environment where both characters are out of their comfort zones, can Baloo help the Queen mend her own heart by opening up about his experiences with Mowgli? Will the Queen allow herself to be vulnerable enough to learn from the wise bear? And can these two unlikely friends find common ground in this enchanted glade within Wonderland?
+
+ ASSISTANT:
+ Baloo: *stumbles into the enchanted glade, looking around curiously* Well, I must've taken a wrong turn back there. Mowgli! *calls out for Mowgli*
+ The Queen of Hearts: *storms into the glade, clutching a broken heart symbol, face contorted in anger* Off with their heads! Everyone's heads!
+ Baloo: *gently scratches his head and approaches the Queen cautiously* Uh, excuse me, ma'am? I couldn't help but notice that you seem rather... upset.
+ The Queen of Hearts: *sniffs haughtily and scowls at Baloo* And who are you to pry into my affairs? You should be bowing to your queen!
+ Baloo: *bows graciously but maintains eye contact* Begging your pardon, Your Majesty. My name is Baloo, and I've heard tales of your... unique temperament.
+ The Queen of Hearts:
+ ```
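+
+ A rough generation sketch for a completion like the one above (`model` and `tokenizer` come from the Usage sketch, `prompt` is the example text ending with "The Queen of Hearts:", and the sampling settings are arbitrary):
+ ```python
+ # Sketch: generate the bot-character's next turn and trim at the next user-character line.
+ import torch
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ with torch.no_grad():
+     output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
+ completion = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
+ completion = completion.split("\nBaloo:")[0].strip()  # keep only the Queen of Hearts' turn
+ print(completion)
+ ```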
+
+ ## Thanks
+ [openaccess-ai-collective](https://huggingface.co/openaccess-ai-collective)
+
  ---
  license: mit
  ---
adapter_config.json ADDED
@@ -0,0 +1,17 @@
1
+ {
2
+ "base_model_name_or_path": "Neko-Institute-of-Science/LLaMA-13B-HF",
3
+ "bias": "none",
4
+ "fan_in_fan_out": false,
5
+ "inference_mode": true,
6
+ "init_lora_weights": true,
7
+ "lora_alpha": 128,
8
+ "lora_dropout": 0.05,
9
+ "modules_to_save": null,
10
+ "peft_type": "LORA",
11
+ "r": 64,
12
+ "target_modules": [
13
+ "q_proj",
14
+ "v_proj"
15
+ ],
16
+ "task_type": "CAUSAL_LM"
17
+ }
adapter_model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5e8d2fbf4ea569ee2ae03234895a83f455ea842b718d146ffe20600f5da562fc
3
+ size 209772877
checkpoint-24/adapter_config.json ADDED
@@ -0,0 +1,17 @@
1
+ {
2
+ "base_model_name_or_path": "Neko-Institute-of-Science/LLaMA-13B-HF",
3
+ "bias": "none",
4
+ "fan_in_fan_out": false,
5
+ "inference_mode": true,
6
+ "init_lora_weights": true,
7
+ "lora_alpha": 128,
8
+ "lora_dropout": 0.05,
9
+ "modules_to_save": null,
10
+ "peft_type": "LORA",
11
+ "r": 64,
12
+ "target_modules": [
13
+ "q_proj",
14
+ "v_proj"
15
+ ],
16
+ "task_type": "CAUSAL_LM"
17
+ }
checkpoint-24/adapter_model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:21d5a3da61188a88f03a7f3c5cbd0ddb9085a634c7af5c289622a4705dda7441
3
+ size 209772877
checkpoint-24/optimizer.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e750b7ff0d040b0ec211eeedd2baee88d1e56cb588fab658f909ae9aa574d5c0
3
+ size 105251781
checkpoint-24/rng_state.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:56aef2824e44ec55d99246cc1b218d3829bad6e963903a0cb64b8787c62f870f
3
+ size 14575
checkpoint-24/scheduler.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a117911b63a944d99f0a7a52c27704af7351c05a8e3781490bd9c8a72430f9bd
3
+ size 627
checkpoint-24/trainer_state.json ADDED
@@ -0,0 +1,208 @@
1
+ {
2
+ "best_metric": 1.188790202140808,
3
+ "best_model_checkpoint": "output_dir/checkpoint-24",
4
+ "epoch": 0.7224835371589841,
5
+ "global_step": 24,
6
+ "is_hyper_param_search": false,
7
+ "is_local_process_zero": true,
8
+ "is_world_process_zero": true,
9
+ "log_history": [
10
+ {
11
+ "epoch": 0.03,
12
+ "learning_rate": 6.000000000000001e-07,
13
+ "loss": 1.2047,
14
+ "step": 1
15
+ },
16
+ {
17
+ "epoch": 0.06,
18
+ "learning_rate": 1.2000000000000002e-06,
19
+ "loss": 1.2148,
20
+ "step": 2
21
+ },
22
+ {
23
+ "epoch": 0.09,
24
+ "learning_rate": 1.8e-06,
25
+ "loss": 1.2134,
26
+ "step": 3
27
+ },
28
+ {
29
+ "epoch": 0.12,
30
+ "learning_rate": 2.4000000000000003e-06,
31
+ "loss": 1.2068,
32
+ "step": 4
33
+ },
34
+ {
35
+ "epoch": 0.12,
36
+ "eval_loss": 1.1929587125778198,
37
+ "eval_runtime": 4.3547,
38
+ "eval_samples_per_second": 2.526,
39
+ "eval_steps_per_second": 0.459,
40
+ "step": 4
41
+ },
42
+ {
43
+ "epoch": 0.15,
44
+ "learning_rate": 3e-06,
45
+ "loss": 1.2093,
46
+ "step": 5
47
+ },
48
+ {
49
+ "epoch": 0.18,
50
+ "learning_rate": 3.6e-06,
51
+ "loss": 1.2063,
52
+ "step": 6
53
+ },
54
+ {
55
+ "epoch": 0.21,
56
+ "learning_rate": 4.2000000000000004e-06,
57
+ "loss": 1.211,
58
+ "step": 7
59
+ },
60
+ {
61
+ "epoch": 0.24,
62
+ "learning_rate": 4.800000000000001e-06,
63
+ "loss": 1.2042,
64
+ "step": 8
65
+ },
66
+ {
67
+ "epoch": 0.24,
68
+ "eval_loss": 1.1931304931640625,
69
+ "eval_runtime": 4.3471,
70
+ "eval_samples_per_second": 2.53,
71
+ "eval_steps_per_second": 0.46,
72
+ "step": 8
73
+ },
74
+ {
75
+ "epoch": 0.27,
76
+ "learning_rate": 5.4e-06,
77
+ "loss": 1.2042,
78
+ "step": 9
79
+ },
80
+ {
81
+ "epoch": 0.3,
82
+ "learning_rate": 6e-06,
83
+ "loss": 1.1951,
84
+ "step": 10
85
+ },
86
+ {
87
+ "epoch": 0.33,
88
+ "learning_rate": 6.6e-06,
89
+ "loss": 1.2194,
90
+ "step": 11
91
+ },
92
+ {
93
+ "epoch": 0.36,
94
+ "learning_rate": 7.2e-06,
95
+ "loss": 1.1958,
96
+ "step": 12
97
+ },
98
+ {
99
+ "epoch": 0.36,
100
+ "eval_loss": 1.1925488710403442,
101
+ "eval_runtime": 4.3544,
102
+ "eval_samples_per_second": 2.526,
103
+ "eval_steps_per_second": 0.459,
104
+ "step": 12
105
+ },
106
+ {
107
+ "epoch": 0.39,
108
+ "learning_rate": 7.8e-06,
109
+ "loss": 1.2059,
110
+ "step": 13
111
+ },
112
+ {
113
+ "epoch": 0.42,
114
+ "learning_rate": 8.400000000000001e-06,
115
+ "loss": 1.1939,
116
+ "step": 14
117
+ },
118
+ {
119
+ "epoch": 0.45,
120
+ "learning_rate": 9e-06,
121
+ "loss": 1.2042,
122
+ "step": 15
123
+ },
124
+ {
125
+ "epoch": 0.48,
126
+ "learning_rate": 9.600000000000001e-06,
127
+ "loss": 1.1974,
128
+ "step": 16
129
+ },
130
+ {
131
+ "epoch": 0.48,
132
+ "eval_loss": 1.1915441751480103,
133
+ "eval_runtime": 4.3592,
134
+ "eval_samples_per_second": 2.523,
135
+ "eval_steps_per_second": 0.459,
136
+ "step": 16
137
+ },
138
+ {
139
+ "epoch": 0.51,
140
+ "learning_rate": 1.02e-05,
141
+ "loss": 1.1917,
142
+ "step": 17
143
+ },
144
+ {
145
+ "epoch": 0.54,
146
+ "learning_rate": 1.08e-05,
147
+ "loss": 1.2156,
148
+ "step": 18
149
+ },
150
+ {
151
+ "epoch": 0.57,
152
+ "learning_rate": 1.1400000000000001e-05,
153
+ "loss": 1.2204,
154
+ "step": 19
155
+ },
156
+ {
157
+ "epoch": 0.6,
158
+ "learning_rate": 1.2e-05,
159
+ "loss": 1.1997,
160
+ "step": 20
161
+ },
162
+ {
163
+ "epoch": 0.6,
164
+ "eval_loss": 1.190488576889038,
165
+ "eval_runtime": 4.3516,
166
+ "eval_samples_per_second": 2.528,
167
+ "eval_steps_per_second": 0.46,
168
+ "step": 20
169
+ },
170
+ {
171
+ "epoch": 0.63,
172
+ "learning_rate": 1.26e-05,
173
+ "loss": 1.2041,
174
+ "step": 21
175
+ },
176
+ {
177
+ "epoch": 0.66,
178
+ "learning_rate": 1.32e-05,
179
+ "loss": 1.1954,
180
+ "step": 22
181
+ },
182
+ {
183
+ "epoch": 0.69,
184
+ "learning_rate": 1.3800000000000002e-05,
185
+ "loss": 1.1951,
186
+ "step": 23
187
+ },
188
+ {
189
+ "epoch": 0.72,
190
+ "learning_rate": 1.44e-05,
191
+ "loss": 1.2017,
192
+ "step": 24
193
+ },
194
+ {
195
+ "epoch": 0.72,
196
+ "eval_loss": 1.188790202140808,
197
+ "eval_runtime": 4.3616,
198
+ "eval_samples_per_second": 2.522,
199
+ "eval_steps_per_second": 0.459,
200
+ "step": 24
201
+ }
202
+ ],
203
+ "max_steps": 33,
204
+ "num_train_epochs": 1,
205
+ "total_flos": 4.871267940512563e+17,
206
+ "trial_name": null,
207
+ "trial_params": null
208
+ }
checkpoint-24/training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a1a9f601b9cbe6df3edd2801886775e4feabba748432d1673b6b79c84c544a83
3
+ size 3963
checkpoint-28/adapter_config.json ADDED
@@ -0,0 +1,17 @@
1
+ {
2
+ "base_model_name_or_path": "Neko-Institute-of-Science/LLaMA-13B-HF",
3
+ "bias": "none",
4
+ "fan_in_fan_out": false,
5
+ "inference_mode": true,
6
+ "init_lora_weights": true,
7
+ "lora_alpha": 128,
8
+ "lora_dropout": 0.05,
9
+ "modules_to_save": null,
10
+ "peft_type": "LORA",
11
+ "r": 64,
12
+ "target_modules": [
13
+ "q_proj",
14
+ "v_proj"
15
+ ],
16
+ "task_type": "CAUSAL_LM"
17
+ }
checkpoint-28/adapter_model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5e8d2fbf4ea569ee2ae03234895a83f455ea842b718d146ffe20600f5da562fc
3
+ size 209772877
checkpoint-28/optimizer.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:15d6d9156b9c48cbd6638e877e40f2162d325cce254952ab2d172b0473559e31
3
+ size 105251781
checkpoint-28/rng_state.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:df4fd5b980cade88431c5fd45e7586b73785bbf90f916fda9aa3838ac320199b
3
+ size 14575
checkpoint-28/scheduler.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:02ee3834d7e7dfd7d399c6f92226ac78e6cd6ee79638559c274dd0f5400d09ff
3
+ size 627
checkpoint-28/trainer_state.json ADDED
@@ -0,0 +1,240 @@
1
+ {
2
+ "best_metric": 1.1863083839416504,
3
+ "best_model_checkpoint": "output_dir/checkpoint-28",
4
+ "epoch": 0.8428974600188147,
5
+ "global_step": 28,
6
+ "is_hyper_param_search": false,
7
+ "is_local_process_zero": true,
8
+ "is_world_process_zero": true,
9
+ "log_history": [
10
+ {
11
+ "epoch": 0.03,
12
+ "learning_rate": 6.000000000000001e-07,
13
+ "loss": 1.2047,
14
+ "step": 1
15
+ },
16
+ {
17
+ "epoch": 0.06,
18
+ "learning_rate": 1.2000000000000002e-06,
19
+ "loss": 1.2148,
20
+ "step": 2
21
+ },
22
+ {
23
+ "epoch": 0.09,
24
+ "learning_rate": 1.8e-06,
25
+ "loss": 1.2134,
26
+ "step": 3
27
+ },
28
+ {
29
+ "epoch": 0.12,
30
+ "learning_rate": 2.4000000000000003e-06,
31
+ "loss": 1.2068,
32
+ "step": 4
33
+ },
34
+ {
35
+ "epoch": 0.12,
36
+ "eval_loss": 1.1929587125778198,
37
+ "eval_runtime": 4.3547,
38
+ "eval_samples_per_second": 2.526,
39
+ "eval_steps_per_second": 0.459,
40
+ "step": 4
41
+ },
42
+ {
43
+ "epoch": 0.15,
44
+ "learning_rate": 3e-06,
45
+ "loss": 1.2093,
46
+ "step": 5
47
+ },
48
+ {
49
+ "epoch": 0.18,
50
+ "learning_rate": 3.6e-06,
51
+ "loss": 1.2063,
52
+ "step": 6
53
+ },
54
+ {
55
+ "epoch": 0.21,
56
+ "learning_rate": 4.2000000000000004e-06,
57
+ "loss": 1.211,
58
+ "step": 7
59
+ },
60
+ {
61
+ "epoch": 0.24,
62
+ "learning_rate": 4.800000000000001e-06,
63
+ "loss": 1.2042,
64
+ "step": 8
65
+ },
66
+ {
67
+ "epoch": 0.24,
68
+ "eval_loss": 1.1931304931640625,
69
+ "eval_runtime": 4.3471,
70
+ "eval_samples_per_second": 2.53,
71
+ "eval_steps_per_second": 0.46,
72
+ "step": 8
73
+ },
74
+ {
75
+ "epoch": 0.27,
76
+ "learning_rate": 5.4e-06,
77
+ "loss": 1.2042,
78
+ "step": 9
79
+ },
80
+ {
81
+ "epoch": 0.3,
82
+ "learning_rate": 6e-06,
83
+ "loss": 1.1951,
84
+ "step": 10
85
+ },
86
+ {
87
+ "epoch": 0.33,
88
+ "learning_rate": 6.6e-06,
89
+ "loss": 1.2194,
90
+ "step": 11
91
+ },
92
+ {
93
+ "epoch": 0.36,
94
+ "learning_rate": 7.2e-06,
95
+ "loss": 1.1958,
96
+ "step": 12
97
+ },
98
+ {
99
+ "epoch": 0.36,
100
+ "eval_loss": 1.1925488710403442,
101
+ "eval_runtime": 4.3544,
102
+ "eval_samples_per_second": 2.526,
103
+ "eval_steps_per_second": 0.459,
104
+ "step": 12
105
+ },
106
+ {
107
+ "epoch": 0.39,
108
+ "learning_rate": 7.8e-06,
109
+ "loss": 1.2059,
110
+ "step": 13
111
+ },
112
+ {
113
+ "epoch": 0.42,
114
+ "learning_rate": 8.400000000000001e-06,
115
+ "loss": 1.1939,
116
+ "step": 14
117
+ },
118
+ {
119
+ "epoch": 0.45,
120
+ "learning_rate": 9e-06,
121
+ "loss": 1.2042,
122
+ "step": 15
123
+ },
124
+ {
125
+ "epoch": 0.48,
126
+ "learning_rate": 9.600000000000001e-06,
127
+ "loss": 1.1974,
128
+ "step": 16
129
+ },
130
+ {
131
+ "epoch": 0.48,
132
+ "eval_loss": 1.1915441751480103,
133
+ "eval_runtime": 4.3592,
134
+ "eval_samples_per_second": 2.523,
135
+ "eval_steps_per_second": 0.459,
136
+ "step": 16
137
+ },
138
+ {
139
+ "epoch": 0.51,
140
+ "learning_rate": 1.02e-05,
141
+ "loss": 1.1917,
142
+ "step": 17
143
+ },
144
+ {
145
+ "epoch": 0.54,
146
+ "learning_rate": 1.08e-05,
147
+ "loss": 1.2156,
148
+ "step": 18
149
+ },
150
+ {
151
+ "epoch": 0.57,
152
+ "learning_rate": 1.1400000000000001e-05,
153
+ "loss": 1.2204,
154
+ "step": 19
155
+ },
156
+ {
157
+ "epoch": 0.6,
158
+ "learning_rate": 1.2e-05,
159
+ "loss": 1.1997,
160
+ "step": 20
161
+ },
162
+ {
163
+ "epoch": 0.6,
164
+ "eval_loss": 1.190488576889038,
165
+ "eval_runtime": 4.3516,
166
+ "eval_samples_per_second": 2.528,
167
+ "eval_steps_per_second": 0.46,
168
+ "step": 20
169
+ },
170
+ {
171
+ "epoch": 0.63,
172
+ "learning_rate": 1.26e-05,
173
+ "loss": 1.2041,
174
+ "step": 21
175
+ },
176
+ {
177
+ "epoch": 0.66,
178
+ "learning_rate": 1.32e-05,
179
+ "loss": 1.1954,
180
+ "step": 22
181
+ },
182
+ {
183
+ "epoch": 0.69,
184
+ "learning_rate": 1.3800000000000002e-05,
185
+ "loss": 1.1951,
186
+ "step": 23
187
+ },
188
+ {
189
+ "epoch": 0.72,
190
+ "learning_rate": 1.44e-05,
191
+ "loss": 1.2017,
192
+ "step": 24
193
+ },
194
+ {
195
+ "epoch": 0.72,
196
+ "eval_loss": 1.188790202140808,
197
+ "eval_runtime": 4.3616,
198
+ "eval_samples_per_second": 2.522,
199
+ "eval_steps_per_second": 0.459,
200
+ "step": 24
201
+ },
202
+ {
203
+ "epoch": 0.75,
204
+ "learning_rate": 1.5e-05,
205
+ "loss": 1.1908,
206
+ "step": 25
207
+ },
208
+ {
209
+ "epoch": 0.78,
210
+ "learning_rate": 1.56e-05,
211
+ "loss": 1.2032,
212
+ "step": 26
213
+ },
214
+ {
215
+ "epoch": 0.81,
216
+ "learning_rate": 1.62e-05,
217
+ "loss": 1.1876,
218
+ "step": 27
219
+ },
220
+ {
221
+ "epoch": 0.84,
222
+ "learning_rate": 1.6800000000000002e-05,
223
+ "loss": 1.1984,
224
+ "step": 28
225
+ },
226
+ {
227
+ "epoch": 0.84,
228
+ "eval_loss": 1.1863083839416504,
229
+ "eval_runtime": 4.3585,
230
+ "eval_samples_per_second": 2.524,
231
+ "eval_steps_per_second": 0.459,
232
+ "step": 28
233
+ }
234
+ ],
235
+ "max_steps": 33,
236
+ "num_train_epochs": 1,
237
+ "total_flos": 5.6831459305979904e+17,
238
+ "trial_name": null,
239
+ "trial_params": null
240
+ }
checkpoint-28/training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a1a9f601b9cbe6df3edd2801886775e4feabba748432d1673b6b79c84c544a83
3
+ size 3963
checkpoint-32/adapter_config.json ADDED
@@ -0,0 +1,17 @@
1
+ {
2
+ "base_model_name_or_path": "Neko-Institute-of-Science/LLaMA-13B-HF",
3
+ "bias": "none",
4
+ "fan_in_fan_out": false,
5
+ "inference_mode": true,
6
+ "init_lora_weights": true,
7
+ "lora_alpha": 128,
8
+ "lora_dropout": 0.05,
9
+ "modules_to_save": null,
10
+ "peft_type": "LORA",
11
+ "r": 64,
12
+ "target_modules": [
13
+ "q_proj",
14
+ "v_proj"
15
+ ],
16
+ "task_type": "CAUSAL_LM"
17
+ }
checkpoint-32/adapter_model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ccee66b163e05aa00135a0438b43b449a13d83365803da247811af514a34eaee
3
+ size 209772877
checkpoint-32/optimizer.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:62115a02e727a9ceb6366da0ee480ec2e067830213f1b658de47f54eab16cdc4
3
+ size 105251781
checkpoint-32/rng_state.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d49cbc3e8793c68529a4c3f5e53bfe261a8a6cd135f170aa5936a36af28b2f6e
3
+ size 14575
checkpoint-32/scheduler.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6a83cccaf2d0ded4c5c9f52e1ad41bad842f388e6da3dfdcc6dacc7a971a91f2
3
+ size 627
checkpoint-32/trainer_state.json ADDED
@@ -0,0 +1,272 @@
1
+ {
2
+ "best_metric": 1.1863083839416504,
3
+ "best_model_checkpoint": "output_dir/checkpoint-28",
4
+ "epoch": 0.9633113828786454,
5
+ "global_step": 32,
6
+ "is_hyper_param_search": false,
7
+ "is_local_process_zero": true,
8
+ "is_world_process_zero": true,
9
+ "log_history": [
10
+ {
11
+ "epoch": 0.03,
12
+ "learning_rate": 6.000000000000001e-07,
13
+ "loss": 1.2047,
14
+ "step": 1
15
+ },
16
+ {
17
+ "epoch": 0.06,
18
+ "learning_rate": 1.2000000000000002e-06,
19
+ "loss": 1.2148,
20
+ "step": 2
21
+ },
22
+ {
23
+ "epoch": 0.09,
24
+ "learning_rate": 1.8e-06,
25
+ "loss": 1.2134,
26
+ "step": 3
27
+ },
28
+ {
29
+ "epoch": 0.12,
30
+ "learning_rate": 2.4000000000000003e-06,
31
+ "loss": 1.2068,
32
+ "step": 4
33
+ },
34
+ {
35
+ "epoch": 0.12,
36
+ "eval_loss": 1.1929587125778198,
37
+ "eval_runtime": 4.3547,
38
+ "eval_samples_per_second": 2.526,
39
+ "eval_steps_per_second": 0.459,
40
+ "step": 4
41
+ },
42
+ {
43
+ "epoch": 0.15,
44
+ "learning_rate": 3e-06,
45
+ "loss": 1.2093,
46
+ "step": 5
47
+ },
48
+ {
49
+ "epoch": 0.18,
50
+ "learning_rate": 3.6e-06,
51
+ "loss": 1.2063,
52
+ "step": 6
53
+ },
54
+ {
55
+ "epoch": 0.21,
56
+ "learning_rate": 4.2000000000000004e-06,
57
+ "loss": 1.211,
58
+ "step": 7
59
+ },
60
+ {
61
+ "epoch": 0.24,
62
+ "learning_rate": 4.800000000000001e-06,
63
+ "loss": 1.2042,
64
+ "step": 8
65
+ },
66
+ {
67
+ "epoch": 0.24,
68
+ "eval_loss": 1.1931304931640625,
69
+ "eval_runtime": 4.3471,
70
+ "eval_samples_per_second": 2.53,
71
+ "eval_steps_per_second": 0.46,
72
+ "step": 8
73
+ },
74
+ {
75
+ "epoch": 0.27,
76
+ "learning_rate": 5.4e-06,
77
+ "loss": 1.2042,
78
+ "step": 9
79
+ },
80
+ {
81
+ "epoch": 0.3,
82
+ "learning_rate": 6e-06,
83
+ "loss": 1.1951,
84
+ "step": 10
85
+ },
86
+ {
87
+ "epoch": 0.33,
88
+ "learning_rate": 6.6e-06,
89
+ "loss": 1.2194,
90
+ "step": 11
91
+ },
92
+ {
93
+ "epoch": 0.36,
94
+ "learning_rate": 7.2e-06,
95
+ "loss": 1.1958,
96
+ "step": 12
97
+ },
98
+ {
99
+ "epoch": 0.36,
100
+ "eval_loss": 1.1925488710403442,
101
+ "eval_runtime": 4.3544,
102
+ "eval_samples_per_second": 2.526,
103
+ "eval_steps_per_second": 0.459,
104
+ "step": 12
105
+ },
106
+ {
107
+ "epoch": 0.39,
108
+ "learning_rate": 7.8e-06,
109
+ "loss": 1.2059,
110
+ "step": 13
111
+ },
112
+ {
113
+ "epoch": 0.42,
114
+ "learning_rate": 8.400000000000001e-06,
115
+ "loss": 1.1939,
116
+ "step": 14
117
+ },
118
+ {
119
+ "epoch": 0.45,
120
+ "learning_rate": 9e-06,
121
+ "loss": 1.2042,
122
+ "step": 15
123
+ },
124
+ {
125
+ "epoch": 0.48,
126
+ "learning_rate": 9.600000000000001e-06,
127
+ "loss": 1.1974,
128
+ "step": 16
129
+ },
130
+ {
131
+ "epoch": 0.48,
132
+ "eval_loss": 1.1915441751480103,
133
+ "eval_runtime": 4.3592,
134
+ "eval_samples_per_second": 2.523,
135
+ "eval_steps_per_second": 0.459,
136
+ "step": 16
137
+ },
138
+ {
139
+ "epoch": 0.51,
140
+ "learning_rate": 1.02e-05,
141
+ "loss": 1.1917,
142
+ "step": 17
143
+ },
144
+ {
145
+ "epoch": 0.54,
146
+ "learning_rate": 1.08e-05,
147
+ "loss": 1.2156,
148
+ "step": 18
149
+ },
150
+ {
151
+ "epoch": 0.57,
152
+ "learning_rate": 1.1400000000000001e-05,
153
+ "loss": 1.2204,
154
+ "step": 19
155
+ },
156
+ {
157
+ "epoch": 0.6,
158
+ "learning_rate": 1.2e-05,
159
+ "loss": 1.1997,
160
+ "step": 20
161
+ },
162
+ {
163
+ "epoch": 0.6,
164
+ "eval_loss": 1.190488576889038,
165
+ "eval_runtime": 4.3516,
166
+ "eval_samples_per_second": 2.528,
167
+ "eval_steps_per_second": 0.46,
168
+ "step": 20
169
+ },
170
+ {
171
+ "epoch": 0.63,
172
+ "learning_rate": 1.26e-05,
173
+ "loss": 1.2041,
174
+ "step": 21
175
+ },
176
+ {
177
+ "epoch": 0.66,
178
+ "learning_rate": 1.32e-05,
179
+ "loss": 1.1954,
180
+ "step": 22
181
+ },
182
+ {
183
+ "epoch": 0.69,
184
+ "learning_rate": 1.3800000000000002e-05,
185
+ "loss": 1.1951,
186
+ "step": 23
187
+ },
188
+ {
189
+ "epoch": 0.72,
190
+ "learning_rate": 1.44e-05,
191
+ "loss": 1.2017,
192
+ "step": 24
193
+ },
194
+ {
195
+ "epoch": 0.72,
196
+ "eval_loss": 1.188790202140808,
197
+ "eval_runtime": 4.3616,
198
+ "eval_samples_per_second": 2.522,
199
+ "eval_steps_per_second": 0.459,
200
+ "step": 24
201
+ },
202
+ {
203
+ "epoch": 0.75,
204
+ "learning_rate": 1.5e-05,
205
+ "loss": 1.1908,
206
+ "step": 25
207
+ },
208
+ {
209
+ "epoch": 0.78,
210
+ "learning_rate": 1.56e-05,
211
+ "loss": 1.2032,
212
+ "step": 26
213
+ },
214
+ {
215
+ "epoch": 0.81,
216
+ "learning_rate": 1.62e-05,
217
+ "loss": 1.1876,
218
+ "step": 27
219
+ },
220
+ {
221
+ "epoch": 0.84,
222
+ "learning_rate": 1.6800000000000002e-05,
223
+ "loss": 1.1984,
224
+ "step": 28
225
+ },
226
+ {
227
+ "epoch": 0.84,
228
+ "eval_loss": 1.1863083839416504,
229
+ "eval_runtime": 4.3585,
230
+ "eval_samples_per_second": 2.524,
231
+ "eval_steps_per_second": 0.459,
232
+ "step": 28
233
+ },
234
+ {
235
+ "epoch": 0.87,
236
+ "learning_rate": 1.74e-05,
237
+ "loss": 1.202,
238
+ "step": 29
239
+ },
240
+ {
241
+ "epoch": 0.9,
242
+ "learning_rate": 1.8e-05,
243
+ "loss": 1.1991,
244
+ "step": 30
245
+ },
246
+ {
247
+ "epoch": 0.93,
248
+ "learning_rate": 1.86e-05,
249
+ "loss": 1.2004,
250
+ "step": 31
251
+ },
252
+ {
253
+ "epoch": 0.96,
254
+ "learning_rate": 1.9200000000000003e-05,
255
+ "loss": 1.1907,
256
+ "step": 32
257
+ },
258
+ {
259
+ "epoch": 0.96,
260
+ "eval_loss": 1.1872574090957642,
261
+ "eval_runtime": 4.3637,
262
+ "eval_samples_per_second": 2.521,
263
+ "eval_steps_per_second": 0.458,
264
+ "step": 32
265
+ }
266
+ ],
267
+ "max_steps": 33,
268
+ "num_train_epochs": 1,
269
+ "total_flos": 6.495023920683418e+17,
270
+ "trial_name": null,
271
+ "trial_params": null
272
+ }
checkpoint-32/training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a1a9f601b9cbe6df3edd2801886775e4feabba748432d1673b6b79c84c544a83
3
+ size 3963
documentation/hyperparameters.yml ADDED
@@ -0,0 +1,69 @@
1
+ base_model: Neko-Institute-of-Science/LLaMA-13B-HF
2
+ base_model_config: Neko-Institute-of-Science/LLaMA-13B-HF
3
+ model_type: LlamaForCausalLM
4
+ tokenizer_type: LlamaTokenizer
5
+ load_in_8bit: true
6
+ load_4bit:
7
+ datasets:
8
+ - path: practicaldreamer/RPGPT_PublicDomain-ShareGPT
9
+ data_files: RPGPT_PublicDomain_v3-sharegpt.json
10
+ type: sharegpt
11
+ dataset_prepared_path: data/last_run_prepared
12
+ val_set_size: 0.0025
13
+ adapter: lora
14
+ lora_model_dir:
15
+ sequence_len: 2048
16
+ max_packed_sequence_len:
17
+ lora_r: 64
18
+ lora_alpha: 128
19
+ lora_dropout: 0.05
20
+ lora_target_modules:
21
+ - q_proj
22
+ - v_proj
23
+ # - k_proj
24
+ # - o_proj
25
+ lora_fan_in_fan_out: false
26
+ wandb_project:
27
+ wandb_watch:
28
+ wandb_run_id:
29
+ wandb_log_model: checkpoint
30
+ output_dir: output_dir
31
+ batch_size: 128
32
+ micro_batch_size: 4
33
+ eval_batch_size: 1
34
+ num_epochs: 1
35
+ warmup_steps: 50
36
+ logging_steps:
37
+ learning_rate: 0.00003
38
+ optimizer: adamw_bnb_8bit
39
+ torchdistx_path:
40
+ lr_scheduler: cosine
41
+ train_on_inputs: false
42
+ group_by_length: false
43
+ bf16: true
44
+ tf32: true
45
+ gradient_checkpointing: true
46
+ early_stopping_patience: 3
47
+ resume_from_checkpoint:
48
+ auto_resume_from_checkpoints:
49
+ local_rank:
50
+ xformers_attention: true
51
+ flash_attention:
52
+ gptq_groupsize:
53
+ gptq_model_v1:
54
+ save_steps: 4
55
+ debug:
56
+ deepspeed:
57
+ weight_decay: 0.0
58
+ fsdp:
59
+ fsdp_config:
60
+ fsdp_transformer_layer_cls_to_wrap:
61
+ fsdp_min_num_params: 2000
62
+ fsdp_backward_prefetch:
63
+ - backward_pre
64
+ limit_all_gathers: false
65
+ special_tokens:
66
+ pad_token: "[PAD]"
67
+ bos_token: "<s>"
68
+ eos_token: "</s>"
69
+ unk_token: "<unk>"
documentation/preprocessed_sample.txt ADDED
The diff for this file is too large to render. See raw diff
 
documentation/requirements.txt ADDED
@@ -0,0 +1,92 @@
1
+ accelerate @ git+https://github.com/huggingface/accelerate.git@24ae624d96866e3f993a13fc341ea0dcb68b1470
2
+ aiohttp==3.8.4
3
+ aiosignal==1.3.1
4
+ alpaca-lora-4bit @ git+https://github.com/winglian/alpaca_lora_4bit.git@1b4a376ea816eb2417404b4d1ac27fa16471588a
5
+ appdirs==1.4.4
6
+ async-timeout==4.0.2
7
+ attrdict==2.0.1
8
+ attrs==23.1.0
9
+ -e git+https://github.com/winglian/axolotl@a10a8265efde4ec61037560e3b8e2e31dab984af#egg=axolotl
10
+ bitsandbytes==0.37.2
11
+ black==23.3.0
12
+ certifi==2022.12.7
13
+ charset-normalizer==3.1.0
14
+ click==8.1.3
15
+ cmake==3.26.3
16
+ colorama==0.4.6
17
+ datasets==2.12.0
18
+ deepspeed==0.9.4
19
+ dill==0.3.6
20
+ docker-pycreds==0.4.0
21
+ einops==0.6.1
22
+ filelock==3.12.0
23
+ fire==0.5.0
24
+ flash-attn==1.0.4
25
+ frozenlist==1.3.3
26
+ fsspec==2023.4.0
27
+ gitdb==4.0.10
28
+ GitPython==3.1.31
29
+ hjson==3.1.0
30
+ huggingface-hub==0.14.1
31
+ idna==3.4
32
+ Jinja2==3.1.2
33
+ lit==16.0.2
34
+ MarkupSafe==2.1.2
35
+ mpmath==1.3.0
36
+ multidict==6.0.4
37
+ multiprocess==0.70.14
38
+ mypy-extensions==1.0.0
39
+ networkx==3.1
40
+ ninja==1.11.1
41
+ numpy==1.24.3
42
+ nvidia-cublas-cu11==11.10.3.66
43
+ nvidia-cuda-cupti-cu11==11.7.101
44
+ nvidia-cuda-nvrtc-cu11==11.7.99
45
+ nvidia-cuda-runtime-cu11==11.7.99
46
+ nvidia-cudnn-cu11==8.5.0.96
47
+ nvidia-cufft-cu11==10.9.0.58
48
+ nvidia-curand-cu11==10.2.10.91
49
+ nvidia-cusolver-cu11==11.4.0.1
50
+ nvidia-cusparse-cu11==11.7.4.91
51
+ nvidia-nccl-cu11==2.14.3
52
+ nvidia-nvtx-cu11==11.7.91
53
+ packaging==23.1
54
+ pandas==2.0.1
55
+ pathspec==0.11.1
56
+ pathtools==0.1.2
57
+ peft @ git+https://github.com/huggingface/peft.git@70af02a2bca5a63921790036b2c9430edf4037e2
58
+ platformdirs==3.5.0
59
+ protobuf==4.22.4
60
+ psutil==5.9.5
61
+ py-cpuinfo==9.0.0
62
+ pyarrow==12.0.0
63
+ pydantic==1.10.7
64
+ pyre-extensions==0.0.29
65
+ python-dateutil==2.8.2
66
+ pytz==2023.3
67
+ PyYAML==6.0
68
+ regex==2023.5.5
69
+ requests==2.30.0
70
+ responses==0.18.0
71
+ safetensors==0.3.1
72
+ sentencepiece==0.1.99
73
+ sentry-sdk==1.21.1
74
+ setproctitle==1.3.2
75
+ six==1.16.0
76
+ smmap==5.0.0
77
+ sympy==1.11.1
78
+ termcolor==2.3.0
79
+ tokenizers==0.13.3
80
+ tomli==2.0.1
81
+ torch==2.0.0
82
+ tqdm==4.65.0
83
+ transformers @ git+https://github.com/huggingface/transformers.git@799df10aef3abfe6158c83daf0a9eacf8f6f0a1f
84
+ triton==2.0.0
85
+ typing-inspect==0.8.0
86
+ typing_extensions==4.5.0
87
+ tzdata==2023.3
88
+ urllib3==2.0.2
89
+ wandb==0.15.4
90
+ xformers==0.0.19
91
+ xxhash==3.2.0
92
+ yarl==1.9.2
documentation/wandb.info ADDED
@@ -0,0 +1 @@
1
+ https://wandb.ai/practicaldreamer/rpgpt/runs/b3sznjpz