---
language:
- en
tags:
- llama-2
- instruct
- instruction
- writing
- story
pipeline_tag: text-generation
license: other
---

# Waxwing-Storytelling-70B-LoRA model card

Waxwing is a storytelling LoRA for Llama 2 70B.
- Guide the story with Waxwing's turn-based instruction system.
- Tailor the feel of your story using style tags.
- Experience storytelling free of ChatGPT's idiosyncrasies, thanks to a "human-generated" dataset of public domain writing. Waxwing avoids GPT-isms like positivity bias, "bond" emphasis, rushed endings and exaggerated stylistic tics.

Waxwing is available:
- LoRA: As a LoRA on the [main branch](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/main), which can be applied at runtime to any variant of the Llama 2 70B base model (see the loading sketch below).
- 16fp model: Merged into the base Llama 2 model at full precision, in the [16fp](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/16fp) branch.
- Quantized for use with Exllama 2:
  - [2.5bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/2.5bpw)
  - [3.0bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/3.0bpw)
  - [4.65bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/4.65bpw)
  - [6.0bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/6.0bpw)
  - [8.0bpw](https://huggingface.co/alac/Waxwing-Storytelling-70B-LoRA/tree/8.0bpw)

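To apply the LoRA at runtime, a minimal sketch with Hugging Face Transformers and PEFT might look like the following. The base checkpoint, dtype, and `device_map` shown here are assumptions (any Llama 2 70B variant should work); adapt them to your hardware and memory budget.

```python
# Minimal sketch: load a Llama 2 70B base model and apply the Waxwing LoRA at runtime.
# The base model id, dtype, and device_map below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-70b-hf"           # any Llama 2 70B variant should work
lora_id = "alac/Waxwing-Storytelling-70B-LoRA"  # main branch of this repository

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,  # full-precision 70B is large; quantized loading may be needed
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, lora_id)

prompt = (
    "### System:\nA chat between a user and a writing assistant.\n\n"
    "### User:\nWrite a scene where: a storm traps two rivals in a lighthouse.\n\n"
    "### Assistant:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```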
By using this model, you take full responsibility for anything done with its outputs.


## Model Details

### Model Description

- **Developed by:** alac
- **Model Type:** QLoRA
- **Finetuned from model:** Llama-2 70B
- **Language(s):** English


### Dataset

Waxwing was trained with a small dataset gathered from public domain writing. The exact dataset will remain private, but the code used to generate prompts and metadata is available on [github](https://github.com/alac/txt_to_dataset).
Upstage's [SOLAR](https://huggingface.co/upstage/SOLAR-0-70b-16bit) model was used to tag the dataset.


### Prompt Template

```
### System:
A chat between a user and a writing assistant.
{context}

### User:
{style tags}
Write a scene where: {events that should happen in the next scene}

### Assistant:
{output}
```
`context` is an optional story synopsis.
`style tags` should be a string along the lines of:
```
Tone: {list of tones}. Writing style: {list of writing styles}.
Written with {slow|medium|fast} pacing, in moment to moment detail, in {abstract|selective|vivid sensory} detail, from a {First|Third Person (Character)} perspective.
```
The exact values Waxwing was trained on are listed in the `dataset_tags.json` file. Anecdotally, it works better with a subset of the style tags (e.g. `Tone: tense`) or with tags that are complementary (`Tone: tense, mysterious. Writing style: dramatic. Written in abstract detail.`). It's unclear how well Waxwing responds to tags that it was not trained on (e.g. 'genre').

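For scripted use, a small helper that assembles the template above might look like the sketch below. The `build_prompt` function and its example arguments are purely illustrative; they are not part of this repository's tooling.

```python
# Illustrative helper (not part of this repository) that assembles the Waxwing
# prompt from a synopsis, style tags, and the events of the next scene.
def build_prompt(events: str, style_tags: str = "", context: str = "") -> str:
    system = "### System:\nA chat between a user and a writing assistant.\n"
    if context:
        system += context + "\n"
    user = "### User:\n"
    if style_tags:
        user += style_tags + "\n"
    user += f"Write a scene where: {events}\n"
    return f"{system}\n{user}\n### Assistant:\n"

# Example call; the scene and synopsis are made up for illustration.
prompt = build_prompt(
    events="the detective finally confronts her informant on the night ferry",
    style_tags="Tone: tense, mysterious. Writing style: dramatic. Written in abstract detail.",
    context="A noir mystery set in a fog-bound port city.",
)
```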

For SillyTavern users, the `style tags` work well in the "Author's Note" field at depth 1. User messages should begin with `Write a scene where: `; to continue a scene, just type `continue`. Most testing was done using the [Genesis](https://github.com/SillyTavern/SillyTavern/blob/8e73882c9ba7301c9163befbe445686a79d4a9a8/public/TextGen%20Settings/NovelAI%20(Genesis).settings) preset.


### Training

Waxwing was trained on a single machine with 72GB of VRAM. The training parameters are available in the `training_parameters.json` file of the main branch. Training was run with FartyPants' [Training_PRO](https://github.com/FartyPants/Training_PRO) extension for the Oobabooga Text Generation WebUI.

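For orientation only, a QLoRA setup along these lines can be sketched with PEFT and bitsandbytes. The rank, alpha, dropout, and target modules below are placeholder assumptions, not the values from `training_parameters.json`, and Training_PRO handles this configuration inside the WebUI.

```python
# Illustrative QLoRA setup (assumed values; see training_parameters.json for the real ones).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",   # assumed base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=32,                      # assumed rank
    lora_alpha=64,             # assumed
    lora_dropout=0.05,         # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```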