AuriAetherwiing committed
Commit 2caa7a2
1 Parent(s): 1701067
Update README.md

README.md CHANGED
@@ -15,10 +15,12 @@ I was trying to diversify Gemma's prose without completely destroying its smarts
 Should be usable both for RP and raw completion storywriting.
 I originally planned to use this in a merge, but I feel this model is interesting enough to be released on its own as well.
 
+Model was trained by Auri.
 **Training notes.**
 
 This model was trained for 2 epochs on 10k rows (~18.7M tokens), taken equally from the Erebus-87k and r_shortstories_24k datasets. It was trained on an 8xH100 SXM node for 30 minutes with rsLoRA.
 I got complete nonsense reported to my wandb during this run, and logging stopped altogether after step 13 for some reason. This seems to be directly related to Gemma, as my training setup worked flawlessly for Qwen.
+Thanks to Kearm for helping set up LF on that node, and to Featherless for providing it for EVA-Qwen2.5 (and, unknowingly, this model) training.
 
 **Format**
 
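For readers unfamiliar with rsLoRA (rank-stabilized LoRA, the adapter method mentioned in the training notes), a minimal sketch of enabling it via Hugging Face's peft library is shown below. The base model name, rank, alpha, and target modules are illustrative assumptions, not the configuration actually used for this run.

```python
# Minimal sketch: enabling rank-stabilized LoRA (rsLoRA) with peft.
# All hyperparameters and the base model below are assumptions for
# illustration only, not the settings used to train this model.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("google/gemma-2-9b")  # placeholder base model

config = LoraConfig(
    task_type="CAUSAL_LM",
    r=64,                    # adapter rank (assumed)
    lora_alpha=32,           # scaling numerator (assumed)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    use_rslora=True,         # scale adapters by alpha / sqrt(r) instead of alpha / r
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # sanity check: only adapter weights are trainable
```

The `use_rslora=True` flag is the only change from a plain LoRA setup; it keeps the adapter's effective scale stable as the rank grows, which matters at higher ranks.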