Update README.md
Some current "architectural features" are in use, but their effects need further experimentation:
* `split_classifier_heads`: it's still a mystery whether giving each RVQ level its own output head is truly helpful.
* `audio_embeddings_sum`: it's also a mystery whether each later RVQ level should "see" the past levels through summed embeddings, or whether skipping this is preferable.
* Disabling `unified_position_ids` seems to help quality more often than not, but I'm still unsure whether it's beneficial in practice.
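For reference, these toggles would sit in the model's config YAML. The nesting and defaults below are an assumption for illustration, not the repo's actual schema; only the key names come from the text above.

```yaml
# Hypothetical sketch of where these toggles could live in a model config;
# the actual nesting in this repo's YAML may differ.
model:
  split_classifier_heads: true   # separate output head per RVQ level
  audio_embeddings_sum: true     # later RVQ levels see summed prior-level embeddings
  unified_position_ids: false    # disabling seems to help quality more often than not
```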

## LoRAs

This repo also contains some LoRAs to serve as a reference under `./loras/`.

Using a LoRA is the same as using a base model, except you're required to already have the base model (obviously). Just load from the LoRA's config YAML instead to use it.

The only caveat is that my original dataset *does* already contain these samples, but given its sheer size, they're probably underutilized.
|
93 |
+
* However, the base model already has *almost adequate* output from these speakers, but not enough to be satisfactory.
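Presumably, a `config.lora.*.yaml` just points back at the base model plus the adapter weights, which is why the base model is still required. The keys below are an illustrative assumption rather than the repo's actual schema; the `r128`/`a128` in the checkpoint names are read here as LoRA rank and alpha.

```yaml
# Illustrative only: the actual key names in config.lora.*.yaml may differ.
# The point is that the LoRA config references the base model rather than replacing it.
model: ./models/base            # base model is still required
lora:
  path: ./loras/lora-glados-r128-a128
  rank: 128                     # the "r128" in the name, assumed
  alpha: 128                    # the "a128" in the name, assumed
```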
|
94 |
+
|
95 |
+
* `config.lora.glados.yaml` / `lora-glados-r128-a128`:
|
96 |
+
+ A simple LoRA of GLaDOS from both Portal and Portal 2.
|
97 |
+
+ Trained for 250 steps (48000 samples, 821 samples per epoch).
|
98 |
+
* `config.lora.sam.yaml` / `lora-sam-r128-a128`:
|
99 |
+
+ A simple LoRA of Sam from the non-remaster Sam and Max Telltale games.
|
100 |
+
+ Trained for 250 steps (48000 samples, 1555 samples per epoch).
|
101 |
+
* `config.lora.max.yaml` / `lora-max-r128-a128`:
|
102 |
+
+ A simple LoRA of Max from the non-remaster Sam and Max Telltale games.
|
103 |
+
+ Trained for 250 steps (48000 samples, 1292 samples per epoch).
|