XeTute committed 9ca0e49 (verified · parent: 1ab99e4): Update README.md
Files changed (1): README.md (+68, −3). The previous README contained only the YAML front matter declaring `license: mit`.

---
library_name: transformers
license: mit
base_model:
- XeTute/Phantasor_V0.1-137M
tags:
- llama-factory
- full
- generated_from_trainer
- story
- tiny
- chinese
- english
datasets:
- Chamoda/atlas-storyteller-1000
- jaydenccc/AI_Storyteller_Dataset
- zxbsmk/webnovel_cn
- XeTute/Pakistan-China-Alpaca
language:
- zh
- en
pipeline_tag: text-generation
---

> [!TIP]
> This model is still in its testing phase. We don't recommend it for high-end production environments; it is intended only for story generation.
> The model was trained using LLaMA-Factory by Asadullah Hamzah at XeTute Technologies.

# Phantasor V0.2
We introduce Phantasor V0.2, the continuation of [Phantasor V0.1](https://huggingface.co/XeTute/Phantasor_V0.1-137M). It was trained on top of V0.1 using a new dataset (more details below) alongside the original datasets.
It is licensed under MIT, so feel free to use it in your personal projects, both commercially and privately. Since this is V0.2, we're open to feedback to improve our project(s).
*The chat template used is Alpaca. For correct usage, insert your prompt as the **system** prompt. The model can also be used without any template to continue a sequence of text.*
[You can find the GGUF version here.](https://huggingface.co/XeTute/Phantasor_V0.2-137M-GGUF)
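
As a quick start, here is a minimal `transformers` sketch. The exact Alpaca preamble below is our assumption (the standard instruction/response layout); check the tokenizer configuration shipped in this repository if the output looks off.

```python
# Minimal generation sketch; the Alpaca prompt wording and sampling settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XeTute/Phantasor_V0.2-137M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Alpaca-style prompt: the story instruction goes where the system/instruction text sits.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a short story about a lighthouse keeper.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)

# Print only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```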

## Example =)
```txt
Coming later
```

```txt
Coming later
```

## Training
This model was trained on all samples and tokens included in:
- [Chamoda/atlas-storyteller-1000](https://huggingface.co/datasets/Chamoda/atlas-storyteller-1000)
- [jaydenccc/AI_Storyteller_Dataset](https://huggingface.co/datasets/jaydenccc/AI_Storyteller_Dataset)
- [zxbsmk/webnovel_cn](https://huggingface.co/datasets/zxbsmk/webnovel_cn)

for exactly 4.0 epochs, with all model parameters trainable. Below is the loss curve, updated at each training step over all four epochs.
![training_loss.png](https://huggingface.co/XeTute/Phantasor_V0.2-137M/resolve/main/training_loss.png)
Instead of AdamW, which is commonly used for large GPTs, we used **SGD**; this helped the model generalize better, which is noticeable when prompting it outside the training distribution.
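
The optimizer swap can be illustrated outside LLaMA-Factory with a plain `transformers` Trainer; everything below (the tiny inline dataset, learning rate, batch size) is a placeholder assumption, not the original training configuration.

```python
# Hedged sketch: full fine-tuning with SGD instead of the default AdamW.
# Hyperparameters and the inline dataset are illustrative assumptions only.
import torch
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "XeTute/Phantasor_V0.1-137M"  # V0.2 was trained on top of V0.1
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # GPT-style tokenizers often lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

texts = ["Once upon a time, a small ship left the harbor..."]  # placeholder samples
dataset = Dataset.from_dict({"text": texts}).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
    remove_columns=["text"],
)

args = TrainingArguments(output_dir="phantasor-sgd", num_train_epochs=4,
                         per_device_train_batch_size=1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # actual LR is not documented here

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    optimizers=(optimizer, None),  # (optimizer, lr_scheduler); None -> default scheduler
)
trainer.train()
```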

## Finished Model
- ~137M parameters, all of which are trainable
- 1,024 input tokens of context length, all of which were used during training
- A final loss of ~1.2 across all samples (see Files => train_results.json)

This is solid performance for a V0.2 release.

# Our Platforms
## Socials
[BlueSky](https://bsky.app/profile/xetute.bsky.social) | [YouTube](https://www.youtube.com/@XeTuteTechnologies) | [HuggingFace 🤗](https://huggingface.co/XeTute) | [Ko-Fi / Financially Support Us](https://ko-fi.com/XeTute)

## Websites
[Our Webpage](https://xetute.com) | [PhantasiaAI](https://xetute.com/PhantasiaAI)

Have a great day!