---
library_name: transformers
license: mit
base_model:
- XeTute/Phantasor_V0.1-137M
tags:
- llama-factory
- full
- generated_from_trainer
- story
- tiny
- chinese
- english
datasets:
- Chamoda/atlas-storyteller-1000
- jaydenccc/AI_Storyteller_Dataset
- zxbsmk/webnovel_cn
- XeTute/Pakistan-China-Alpaca
language:
- zh
- en
pipeline_tag: text-generation
---

> [!TIP]
> This model is still in its testing phase. We don't recommend it for high-end production environments; it is purely a story-generation model.
> It was trained using LLaMA-Factory by Asadullah Hamzah at XeTute Technologies.

# Phantasor V0.2
We introduce Phantasor V0.2, the continuation of [Phantasor V0.1](https://huggingface.co/XeTute/Phantasor_V0.1-137M). It has been trained on top of V0.1 using a new dataset (more details below) together with the old datasets.
Licensed under MIT, feel free to use it in your personal projects, both commercially and privately. Since this is V0.2, we're open to feedback to improve our project(s).
*The Chat-Template used is Alpaca. For correct usage, insert your prompt as a **system** prompt. The model can also be used without any template to continue a sequence of text.*
[You can find the GGUF version here.](https://huggingface.co/XeTute/Phantasor_V0.2-137M-GGUF)
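
For a quick start, here is a minimal generation sketch using 🤗 Transformers. Only the model ID comes from this card; the exact Alpaca template wording and the sampling settings are assumptions for illustration:

```python
# Hedged usage sketch, not an official example: the model ID is from this card,
# but the Alpaca template wording and sampling settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XeTute/Phantasor_V0.2-137M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Alpaca-style prompt: the story request goes into the instruction slot.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a short story about a lighthouse keeper.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```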

## Example =)
```txt
Coming later
```

```txt
Coming later
```

## Training
This model was trained for exactly 4.0 epochs, with all parameters unfrozen, on every sample and token in (a loading sketch follows the list):
- [Chamoda/atlas-storyteller-1000](https://huggingface.co/datasets/Chamoda/atlas-storyteller-1000)
- [jaydenccc/AI_Storyteller_Dataset](https://huggingface.co/datasets/jaydenccc/AI_Storyteller_Dataset)
- [zxbsmk/webnovel_cn](https://huggingface.co/datasets/zxbsmk/webnovel_cn)
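
As a rough sketch, this mix can be pulled with the 🤗 `datasets` library; the splits and columns are whatever each dataset ships with, and nothing below reproduces the exact preprocessing of this run:

```python
# Sketch only: fetches the three datasets listed above with their default
# splits; the actual run's preprocessing is not reproduced here.
from datasets import load_dataset

story_sets = [
    "Chamoda/atlas-storyteller-1000",
    "jaydenccc/AI_Storyteller_Dataset",
    "zxbsmk/webnovel_cn",
]

for repo_id in story_sets:
    ds = load_dataset(repo_id)  # downloads and caches the dataset
    print(repo_id, {name: len(split) for name, split in ds.items()})
```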

Following is the loss curve, updated with each training step over all four epochs.
![training_loss.png](https://huggingface.co/XeTute/Phantasor_V0.2-137M/resolve/main/training_loss.png)
Instead of AdamW, which is often used for large GPTs, we used **SGD**. This helped the model generalize better, which can be seen when prompting it outside the training data.
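
To make that optimizer choice concrete, here is how the swap looks in plain PyTorch; the module and the learning rates are placeholders, not the values used for this run:

```python
# Placeholder module and learning rates; only the optimizer choice itself
# reflects what this card describes.
import torch

model = torch.nn.Linear(768, 768)  # stand-in for the full 137M-parameter network

# The common default for large GPTs: adaptive per-parameter moments
# plus decoupled weight decay.
adamw = torch.optim.AdamW(model.parameters(), lr=5e-5, weight_decay=0.01)

# What this run used instead: plain SGD applies the raw gradient to every
# trainable parameter, with no per-parameter moment estimates.
sgd = torch.optim.SGD(model.parameters(), lr=1e-3)
```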

## Finished Model
- ~137M Parameters, all of which are trainable
- 1024-token (1k) context length, all of which was used during training
- A loss of ~1.2 on all samples (see Files => train_results.json)

This is very good performance for a V0.2.
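
Both figures can be checked straight from the checkpoint. The sketch below assumes a GPT-2-style config that exposes the context length as `max_position_embeddings`:

```python
# Quick sanity check of the parameter count and context length listed above;
# assumes the config stores the context length as max_position_embeddings.
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "XeTute/Phantasor_V0.2-137M"
config = AutoConfig.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

print(f"parameters: {model.num_parameters():,}")            # ~137M
print(f"context length: {config.max_position_embeddings}")  # 1024
```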

# Our platforms
## Socials
[BlueSky](https://bsky.app/profile/xetute.bsky.social) | [YouTube](https://www.youtube.com/@XeTuteTechnologies) | [HuggingFace 🤗](https://huggingface.co/XeTute) | [Ko-Fi / Financially Support Us](https://ko-fi.com/XeTute)

## Our Platforms
[Our Webpage](https://xetute.com) | [PhantasiaAI](https://xetute.com/PhantasiaAI)

Have a great day!