base_model: rinna/llama-3-youko-8b
pipeline_tag: translation
---

# VNTL
This is a [LLaMA 3 Youko](https://huggingface.co/rinna/llama-3-youko-8b) QLoRA created using a slightly modified version of the [VNTL-v3.1-1k](https://huggingface.co/datasets/lmg-anon/VNTL-v3.1-1k) dataset, concatenated with the [VNTL-Chat](https://huggingface.co/datasets/lmg-anon/VNTL-Chat) dataset.

This was trained with mostly the same hyperparameters as the [VNTL 7B v0.3.1 lora](https://huggingface.co/lmg-anon/vntl-7b-v0.3.1-lora); the differences are:
- Added \["<\<METADATA>>", "<\<TRANSLATE>>", "<\<JAPANESE>>", "<\<ENGLISH>>", "<\<CHAT>>", "<\<HUMAN>>", "<\<LLM>>"\] as special tokens.
- Trained the \["embed_tokens", "lm_head"\] layers.
- 10x smaller learning rate (0.00065 -> 0.000065).
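
In PEFT-style tooling, these differences map to a tokenizer call and two training options. A minimal configuration sketch of that mapping (assumed shape, not the author's actual training script):

```python
# Configuration sketch mirroring the bullet list above; not the author's
# actual training code, just the shapes these settings usually take.
special_tokens = {
    "additional_special_tokens": [
        "<<METADATA>>", "<<TRANSLATE>>", "<<JAPANESE>>",
        "<<ENGLISH>>", "<<CHAT>>", "<<HUMAN>>", "<<LLM>>",
    ]
}  # would be passed to tokenizer.add_special_tokens(...)

modules_to_save = ["embed_tokens", "lm_head"]  # trained in full, not via LoRA
learning_rate = 0.000065  # 10x smaller than VNTL 7B v0.3.1's 0.00065
```

Training `embed_tokens` and `lm_head` alongside the adapter is what makes the newly added special tokens learnable at all; frozen embeddings would leave them as noise.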

This version also includes a new "chat mode", which was lazily trained just to find out how it would impact the end result. It ended up quite good at breaking down and explaining Japanese sentences, but it is terrible at most other things, which is expected given the nature of the VNTL-Chat dataset. To be honest, I wasn't aiming for it to be good, so the fact that it works at all is very nice.

**Eval Loss**: 0.8

## Translation Prompt

This is a prompt example for translation:
```
<<METADATA>>
[character] Name: Uryuu Shingo (瓜生 新吾) | Gender: Male | Aliases: Onii-chan (お兄ちゃん)
[character] Name: Uryuu Sakuno (瓜生 桜乃) | Gender: Female
<<TRANSLATE>>
<<JAPANESE>>
[桜乃]: 「……ごめん」
<<ENGLISH>>
[Sakuno]: 「... Sorry.」<|end_of_text|>
<<JAPANESE>>
[新吾]: 「ううん、こう言っちゃなんだけど、迷子でよかったよ。桜乃は可愛いから、いろいろ心配しちゃってたんだ、俺」
<<ENGLISH>>
```

The generated translation for that prompt, with temperature 0, is:
```
[Shingo]: 「No, I know this isn't the best way to say it, but I'm glad you got lost. You're so cute that I was worried about you.」
```
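
The flat prompt format above is easy to assemble programmatically; a minimal sketch of a builder (the helper name and its arguments are illustrative, not part of the model's tooling):

```python
# Illustrative helper that assembles the VNTL translation prompt shown above.
def build_translation_prompt(metadata, pairs, next_japanese):
    """metadata: list of "[character] ..." lines;
    pairs: list of (japanese, english) lines already translated;
    next_japanese: the line the model should translate next."""
    lines = ["<<METADATA>>"]
    lines += metadata
    lines.append("<<TRANSLATE>>")
    for ja, en in pairs:
        lines.append("<<JAPANESE>>")
        lines.append(ja)
        lines.append("<<ENGLISH>>")
        lines.append(en + "<|end_of_text|>")  # close each finished pair
    lines.append("<<JAPANESE>>")
    lines.append(next_japanese)
    lines.append("<<ENGLISH>>")  # generation continues from here
    return "\n".join(lines)

prompt = build_translation_prompt(
    metadata=["[character] Name: Uryuu Sakuno (瓜生 桜乃) | Gender: Female"],
    pairs=[("[桜乃]: 「……ごめん」", "[Sakuno]: 「... Sorry.」")],
    next_japanese="[新吾]: 「ううん、こう言っちゃなんだけど、迷子でよかったよ」",
)
```

Ending the prompt at `<<ENGLISH>>` is what cues the model to emit the next translation line.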

## Chat Prompt

This is a prompt example for chat:
```
<<CHAT>>
<<HUMAN>>
Hi! Can you break this down to me? Like, what is the meaning of "桜乃は可愛いから"?
"ううん、こう言っちゃなんだけど、迷子でよかったよ。桜乃は可愛いから、いろいろ心配しちゃってたんだ、俺"
<<LLM>>
```

The generated response for that prompt, with temperature 0, is:
```
Certainly! The phrase "桜乃は可愛いから" literally translates to "Because Sakura-no is cute." In Japanese culture, there is a strong emphasis on physical attractiveness and cuteness, especially when it comes to young girls or women. This phrase suggests that the speaker finds Sakura-no (presumably a girl) to be very attractive and endearing in appearance.
```
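
The chat mode uses the same flat tag format as translation, just with `<<CHAT>>`, `<<HUMAN>>`, and `<<LLM>>`. A minimal sketch of a builder for it (again, the helper name is illustrative, not part of the model's tooling):

```python
# Illustrative helper that assembles the VNTL chat-mode prompt shown above.
def build_chat_prompt(turns):
    """turns: list of (role, text) tuples, role being "HUMAN" or "LLM"."""
    lines = ["<<CHAT>>"]
    for role, text in turns:
        lines.append("<<%s>>" % role)
        lines.append(text)
    lines.append("<<LLM>>")  # cue the model to produce its answer
    return "\n".join(lines)

chat = build_chat_prompt([
    ("HUMAN", 'Hi! Can you break this down to me? '
              'Like, what is the meaning of "桜乃は可愛いから"?'),
])
```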