neph1 committed
Commit 8ed8b7c
1 Parent(s): fc637b9

Update README.md

Files changed (1): README.md +11 -7
README.md CHANGED
@@ -8,12 +8,14 @@ language:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/653cd3049107029eb004f968/pLcriXAfp3Y9Z0RGwwVUB.png)

+ Updated 240124: Dataset: 11300 rows. Rank/alpha: 32/64. Includes a set of "summarize" tasks and longer "essay"-style input. The dataset for the 240112 update had about 2000 duplicated rows, sadly.
+
Updated 240112: Bigger dataset. Validation set. Rank/alpha: 16/32. 2k context length. Please note that the unquantized version is NOT updated.

- QLoRA trained for 2 epochs on 9600 rows of Q&A from around 1300 Wikipedia pages, plus around 100 Python questions and examples from
+ QLoRA trained for 2 epochs on 11300 rows of Q&A, plus around 100 Python questions and examples from
neph1/Alpaca-Lora-GPT4-Swedish-Refined (because I had spent so much time cleaning them and didn't want to throw them away). Also a couple of hundred rows of manually
gathered examples and some generated using ChatGPT.
- The dataset was otherwise generated using gpt-3.5-turbo.
+ The dataset was otherwise generated using gpt-3.5-turbo and Mixtral 8x7b (about one third).

The goal is to improve knowledge of Swedish topics while improving the quality of the language.

@@ -24,20 +26,22 @@ As with any bard, what this model says should be taken with a grain of salt. Eve

Configuration:

- Rank: 16
+ Rank: 32

- Alpha: 32
+ Alpha: 64

Dropout: 0.1

- Learning rate: 3e-5
+ Learning rate (at start): 2e-5

Context length: 2048

- Prompt format: ```[INST] Hur bakar jag sockerkaka?[/INST]```
+ Training length: ca. 2.1 epochs
+
+ Prompt format: ```[INST]Hur bakar jag sockerkaka?[/INST]```


- An absolutely beautiful example (first try). Sadly it's not always as good. (gguf q8, temp: 0.7, llama.cpp):
+ Example (240112 version). Sadly it's not always as good. (gguf q8, temp: 0.7, llama.cpp):
```
User: Vem är statsminister i Sverige?
```
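
For context, here is a minimal sketch of what the configuration block above could look like as a QLoRA setup in PEFT/bitsandbytes terms. The base model name and target modules are assumptions (the card doesn't state them); only the rank, alpha, dropout, learning rate, and context length come from the card.

```python
# Hypothetical QLoRA setup mirroring the configuration block above:
# rank 32, alpha 64, dropout 0.1, starting LR 2e-5, 2048-token context.
# The base model and target modules are assumptions, not from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_model = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed base model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit quantization: the "Q" in QLoRA
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.model_max_length = 2048          # Context length: 2048

lora_config = LoraConfig(
    r=32,                                  # Rank: 32
    lora_alpha=64,                         # Alpha: 64
    lora_dropout=0.1,                      # Dropout: 0.1
    target_modules=["q_proj", "v_proj"],   # assumed; not stated on the card
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# A Trainer would then run with learning_rate=2e-5 for roughly 2.1 epochs.
```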
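And a sketch of querying a gguf build with the `[INST] ... [/INST]` prompt format via llama-cpp-python, roughly matching the example settings (q8 quantization, temperature 0.7). The model filename is a placeholder:

```python
# Minimal llama-cpp-python sketch for the [INST] ... [/INST] prompt format.
# The model path is a placeholder; temperature 0.7 matches the card's example.
from llama_cpp import Llama

llm = Llama(model_path="bellman-mistral-q8_0.gguf", n_ctx=2048)  # placeholder filename

# "Vem är statsminister i Sverige?" = "Who is the prime minister of Sweden?"
prompt = "[INST]Vem är statsminister i Sverige?[/INST]"
output = llm(prompt, max_tokens=256, temperature=0.7)
print(output["choices"][0]["text"])
```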