KaraKaraWitch committed
Commit 054c1c1 · verified · 1 parent: 369c8cb

Update README.md

Files changed (1): README.md (+53 −12)
README.md CHANGED
@@ -11,30 +11,47 @@ library_name: transformers
  tags:
  - mergekit
  - merge

  ---
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [Fizzarolli/L3.1-70b-glitz-v0.2](https://huggingface.co/Fizzarolli/L3.1-70b-glitz-v0.2) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
  * [abacusai/Dracarys-Llama-3.1-70B-Instruct](https://huggingface.co/abacusai/Dracarys-Llama-3.1-70B-Instruct)
  * [Sao10K/L3-70B-Euryale-v2.1](https://huggingface.co/Sao10K/L3-70B-Euryale-v2.1)
  * [gbueno86/Cathallama-70B](https://huggingface.co/gbueno86/Cathallama-70B)
  * [sophosympatheia/New-Dawn-Llama-3.1-70B-v1.1](https://huggingface.co/sophosympatheia/New-Dawn-Llama-3.1-70B-v1.1)
  * [nothingiisreal/L3.1-70B-Celeste-V0.1-BF16](https://huggingface.co/nothingiisreal/L3.1-70B-Celeste-V0.1-BF16)
- * [cyberagent/Llama-3.1-70B-Japanese-Instruct-2407](https://huggingface.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:

  ```yaml

@@ -52,5 +69,29 @@ base_model: Fizzarolli/L3.1-70b-glitz-v0.2
  parameters:
    normalize: true
  dtype: bfloat16
-
  ```
  tags:
  - mergekit
  - merge
+ - abacusai/Dracarys-Llama-3.1-70B-Instruct
+ - Sao10K/L3-70B-Euryale-v2.1
+ - gbueno86/Cathallama-70B
+ - sophosympatheia/New-Dawn-Llama-3.1-70B-v1.1
+ - nothingiisreal/L3.1-70B-Celeste-V0.1-BF16
+ - Fizzarolli/L3.1-70b-glitz-v0.2
+ - cyberagent/Llama-3.1-70B-Japanese-Instruct-2407

  ---
+ # KaraKaraWitch/L3.1-70b-Inori
+
+ Inori is the second 70B I'm playing around with this weekend, learning from the first.
+
+ ![](Inori.png)
+
+ L3.1-70b-Inori is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
  * [abacusai/Dracarys-Llama-3.1-70B-Instruct](https://huggingface.co/abacusai/Dracarys-Llama-3.1-70B-Instruct)
  * [Sao10K/L3-70B-Euryale-v2.1](https://huggingface.co/Sao10K/L3-70B-Euryale-v2.1)
  * [gbueno86/Cathallama-70B](https://huggingface.co/gbueno86/Cathallama-70B)
  * [sophosympatheia/New-Dawn-Llama-3.1-70B-v1.1](https://huggingface.co/sophosympatheia/New-Dawn-Llama-3.1-70B-v1.1)
  * [nothingiisreal/L3.1-70B-Celeste-V0.1-BF16](https://huggingface.co/nothingiisreal/L3.1-70B-Celeste-V0.1-BF16)
+ * [cyberagent/Llama-3.1-70B-Japanese-Instruct-2407](https://huggingface.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407)
+
+ 70b Inori takes a different approach, using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method:
+
+ - Dracarys (I just threw it in, but it can be useful for code)
+ - Euryale (you all know it!)
+ - Cathallama (Athene + turboderp_cat)
+ - New Dawn (I heard people like it, so I added it in)
+ - Celeste (RP)
+ - Japanese-Instruct (better Japanese for the weebs out there)

+ No Hermes was harmed in the making of this Model Stock merge.

+ ## Yap / Chat Format
+
+ L3 Instruct.
+
+ ## 🧩 Configuration

  ```yaml
⋮
  parameters:
    normalize: true
  dtype: bfloat16
  ```
+
+ ## 💻 Usage
+
+ ```python
+ # pip install -qU transformers accelerate
+
+ import torch
+ import transformers
+ from transformers import AutoTokenizer
+
+ model = "KaraKaraWitch/L3.1-70b-Inori"
+ messages = [{"role": "user", "content": "What is a large language model?"}]
+
+ tokenizer = AutoTokenizer.from_pretrained(model)
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+
+ outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+ print(outputs[0]["generated_text"])
+ ```
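
As background for the Model Stock method the card mentions: the idea is to average the fine-tuned weights and then pull that average back toward the base model, with a ratio derived from how well the fine-tuned "task vectors" agree. Below is a minimal per-tensor NumPy sketch based on my reading of the Model Stock paper (arXiv:2403.19522); it is illustrative only and is not mergekit's actual implementation, and the function name is made up here.

```python
import numpy as np

def model_stock_merge(base, finetuned):
    """Toy per-tensor sketch of Model Stock merging.

    base: the base-model tensor (the role glitz plays above);
    finetuned: list of k fine-tuned tensors of the same shape.
    """
    k = len(finetuned)
    deltas = [w - base for w in finetuned]
    # Average pairwise cosine similarity between task vectors (w_i - base).
    cos = []
    for i in range(k):
        for j in range(i + 1, k):
            a, b = deltas[i].ravel(), deltas[j].ravel()
            cos.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    cos_theta = float(np.mean(cos))
    # Interpolation ratio toward the fine-tuned average; t -> 1 as models agree.
    t = (k * cos_theta) / (1.0 + (k - 1) * cos_theta)
    w_avg = np.mean(finetuned, axis=0)
    # Merged weight: keep (1 - t) of the base, t of the fine-tuned average.
    return t * w_avg + (1.0 - t) * base
```

When the fine-tuned models point in nearly the same direction (cosine ≈ 1), t ≈ 1 and the result is close to their plain average; the more they disagree, the more weight stays on the base model.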