---
base_model:
- beomi/Llama-3-KoEn-8B-Instruct-preview
- Danielbrdz/Barcenas-Llama3-8b-ORPO
- maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
- rombodawg/Llama-3-8B-Instruct-Coder
- NousResearch/Meta-Llama-3-8B-Instruct
- rombodawg/Llama-3-8B-Base-Coder-v3.5-10k
- cognitivecomputations/dolphin-2.9-llama3-8b
- asiansoul/Llama-3-Open-Ko-Linear-8B
- NousResearch/Meta-Llama-3-8B
- aaditya/Llama3-OpenBioLLM-8B
library_name: transformers
tags:
- mergekit
- merge
---

# Joah-Llama-3-KoEn-8B-Coder-v1

<a href="https://ibb.co/2Srsmn7"><img src="https://i.ibb.co/f9WnB1Y/Screenshot-2024-05-11-at-7-15-42-PM.png" alt="Screenshot-2024-05-11-at-7-15-42-PM" border="0"></a>

Starting today, your Merge Model, here to be a light for one another.

"좋아(Joah)" by AsianSoul

Coming soon: a multi-language model merge based on this one, starting with German (Korean / English / German).

Where to use Joah: medical, Korean, English, translation, code, science...

## Merge Details

The performance of this merged model doesn't seem bad, though that is just my opinion. ^^

This may not be a model that satisfies you. But if we keep overcoming our shortcomings, won't we someday find the answer we want?

Don't worry even if you don't get the results you want. I'll find the answer for you.

Coming soon: real PoSE to extend Llama's context length to 64k, using my merge method, [reborn](https://medium.com/@puffanddmx82/reborn-elevating-model-adaptation-with-merging-for-superior-nlp-performance-f604e8e307b2).

So far, I have found that most merged models out there do not actually have 64k in their configs. I will improve on this in the next merge with reborn. If that doesn't work, I guess I'll have to find another way, right?

256k is not possible: my computer runs out of memory.

If you support me, I will try it on a machine with maximum specifications. I would also like to run large-scale tests for you by building a network with high-capacity traffic and 10G speeds.

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as the base.
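
To make the idea concrete, here is a toy sketch of what a DARE + TIES merge does to a handful of parameters, written in plain Python. It is an illustration under simplifying assumptions, not mergekit's actual implementation: the function names are made up for this example, parameters are flat lists of floats rather than tensors, and options such as weight normalization are left out. DARE keeps each entry of a task vector (fine-tuned minus base) with probability `density` and rescales the survivors by `1/density`; TIES then elects a sign per parameter and keeps only the contributions that agree with it.

```python
import random


def dare_prune(delta, density, rng):
    """DARE step: keep each entry with probability `density`, rescale survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def dare_ties_merge(base, finetuned, weights, densities, seed=0):
    """Toy DARE + TIES merge over flat parameter lists (hypothetical helper)."""
    rng = random.Random(seed)
    deltas = []
    for ft, w, dens in zip(finetuned, weights, densities):
        # Task vector: what fine-tuning changed relative to the base model.
        delta = [f - b for f, b in zip(ft, base)]
        deltas.append([w * d for d in dare_prune(delta, dens, rng)])

    merged = []
    for i, b in enumerate(base):
        column = [d[i] for d in deltas]
        # TIES sign election: the sign of the summed weighted deltas wins.
        elected = sign(sum(column))
        # Only contributions that agree with the elected sign are added.
        merged.append(b + sum(v for v in column if sign(v) == elected))
    return merged


# With density 1.0 nothing is dropped, so the result is deterministic.
base = [0.0, 0.0, 0.0]
finetuned = [[1.0, -1.0, 2.0], [1.0, 1.0, -1.0]]
result = dare_ties_merge(base, finetuned, weights=[0.5, 0.5], densities=[1.0, 1.0])
print(result)  # → [1.0, 0.0, 1.0]
```

In the second position the two models disagree with equal strength, so no sign wins and the base value is kept; in the third, the conflicting negative contribution is discarded.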

### Models Merged

The following models were included in the merge:

* [beomi/Llama-3-KoEn-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-KoEn-8B-Instruct-preview)
* [Danielbrdz/Barcenas-Llama3-8b-ORPO](https://huggingface.co/Danielbrdz/Barcenas-Llama3-8b-ORPO)
* [maum-ai/Llama-3-MAAL-8B-Instruct-v0.1](https://huggingface.co/maum-ai/Llama-3-MAAL-8B-Instruct-v0.1)
* [rombodawg/Llama-3-8B-Instruct-Coder](https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder)
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
* [rombodawg/Llama-3-8B-Base-Coder-v3.5-10k](https://huggingface.co/rombodawg/Llama-3-8B-Base-Coder-v3.5-10k)
* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
* [asiansoul/Llama-3-Open-Ko-Linear-8B](https://huggingface.co/asiansoul/Llama-3-Open-Ko-Linear-8B)
* [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B)

### Ollama

Modelfile_Q5_K_M

```
FROM joah-llama-3-koen-8b-coder-v1-Q5_K_M.gguf
TEMPLATE """
{{- if .System }}
system
<s>{{ .System }}</s>
{{- end }}
user
<s>Human:
{{ .Prompt }}</s>
assistant
<s>Assistant:
"""

# System prompt (Korean). In English: "As a kind chatbot, answer the other
# person's requests in as much detail and as kindly as possible.
# Give every answer in Korean."
SYSTEM """
친절한 챗봇으로서 상대방의 요청에 최대한 자세하고 친절하게 답하자. 모든 대답은 한국어(Korean)으로 대답해줘.
"""

PARAMETER temperature 0.7
PARAMETER num_predict 3000
PARAMETER num_ctx 4096
PARAMETER stop "<s>"
PARAMETER stop "</s>"
```

```
ollama create joah -f ./Modelfile_Q5_K_M
```

`Modelfile_Q5_K_M` is the default. I have uploaded several other files to my repo, so I hope you will test them too: edit the Modelfile to point at the one you want and run `ollama create` again.
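
Once the model has been created, it can also be queried from a script through Ollama's local REST API. The following is a minimal sketch, assuming the Ollama server is running on its default local port 11434 and the model was created under the name `joah` as in the command above; the helper names `build_generate_request` and `generate` are made up for this example.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumption: not changed).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="joah"):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="joah"):
    """Send the prompt to the local server and return the reply text."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Nothing is sent until `generate` is called, for example `generate("Write a Python function that reverses a string.")`. The system prompt and sampling parameters from the Modelfile apply automatically, since they are baked into the `joah` model.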

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # Base model providing a general foundation without specific parameters

  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60
      weight: 0.25

  - model: beomi/Llama-3-KoEn-8B-Instruct-preview
    parameters:
      density: 0.55
      weight: 0.15

  - model: asiansoul/Llama-3-Open-Ko-Linear-8B
    parameters:
      density: 0.55
      weight: 0.2

  - model: maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
    parameters:
      density: 0.55
      weight: 0.1

  - model: rombodawg/Llama-3-8B-Instruct-Coder
    parameters:
      density: 0.55
      weight: 0.1

  - model: rombodawg/Llama-3-8B-Base-Coder-v3.5-10k
    parameters:
      density: 0.55
      weight: 0.1

  - model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.55
      weight: 0.05

  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
    parameters:
      density: 0.55
      weight: 0.05

  - model: aaditya/Llama3-OpenBioLLM-8B
    parameters:
      density: 0.55
      weight: 0.1

merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: bfloat16
```
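
As a reading aid for the configuration above, the per-model `weight` values can be tallied to see how much each source model contributes relative to the others. The sketch below simply copies the nine weights from the YAML; the variable names are illustrative, and it makes no claim about how mergekit itself scales or normalizes these weights during the merge.

```python
# Weights copied from the YAML configuration above (the base model has none).
weights = {
    "NousResearch/Meta-Llama-3-8B-Instruct": 0.25,
    "beomi/Llama-3-KoEn-8B-Instruct-preview": 0.15,
    "asiansoul/Llama-3-Open-Ko-Linear-8B": 0.20,
    "maum-ai/Llama-3-MAAL-8B-Instruct-v0.1": 0.10,
    "rombodawg/Llama-3-8B-Instruct-Coder": 0.10,
    "rombodawg/Llama-3-8B-Base-Coder-v3.5-10k": 0.10,
    "cognitivecomputations/dolphin-2.9-llama3-8b": 0.05,
    "Danielbrdz/Barcenas-Llama3-8b-ORPO": 0.05,
    "aaditya/Llama3-OpenBioLLM-8B": 0.10,
}

total = sum(weights.values())
# Relative share of each model if the weights were rescaled to sum to 1.
shares = {name: w / total for name, w in weights.items()}

print(f"total weight: {total:.2f}")
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{share:6.1%}  {name}")
```

The weights add up to 1.10 rather than 1.0, and the general-purpose Instruct model carries the largest single share, followed by the Korean-focused models.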