---
language:
- en
tags:
- Llama-3
- instruct
- finetune
- chatml
- DPO
- RLHF
- gpt4
- synthetic data
- distillation
- function calling
- json mode
- axolotl
- llama-cpp
- gguf-my-repo
- LMEngine
base_model: NousResearch/Meta-Llama-3-8B
datasets:
- teknium/OpenHermes-2.5
widget:
- example_title: Hermes 2 Pro
  messages:
  - role: system
    content: >-
      You are a sentient, superintelligent artificial general intelligence, here
      to teach and assist me.
  - role: user
    content: >-
      Write a short story about Goku discovering kirby has teamed up with Majin
      Buu to destroy the world.
model-index:
- name: Hermes-2-Pro-Llama-3-8B
  results: []
---

# tinybiggames/Hermes-2-Pro-Llama-3-8B-Q4_K_M-GGUF

This model was converted to GGUF format from [`NousResearch/Hermes-2-Pro-Llama-3-8B`](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.

Refer to the [original model card](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B) for more details on the model.

## Use with tinyBigGAMES's [LMEngine Inference Library](https://github.com/tinyBigGAMES/LMEngine)

How to configure LMEngine:

```Delphi
Config_Init(
  'C:/LLM/gguf', // path to model files
  -1             // number of GPU layers; -1 uses all available layers
);
```

How to define model:

```Delphi
Model_Define('hermes-2-pro-llama-3-8b.Q4_K_M.gguf',
  'hermes2pro:8B:Q4KM', 8000, '<|im_start|>{role}\n{content}<|im_end|>\n',
  '<|im_start|>assistant');
```

How to add a message:

```Delphi
Message_Add(
  ROLE_USER,    // role
  'What is AI?' // content
);
```

In the template string passed to `Model_Define`:

- `{role}` is replaced with the message "role"
- `{content}` is replaced with the message "content"

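To make the placeholder substitution concrete, here is a minimal, self-contained sketch of how one message might expand into the ChatML prompt. `RenderMessage` is a hypothetical helper written for illustration and is not part of LMEngine. The sketch also assumes that the literal `\n` in the template stands for a line feed, which the library may handle differently.

```Delphi
program ChatMLTemplateSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils; // StringReplace (plain SysUtils on older compilers)

const
  // Same template and ending strings as in the Model_Define call.
  CTemplate    = '<|im_start|>{role}\n{content}<|im_end|>\n';
  CTemplateEnd = '<|im_start|>assistant';

// Hypothetical helper, not an LMEngine function: expands one message.
function RenderMessage(const ATemplate, ARole, AContent: string): string;
begin
  // Assumption: the literal "\n" in the template means a line feed.
  // It is converted first so that message text is left untouched.
  Result := StringReplace(ATemplate, '\n', #10, [rfReplaceAll]);
  Result := StringReplace(Result, '{role}', ARole, [rfReplaceAll]);
  Result := StringReplace(Result, '{content}', AContent, [rfReplaceAll]);
end;

begin
  // Rendered user message, followed by the assistant marker that
  // cues the model to start its reply.
  Write(RenderMessage(CTemplate, 'user', 'What is AI?'));
  WriteLn(CTemplateEnd);
end.
```

Under these assumptions the program prints three lines: `<|im_start|>user`, then `What is AI?<|im_end|>`, then `<|im_start|>assistant`.
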
How to do inference:

```Delphi
var
  LTokenOutputSpeed: Single;
  LInputTokens: Int32;
  LOutputTokens: Int32;
  LTotalTokens: Int32;

if Inference_Run('hermes2pro:8B:Q4KM', 1024) then
  begin
    Inference_GetUsage(nil, @LTokenOutputSpeed, @LInputTokens, @LOutputTokens,
      @LTotalTokens);
    Console_PrintLn('', FG_WHITE);
    Console_PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
      FG_BRIGHTYELLOW, LInputTokens, LOutputTokens, LTotalTokens, LTokenOutputSpeed);
  end
else
  begin
    Console_PrintLn('', FG_WHITE);
    Console_PrintLn('Error: %s', FG_RED, Error_Get());
  end;
```