---
language:
- en
license: cc-by-nc-4.0
pipeline_tag: text-generation
model-index:
- name: quantum-dpo-v0.1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 72.53
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 88.37
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.29
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 69.92
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 82.32
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 70.81
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
---

# quantumaikr/quantum-dpo-v0.1

## Usage

Start chatting with `quantumaikr/quantum-dpo-v0.1` using the following code snippet:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("quantumaikr/quantum-dpo-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "quantumaikr/quantum-dpo-v0.1",
    torch_dtype=torch.float16,
    device_map="auto",
)

system_prompt = "You are QuantumLM, an AI that follows instructions extremely well. Help as much as you can. Remember, be safe, and don't do anything illegal."
message = "Write me a poem please"
prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{message} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.95, top_k=30, max_new_tokens=2048)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

QuantumLM should be used with this prompt format:

```
### System:
This is a system prompt, please behave and help the user.

### User:
Your prompt here

### Assistant:
The output of QuantumLM
```
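As a convenience, the format above can be assembled with a small helper. This is a minimal sketch, not part of the released model: the `chat` function name and its sampling defaults are illustrative assumptions, and it reuses the `model` and `tokenizer` objects from the Usage snippet.

```python
# Minimal sketch of a helper for the "### System / ### User / ### Assistant"
# prompt format. The function name and defaults are illustrative assumptions.
def chat(system_prompt: str, user_message: str, max_new_tokens: int = 512) -> str:
    prompt = (
        f"### System:\n{system_prompt}\n\n"
        f"### User:\n{user_message}\n\n"
        f"### Assistant:\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        max_new_tokens=max_new_tokens,
    )
    # Decode only the newly generated tokens, skipping the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

print(chat("You are QuantumLM, a helpful assistant.", "Write me a poem please"))
```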
## Use and Limitations

### Intended Use

These models are intended for research only, in adherence with the [CC BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/) license.

### Limitations and bias

Although the fine-tuning data helps to steer the base language model toward "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask that users be mindful of such potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use the model responsibly.

Contact us: hi@quantumai.kr

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_quantumaikr__quantum-dpo-v0.1).

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 74.87 |
| AI2 Reasoning Challenge (25-Shot) | 72.53 |
| HellaSwag (10-Shot)               | 88.37 |
| MMLU (5-Shot)                     | 65.29 |
| TruthfulQA (0-shot)               | 69.92 |
| Winogrande (5-shot)               | 82.32 |
| GSM8k (5-shot)                    | 70.81 |
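These scores come from the Open LLM Leaderboard, which runs EleutherAI's lm-evaluation-harness. A minimal sketch for checking one task locally is below; it assumes the v0.4 Python API (`lm_eval.simple_evaluate`), and since the leaderboard pins a specific harness revision and settings, local numbers may differ slightly.

```python
# Sketch: evaluate one leaderboard task with EleutherAI's lm-evaluation-harness.
# Assumes `pip install lm-eval` (v0.4+); leaderboard-pinned settings may differ.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=quantumaikr/quantum-dpo-v0.1,dtype=float16",
    tasks=["arc_challenge"],
    num_fewshot=25,
    batch_size=4,
)
print(results["results"]["arc_challenge"])  # includes acc_norm
```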