---
language:
- en
license: cc-by-nc-4.0
pipeline_tag: text-generation
model-index:
- name: quantum-dpo-v0.1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 72.53
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 88.37
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.29
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 69.92
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 82.32
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 70.81
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
---
quantumaikr/quantum-dpo-v0.1
Usage
Start chatting with quantumaikr/quantum-dpo-v0.1 using the following code snippet:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the model in half precision; device_map="auto" places the weights on the available device(s).
tokenizer = AutoTokenizer.from_pretrained("quantumaikr/quantum-dpo-v0.1")
model = AutoModelForCausalLM.from_pretrained("quantumaikr/quantum-dpo-v0.1", torch_dtype=torch.float16, device_map="auto")

# Build the prompt from a system prompt and a user message.
system_prompt = "You are QuantumLM, an AI that follows instructions extremely well. Help as much as you can. Remember, be safe, and don't do anything illegal."
message = "Write me a poem please"
prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{message}[/INST]"

# Move the inputs to the model's device instead of hard-coding "cuda".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Sample a response and decode it back to text.
output = model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.95, top_k=30, max_new_tokens=2048)
print(tokenizer.decode(output[0], skip_special_tokens=True))
QuantumLM should be used with this prompt format:
### System:
This is a system prompt, please behave and help the user.
### User:
Your prompt here
### Assistant
The output of QuantumLM
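The layout above can be assembled with a small helper. This is a minimal sketch, not part of the model's tooling: `build_prompt` is a hypothetical name, and the single-newline separators are an assumption that simply mirrors the lines as listed above. Note that the usage snippet builds its prompt with `[INST]` and `<<SYS>>` markers instead; this sketch only covers the `###` layout.

```python
def build_prompt(system_prompt: str, message: str) -> str:
    """Assemble a prompt in the '### System / ### User / ### Assistant' layout.

    Assumption: sections are separated by single newlines, as listed in the
    model card; the model's reply is expected to follow the final header.
    """
    return (
        f"### System:\n{system_prompt}\n"
        f"### User:\n{message}\n"
        "### Assistant\n"
    )


prompt = build_prompt(
    "This is a system prompt, please behave and help the user.",
    "Write me a poem please",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the `prompt` variable from the usage snippet.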
Use and Limitations
Intended Use
This model is intended for research use only, in accordance with the CC BY-NC-4.0 license.
Limitations and bias
Although fine-tuning helps steer the base language model toward "safer" distributions of text, it cannot mitigate all bias and toxicity. Users should be mindful of such issues in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use the model responsibly.
Contact us: [email protected]
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 74.87 |
| AI2 Reasoning Challenge (25-Shot) | 72.53 |
| HellaSwag (10-Shot) | 88.37 |
| MMLU (5-Shot) | 65.29 |
| TruthfulQA (0-shot) | 69.92 |
| Winogrande (5-shot) | 82.32 |
| GSM8k (5-shot) | 70.81 |
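The "Avg." row appears to be the unweighted mean of the six benchmark scores. The short check below reproduces it from the values in the table; the variable names are illustrative only, and the unweighted-mean reading is an assumption that the arithmetic happens to confirm.

```python
# Benchmark scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 72.53,
    "HellaSwag (10-Shot)": 88.37,
    "MMLU (5-Shot)": 65.29,
    "TruthfulQA (0-shot)": 69.92,
    "Winogrande (5-shot)": 82.32,
    "GSM8k (5-shot)": 70.81,
}

# Unweighted mean across the six benchmarks, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 74.87
```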