---
language:
- en
license: cc-by-nc-4.0
pipeline_tag: text-generation
model-index:
- name: quantum-dpo-v0.1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 72.53
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 88.37
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.29
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 69.92
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 82.32
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 70.81
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
      name: Open LLM Leaderboard
---

# quantumaikr/quantum-dpo-v0.1

## Usage

Start chatting with `quantumaikr/quantum-dpo-v0.1` using the following code snippet:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("quantumaikr/quantum-dpo-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "quantumaikr/quantum-dpo-v0.1",
    torch_dtype=torch.float16,
    device_map="auto",
)

system_prompt = "You are QuantumLM, an AI that follows instructions extremely well. Help as much as you can. Remember, be safe, and don't do anything illegal."
message = "Write me a poem please"
prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{message} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.95, top_k=30, max_new_tokens=2048)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

QuantumLM should be used with this prompt format:

```
### System:
This is a system prompt, please behave and help the user.

### User:
Your prompt here

### Assistant:
The output of QuantumLM
```
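As a convenience, the format above can be assembled with a small helper. This is a minimal sketch, not part of the released model: the `chat` function name and its sampling defaults are illustrative assumptions, and it reuses the `model` and `tokenizer` objects from the Usage snippet.

```python
# Minimal sketch of a helper for the "### System / ### User / ### Assistant"
# prompt format. The function name and defaults are illustrative assumptions.
def chat(system_prompt: str, user_message: str, max_new_tokens: int = 512) -> str:
    prompt = (
        f"### System:\n{system_prompt}\n\n"
        f"### User:\n{user_message}\n\n"
        f"### Assistant:\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        max_new_tokens=max_new_tokens,
    )
    # Decode only the newly generated tokens, skipping the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

print(chat("You are QuantumLM, a helpful assistant.", "Write me a poem please"))
```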
## Use and Limitations

### Intended Use

These models are intended for research only, in adherence with the [CC BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/) license.

### Limitations and bias

Although the fine-tuning data helps to steer the base language model toward "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask that users be mindful of such potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use the model responsibly.

Contact us: hi@quantumai.kr

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_quantumaikr__quantum-dpo-v0.1).

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 74.87 |
| AI2 Reasoning Challenge (25-Shot) | 72.53 |
| HellaSwag (10-Shot)               | 88.37 |
| MMLU (5-Shot)                     | 65.29 |
| TruthfulQA (0-shot)               | 69.92 |
| Winogrande (5-shot)               | 82.32 |
| GSM8k (5-shot)                    | 70.81 |
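These scores come from the Open LLM Leaderboard, which runs EleutherAI's lm-evaluation-harness. A minimal sketch for checking one task locally is below; it assumes the v0.4 Python API (`lm_eval.simple_evaluate`), and since the leaderboard pins a specific harness revision and settings, local numbers may differ slightly.

```python
# Sketch: evaluate one leaderboard task with EleutherAI's lm-evaluation-harness.
# Assumes `pip install lm-eval` (v0.4+); leaderboard-pinned settings may differ.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=quantumaikr/quantum-dpo-v0.1,dtype=float16",
    tasks=["arc_challenge"],
    num_fewshot=25,
    batch_size=4,
)
print(results["results"]["arc_challenge"])  # includes acc_norm
```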