Apel-sin committed on
Commit 08fb789 · 1 Parent(s): 41b4396

add measurement.json

Files changed (2):
  1. README.md +100 -0
  2. measurement.json +0 -0
README.md ADDED
---
base_model: abacusai/Dracarys2-72B-Instruct
language:
- en
license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
tags:
- chat
quantized_by: Apel-sin
---
# Dracarys2-72B-Instruct

# Introduction

We introduce the latest in the Smaug series: the Dracarys family of finetunes, targeting coding-performance improvements across a variety of base models.

This variant is a finetune of [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct).

Compared to Qwen2.5-72B-Instruct, Dracarys has better LiveCodeBench scores (see the evaluation results below).
### Model Description

- **Developed by:** [Abacus.AI](https://abacus.ai)
- **License:** [tongyi-qianwen](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE)
- **Finetuned from model:** [Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
## How to use

The prompt format is unchanged from Qwen2.5-72B-Instruct, i.e. the ChatML-style template that Qwen2.5 ships with (see the evaluation results below for the prompt details used for LiveCodeBench).
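For reference, here is a minimal sketch of the ChatML layout that Qwen2.5's chat template produces; in practice the string should always be built with `apply_chat_template`, so the messages below are illustrative placeholders only:

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Write a function that checks whether a number is prime.<|im_end|>
<|im_start|>assistant
```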
### Use with transformers

See the snippet below for usage with Transformers:

```python
import transformers
import torch

model_id = "abacusai/Dracarys2-72B-Instruct"

# Load the model in bfloat16 and shard it across the available GPUs.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a data science coding assistant that generates Python code using Pandas and Numpy."},
    {"role": "user", "content": "Write code to select rows from the dataframe `df` having the maximum `temp` for each `city`"},
]

# Render the chat into Qwen2.5's ChatML prompt format.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Stop on either the EOS token or Qwen2.5's end-of-turn token <|im_end|>.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|im_end|>")
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Print only the newly generated text, excluding the echoed prompt.
print(outputs[0]["generated_text"][len(prompt):])
```
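For the example prompt above, a correct completion would amount to the following pandas pattern (a hedged illustration; the model's actual output will vary in wording and style):

```python
import pandas as pd

# Small sample frame standing in for the user's `df`.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Cairo", "Cairo"],
    "temp": [12.0, 17.5, 35.1, 33.4],
})

# groupby("city")["temp"].idxmax() yields, per city, the index label of
# the row with the maximum temp; .loc then selects those rows.
result = df.loc[df.groupby("city")["temp"].idxmax()]
print(result)
```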
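If you would rather drive generation directly than go through `pipeline`, an equivalent sketch with `AutoModelForCausalLM` under the same assumptions (bfloat16 weights, automatic device placement) looks like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/Dracarys2-72B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Write a Python function that merges two sorted lists."},
]

# Build the input ids straight from the chat template.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the tokens generated after the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```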
# Evaluation Results

## LiveCodeBench

| Model                      | Code Generation | Code Execution (COT) | Test Output Prediction |
|----------------------------|-----------------|----------------------|------------------------|
| **Dracarys2-72B-Instruct** | **53.80**       | **89.12**            | **59.61**              |
| Qwen2.5-72B-Instruct       | 53.03           | 88.72                | 46.28                  |
## Breakdown of LiveCodeBench CodeGeneration

| Model                      | Easy      | Medium    | Hard |
|----------------------------|-----------|-----------|------|
| **Dracarys2-72B-Instruct** | **88.79** | **50.28** | 9.47 |
| Qwen2.5-72B-Instruct       | 86.99     | 49.59     | 9.99 |

## Breakdown of LiveCodeBench TestOutputPrediction

| Model                      | Easy      | Medium    | Hard      |
|----------------------------|-----------|-----------|-----------|
| **Dracarys2-72B-Instruct** | **79.25** | **53.76** | **37.63** |
| Qwen2.5-72B-Instruct       | 68.43     | 39.46     | 22.22     |
measurement.json ADDED
The diff for this file is too large to render. See raw diff