---
license: llama3
library_name: transformers
tags: []
---

# Dracarys2-Llama-3.1-70B-Instruct

### Built with Meta Llama 3

# Introduction

We introduce the latest in the Smaug series: Dracarys, a family of finetunes targeting improved coding performance across a variety of base models. This variant is a finetune of [meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct).

Compared to meta-llama/Meta-Llama-3.1-70B-Instruct, Dracarys2 achieves better LiveCodeBench scores (see the evaluation results below).

### Model Description

- **Developed by:** [Abacus.AI](https://abacus.ai)
- **License:** https://llama.meta.com/llama3/license/
- **Finetuned from model:** [meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct)

## How to use

The prompt format is unchanged from Llama 3.1 70B Instruct, i.e., the standard `<|start_header_id|> ... <|eot_id|>` chat template (see the LiveCodeBench evaluation for the exact prompts used there).

### Use with transformers

See the snippet below for usage with Transformers:

```python
import transformers
import torch

model_id = "abacusai/Dracarys2-Llama-3.1-70B-Instruct"

# Load the model in bfloat16, sharded across available GPUs
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a data science coding assistant that generates Python code using Pandas and Numpy."},
    {"role": "user", "content": "Write code to select rows from the dataframe `df` having the maximum `temp` for each `city`"},
]

# Render the chat messages into the Llama 3.1 prompt format
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop generation at end-of-turn or end-of-text
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Strip the prompt from the returned text, leaving only the completion
print(outputs[0]["generated_text"][len(prompt):])
```
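For reference, one idiomatic Pandas answer to the example prompt above is sketched below. The dataframe and its values are purely illustrative, and the model's sampled output may of course differ:

```python
import pandas as pd

# Toy data standing in for the user's dataframe `df` (illustrative only)
df = pd.DataFrame({
    "city": ["NYC", "NYC", "SF", "SF"],
    "temp": [30, 35, 20, 25],
})

# idxmax() returns the index of the max `temp` within each `city` group,
# and .loc pulls those full rows back out of the dataframe
hottest = df.loc[df.groupby("city")["temp"].idxmax()]
print(hottest)
```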
# Evaluation Results

## LiveCodeBench

| Model                                | Code Generation | Code Execution | Test Output Prediction |
|--------------------------------------|-----------------|----------------|------------------------|
| **Dracarys2-Llama-3.1-70B-Instruct** | **33.44**       | 48.26          | **52.10**              |
| Meta-Llama-3.1-70B-Instruct          | 32.23           | 48.77          | 41.40                  |

## Breakdown of LiveCodeBench Code Generation

| Model                                | Easy      | Medium    | Hard     |
|--------------------------------------|-----------|-----------|----------|
| **Dracarys2-Llama-3.1-70B-Instruct** | **71.29** | **18.48** | **3.57** |
| Meta-Llama-3.1-70B-Instruct          | 68.40     | 17.99     | 3.57     |

## Breakdown of LiveCodeBench Code Execution

| Model                                | CoT       | Non-CoT |
|--------------------------------------|-----------|---------|
| **Dracarys2-Llama-3.1-70B-Instruct** | **75.55** | 48.26   |
| Meta-Llama-3.1-70B-Instruct          | 70.14     | 48.77   |

## Breakdown of LiveCodeBench Test Output Prediction

| Model                                | Easy      | Medium    | Hard      |
|--------------------------------------|-----------|-----------|-----------|
| **Dracarys2-Llama-3.1-70B-Instruct** | **63.53** | **47.30** | **43.61** |
| Meta-Llama-3.1-70B-Instruct          | 51.22     | 35.91     | 34.30     |

## LiveBench (August update)

| Model                                | Global Average | Coding Average | Reasoning Average | Mathematics Average | Data Analysis Average | Language Average | IF Average |
|--------------------------------------|----------------|----------------|-------------------|---------------------|-----------------------|------------------|------------|
| **Dracarys2-Llama-3.1-70B-Instruct** | **47.8**       | **36.3**       | **47.3**          | **38.9**            | 46.1                  | 41.5             | 76.6       |
| Meta-Llama-3.1-70B-Instruct          | 45.1           | 30.7           | 35.3              | 37.0                | 48.4                  | 42.1             | 77.2       |