model card (#2)
Browse files- model card (6e9f0120c770ab0673fa0e7e8167944760c122dc)
- Update README.md (3eb300a386706c34488f0a8574069914612da60f)
- Update README.md (fe76a6bdece1f287e898501a542feafb66e3cfb4)
- banners (f31f0417ad879322b60b7f12af8afefe04ccd840)
Co-authored-by: scott <[email protected]>
README.md
CHANGED
@@ -1,3 +1,68 @@
|
|
1 |
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2 |
license: apache-2.0
|
3 |
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
---
|
2 |
+
language:
|
3 |
+
- en
|
4 |
+
pipeline_tag: text-generation
|
5 |
+
tags:
|
6 |
+
- esper
|
7 |
+
- dev-ops
|
8 |
+
- developer
|
9 |
+
- code
|
10 |
+
- code-instruct
|
11 |
+
- valiant
|
12 |
+
- valiant-labs
|
13 |
+
- code-llama
|
14 |
+
- llama
|
15 |
+
- llama-2
|
16 |
+
- llama-2-chat
|
17 |
+
- 70b
|
18 |
+
model_type: llama
|
19 |
license: apache-2.0
|
20 |
---
|
21 |
+
|
22 |
+
|
23 |
+
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/64f267a8a4f79a118e0fcc89/4I6oK8DG0so4VD8GroFsd.jpeg)
|
24 |
+
|
25 |
+
Esper-70b is the DevOps code specialist!
|
26 |
+
- Overall code capabilities with a DevOps focus: specialized in scripting language code, Terraform files, Dockerfiles, YAML, and more!
|
27 |
+
- Also trained on further code-instruct and chat-instruct data for generally improved chat quality.
|
28 |
+
- Built on llama-2-70b architecture, using [CodeLlama-70b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf) as the base model.
|
29 |
+
|
30 |
+
(If you're looking for a friendly general-purpose chat model, try ours: [llama-13b](https://huggingface.co/ValiantLabs/ShiningValiantXS) and [70b](https://huggingface.co/ValiantLabs/ShiningValiant))
|
31 |
+
|
32 |
+
## Version
|
33 |
+
|
34 |
+
This is Version **1.0** of Esper-70b.
|
35 |
+
|
36 |
+
The current version of Esper-70b uses [CodeLlama-70b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf) trained on two sets of data:
|
37 |
+
- code from [bigcode/the-stack-dedup](https://huggingface.co/datasets/bigcode/the-stack-dedup), with our sub-selection focused on scripting languages, Terraform/build scripts, and YAML files.
|
38 |
+
- our private data for general code-instruct performance, chat-quality response, and user satisfaction. (A portion of this data was also used in [Shining Valiant 1.4](https://huggingface.co/ValiantLabs/ShiningValiant), our previous general-purpose Llama 70b finetune.)
|
39 |
+
|
40 |
+
Esper-70b is the newest release in our Build Tools campaign, to deliver helpful open source capabilities for users and creators. We're working on more tools to come! For everyone to use :)
|
41 |
+
|
42 |
+
We're planning on continually upgrading this model with more data, to improve existing capabilities and add new ones relevant to a DevOps user base.
|
43 |
+
|
44 |
+
## Prompting Guide
|
45 |
+
Esper-70b uses the following recommended chat format, based on CodeLlama-70b chat format:
|
46 |
+
|
47 |
+
|
48 |
+
Source: system\n\n You are Esper, an expert technical assistant AI. Provide high quality code to the user. <step> Source: user\n\n Hi! Can you explain this Terraform code, thank you:
|
49 |
+
|
50 |
+
|
51 |
+
(Generally, anything that works with [CodeLlama-70b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-70b-Instruct-hf) will work with Esper-70b.)
|
52 |
+
|
53 |
+
|
54 |
+
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/63444f2687964b331809eb55/VCJ8Fmefd8cdVhXSSxJiD.jpeg)
|
55 |
+
|
56 |
+
|
57 |
+
Esper-70b is created by [Valiant Labs.](http://valiantlabs.ca/)
|
58 |
+
|
59 |
+
Try our flagship chat model, [Shining Valiant!](https://huggingface.co/ValiantLabs/ShiningValiant)
|
60 |
+
|
61 |
+
Check out our function-calling model [Fireplace](https://huggingface.co/ValiantLabs/Fireplace-13b) for Llama-13b!
|
62 |
+
|
63 |
+
[Follow us on X for updates on our models!](https://twitter.com/valiant_labs)
|
64 |
+
|
65 |
+
We care about open source.
|
66 |
+
For everyone to use.
|
67 |
+
|
68 |
+
We encourage others to finetune further from our models.
|