Update README.md
`README.md` (changed)
Before:

````diff
@@ -18,30 +18,19 @@ pipeline_tag: text-generation
 
 ## Model Details
 
-### Input
-- Models solely process textual input.
-
-### Output
-- Models solely generate textual output.
-
-### License
-- This model is under a **Non-commercial** Bespoke License and governed by the Meta license. You should only use this repository if you have been granted access to the model by filling out [this form](https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform), but have either lost your copy of the weights or encountered issues converting them to the Transformers format.
-
-### Where to send comments
-- Instructions on how to provide feedback or comments on a model can be found by opening an issue in the [Hugging Face community's model repository](https://huggingface.co/upstage/llama-30b-instruct-2048/discussions).
 
 ## Dataset Details
 
 ### Used Datasets
 - [openbookqa](https://huggingface.co/datasets/openbookqa)
 - [sciq](https://huggingface.co/datasets/sciq)
 - [Open-Orca/OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca)
````
Before:

````diff
@@ -60,14 +49,10 @@ pipeline_tag: text-generation
 {Assistant}
 ```
 
 ## Hardware and Software
 
-### Training Factors
-- We fine-tuned this model using a combination of the [DeepSpeed library](https://github.com/microsoft/DeepSpeed) and the [HuggingFace trainer](https://huggingface.co/docs/transformers/main_classes/trainer).
 
 ## Evaluation Results
````
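Only the final `{Assistant}` slot of the card's prompt template survives in the hunk above. As an illustration only, a template ending in an `{Assistant}` slot can be filled programmatically; the `{User}` placeholder and overall layout here are assumptions for this sketch, not the card's actual template:

```python
# Illustrative only: fill a prompt template that ends in an {Assistant} slot,
# as in the (truncated) template above. The {User} placeholder and the exact
# layout are assumptions, not the card's real template.
def build_prompt(user_message: str) -> str:
    template = "### User:\n{User}\n\n### Assistant:\n{Assistant}"
    # Leave the assistant slot empty so the model completes it.
    return template.format(User=user_message, Assistant="")

prompt = build_prompt("Summarize DeepSpeed in one sentence.")
```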
After:

````diff
@@ -18,30 +18,19 @@ pipeline_tag: text-generation
 
 ## Model Details
 
+* **Developed by**: [Upstage](https://en.upstage.ai)
+* **Backbone Model**: [LLaMA](https://github.com/facebookresearch/llama/tree/llama_v1)
+* **Variations**: It has different model parameter sizes and sequence lengths: [30B/1024](https://huggingface.co/upstage/llama-30b-instruct), [30B/2048](https://huggingface.co/upstage/llama-30b-instruct-2048), [65B/1024](https://huggingface.co/upstage/llama-65b-instruct)
+* **Language(s)**: English
+* **Library**: [HuggingFace Transformers](https://github.com/huggingface/transformers)
+* **License**: This model is under a **Non-commercial** Bespoke License and governed by the Meta license. You should only use this repository if you have been granted access to the model by filling out [this form](https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform), but have either lost your copy of the weights or encountered issues converting them to the Transformers format
+* **Where to send comments**: Instructions on how to provide feedback or comments on a model can be found by opening an issue in the [Hugging Face community's model repository](https://huggingface.co/upstage/llama-30b-instruct-2048/discussions)
+* **Contact**: For questions and comments about the model, please email `contact@upstage.ai`
 
 ## Dataset Details
 
 ### Used Datasets
+
 - [openbookqa](https://huggingface.co/datasets/openbookqa)
 - [sciq](https://huggingface.co/datasets/sciq)
 - [Open-Orca/OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca)
````
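Not part of the diff: given the **Library** link to HuggingFace Transformers in this card, a minimal sketch of how the checkpoint would typically be loaded. The repo id is taken from the card's own links; access is gated per the license, and a 30B checkpoint needs multiple GPUs or half-precision sharding to fit:

```python
# Sketch only: load the gated checkpoint named in this card with the
# HuggingFace Transformers library. Nothing is downloaded until load()
# is actually called.
MODEL_ID = "upstage/llama-30b-instruct-2048"  # repo id from the card's links

def load(model_id: str = MODEL_ID):
    # Imported lazily so the sketch stays importable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",  # keep the checkpoint's own dtype
        device_map="auto",   # shard layers across available devices
    )
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load()
```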
After:

````diff
@@ -60,14 +49,10 @@ pipeline_tag: text-generation
 {Assistant}
 ```
 
 ## Hardware and Software
 
+* **Hardware**: We utilized an A100x8 for training our model
+* **Training Factors**: We fine-tuned this model using a combination of the [DeepSpeed library](https://github.com/microsoft/DeepSpeed) and the [HuggingFace trainer](https://huggingface.co/docs/transformers/main_classes/trainer)
 
 ## Evaluation Results
````
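The **Training Factors** bullet combines DeepSpeed with the HuggingFace trainer; the usual wiring is to hand the Trainer a DeepSpeed config via `TrainingArguments(deepspeed=...)`. A hypothetical minimal config sketch follows; all values are illustrative, since the card does not publish its actual training settings:

```python
# Hypothetical DeepSpeed config dict, as could be passed to
# transformers.TrainingArguments(deepspeed=ds_config). Values are
# illustrative only, not the card's actual configuration.
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,            # shard optimizer state and gradients
        "overlap_comm": True,  # overlap communication with the backward pass
    },
}
```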