yuriachermann
committed on
Commit
•
ba78638
1
Parent(s):
44d7432
Update README.md
README.md
CHANGED
@@ -5,34 +5,43 @@ tags:
|
|
5 |
- trl
|
6 |
- sft
|
7 |
- generated_from_trainer
|
8 |
base_model: meta-llama/Llama-2-7b-hf
|
9 |
model-index:
|
10 |
- name: My_AGI_llama_2_7B
|
11 |
results: []
|
12 |
---
|
13 |
|
14 |
-
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
15 |
-
should probably proofread and complete it, then remove this comment. -->
|
16 |
|
17 |
# My_AGI_llama_2_7B
|
18 |
|
19 |
-
This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unknown dataset.
|
20 |
|
21 |
-
|
22 |
|
23 |
-
|
24 |
|
25 |
-
|
26 |
|
27 |
-
|
28 |
|
29 |
-
|
30 |
|
31 |
-
|
32 |
|
33 |
## Training procedure
|
34 |
|
35 |
-
### Training
|
36 |
|
37 |
The following hyperparameters were used during training:
|
38 |
- learning_rate: 1e-05
|
@@ -48,8 +57,51 @@ The following hyperparameters were used during training:
|
|
48 |
|
49 |
### Framework versions
|
50 |
|
51 |
-
- PEFT
|
52 |
-
- Transformers
|
53 |
-
- Pytorch
|
54 |
-
- Datasets
|
55 |
-
- Tokenizers
|
5 |
- trl
|
6 |
- sft
|
7 |
- generated_from_trainer
|
8 |
+
- Dolly
|
9 |
+
- ipex
|
10 |
+
- Max Series GPU
|
11 |
base_model: meta-llama/Llama-2-7b-hf
|
12 |
+
datasets:
|
13 |
+
- databricks/databricks-dolly-15k
|
14 |
model-index:
|
15 |
- name: My_AGI_llama_2_7B
|
16 |
results: []
|
17 |
+
language:
|
18 |
+
- en
|
19 |
+
metrics:
|
20 |
+
- accuracy
|
21 |
+
- bertscore
|
22 |
+
- bleu
|
23 |
+
pipeline_tag: question-answering
|
24 |
---
|
25 |
|
26 |
|
27 |
# My_AGI_llama_2_7B
|
28 |
|
29 |
|
30 |
+
**Model Type:** Fine-Tuned
|
31 |
|
32 |
+
**Model Base:** [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
|
33 |
|
34 |
+
**Datasets Used:** [databricks/databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k)
|
35 |
|
36 |
+
**Author:** [Yuri Achermann](https://huggingface.co/yuriachermann)
|
37 |
|
38 |
+
**Date:** June 03, 2024
|
39 |
|
40 |
+
-------------------------
|
41 |
|
42 |
## Training procedure
|
43 |
|
44 |
+
### Training Hyperparameters
|
45 |
|
46 |
The following hyperparameters were used during training:
|
47 |
- learning_rate: 1e-05
|
|
|
57 |
|
58 |
### Framework versions
|
59 |
|
60 |
+
- PEFT==0.11.1
|
61 |
+
- Transformers==4.41.2
|
62 |
+
- Pytorch==2.1.0.post0+cxx11.abi
|
63 |
+
- Datasets==2.19.2
|
64 |
+
- Tokenizers==0.19.1
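
The pinned versions above can be compared against a local environment before trying to reproduce the run. The following is a minimal sketch, not part of the model card: the PyPI distribution names (in particular `torch` for "Pytorch") and the helper names `installed_version` and `find_mismatches` are assumptions made for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the "Framework versions" list. The distribution names are
# an assumption: the card writes "Pytorch", which PyPI publishes as "torch".
PINS = {
    "peft": "0.11.1",
    "transformers": "4.41.2",
    "torch": "2.1.0.post0+cxx11.abi",
    "datasets": "2.19.2",
    "tokenizers": "0.19.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (pinned, found)."""
    mismatches = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


if __name__ == "__main__":
    # Print one line per package that is missing or at a different version.
    for name, (pinned, found) in find_mismatches(PINS).items():
        print(f"{name}: pinned {pinned}, found {found}")
```

The `lookup` parameter exists only so the comparison can be exercised without installing anything; an exact string match is deliberately strict and will flag local build suffixes that differ from the one listed.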
|
65 |
+
|
66 |
+
-------------------------
|
67 |
+
|
68 |
+
## Intended uses & limitations
|
69 |
+
|
70 |
+
**Primary Use Case:** The model is intended for generating human-like responses in conversational applications, such as chatbots or virtual assistants.
|
71 |
+
|
72 |
+
**Limitations:** The model may generate inaccurate or biased content as it reflects the data it was trained on. It is essential to evaluate the generated responses in context and use the model responsibly.
|
73 |
+
|
74 |
+
-------------------------
|
75 |
+
|
76 |
+
## Evaluation
|
77 |
+
|
78 |
+
The evaluation platform consists of Gaudi Accelerators and Xeon CPUs running benchmarks from the [Eleuther AI Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness).
|
79 |
+
|
80 |
+
| Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande |
|
81 |
+
|:-------:|:-----:|:---------:|:-----:|:----------:|:----------:|
|
82 |
+
| 54.904 | 45.65 | 76.8 | 42.02 | 40.2 | 69.85 |
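
The "Average" column in the table above appears to be the unweighted mean of the five benchmark scores. A small sketch that checks this, assuming equal weighting (the card does not say how the average was computed; `mean_score` is a name chosen for illustration):

```python
# Benchmark scores copied from the evaluation table.
SCORES = {
    "ARC": 45.65,
    "HellaSwag": 76.8,
    "MMLU": 42.02,
    "TruthfulQA": 40.2,
    "Winogrande": 69.85,
}


def mean_score(scores):
    """Unweighted arithmetic mean of the benchmark scores."""
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    # Rounded to three decimals for comparison with the reported "Average".
    print(round(mean_score(SCORES), 3))
```

Under that assumption the five scores sum to 274.52, and dividing by 5 gives the reported 54.904.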
|
83 |
+
|
84 |
+
-------------------------
|
85 |
+
|
86 |
+
## Ethical Considerations
|
87 |
+
|
88 |
+
The model may inherit biases present in the training data. It is crucial to use the model in a way that promotes fairness and mitigates potential biases.
|
89 |
+
|
90 |
+
-------------------------
|
91 |
+
|
92 |
+
## Acknowledgments
|
93 |
+
|
94 |
+
This fine-tuning effort was made possible by the support of Intel, which provided the computing resources, and of [Eduardo Alvarez](https://huggingface.co/eduardo-alvarez).
|
95 |
+
Additional shout-out to the creators of the Llama-2-7b-hf model and the contributors to the databricks-dolly-15k dataset.
|
96 |
+
|
97 |
+
-------------------------
|
98 |
+
|
99 |
+
## Contact Information
|
100 |
+
|
101 |
+
For questions or feedback about this model, please contact **[Yuri Achermann](mailto:[email protected])**.
|
102 |
+
|
103 |
+
-------------------------
|
104 |
+
|
105 |
+
## License
|
106 |
+
|
107 |
+
This model is distributed under the **Apache 2.0 License**.
|