leaderboard-pt-pr-bot committed on
Commit 72a3887 · verified · 1 Parent(s): 8f9e9c8

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions
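
As a rough illustration of what the bot adds, the sketch below builds one `model-index` result entry with the same fields that appear in the README.md diff (`task`, `dataset`, `metrics`, `source`). The helper name `build_result_entry` and the constant `LEADERBOARD_URL` are hypothetical and are not taken from the bot's code; only the field names and values come from this PR.

```python
import json

# URL copied from the `source.url` fields in this PR's diff.
LEADERBOARD_URL = (
    "https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard"
    "?query=rhaymison/Mistral-portuguese-luana-7b-Mathematics"
)


def build_result_entry(dataset_name, dataset_type, split, num_few_shot,
                       metric_type, metric_name, value):
    """Hypothetical helper: one `results` item shaped like the entries in this PR."""
    return {
        "task": {"type": "text-generation", "name": "Text Generation"},
        "dataset": {
            "name": dataset_name,
            "type": dataset_type,
            "split": split,
            "args": {"num_few_shot": num_few_shot},
        },
        "metrics": [{"type": metric_type, "value": value, "name": metric_name}],
        "source": {"url": LEADERBOARD_URL, "name": "Open Portuguese LLM Leaderboard"},
    }


# First entry of the diff: ENEM Challenge, 3-shot accuracy.
entry = build_result_entry(
    "ENEM Challenge (No Images)", "eduagarcia/enem_challenge",
    "train", 3, "acc", "accuracy", 56.68,
)
model_index = [{"name": "Mistral-portuguese-luana-7b-Mathematics", "results": [entry]}]
print(json.dumps(model_index, indent=2))
```

The output is printed as JSON only to keep the sketch free of third-party dependencies; the model card itself stores this structure as YAML front matter.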

Files changed (1)
  1. README.md +172 -7
README.md CHANGED
@@ -1,17 +1,164 @@
  ---
- library_name: transformers
- license: apache-2.0
- datasets:
- - rhaymison/orca-math-portuguese-64k
  language:
  - pt
- pipeline_tag: text-generation
- base_model: rhaymison/Mistral-portuguese-luana-7b
  tags:
  - portuguese
  - math
  - mathematics
  - matematica
  ---
 
  # Mistral-portuguese-luana-7b-Mathematics
@@ -143,4 +290,22 @@ email: [email protected]
  </a>
  <a href="https://github.com/rhaymisonbetini" target="_blank">
  <img src="https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white">
- </a>
  ---
  language:
  - pt
+ license: apache-2.0
+ library_name: transformers
  tags:
  - portuguese
  - math
  - mathematics
  - matematica
+ base_model: rhaymison/Mistral-portuguese-luana-7b
+ datasets:
+ - rhaymison/orca-math-portuguese-64k
+ pipeline_tag: text-generation
+ model-index:
+ - name: Mistral-portuguese-luana-7b-Mathematics
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: ENEM Challenge (No Images)
+       type: eduagarcia/enem_challenge
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 56.68
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Mistral-portuguese-luana-7b-Mathematics
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BLUEX (No Images)
+       type: eduagarcia-temp/BLUEX_without_images
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 45.9
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Mistral-portuguese-luana-7b-Mathematics
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: OAB Exams
+       type: eduagarcia/oab_exams
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 37.9
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Mistral-portuguese-luana-7b-Mathematics
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 RTE
+       type: assin2
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 89.36
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Mistral-portuguese-luana-7b-Mathematics
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 STS
+       type: eduagarcia/portuguese_benchmark
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: pearson
+       value: 74.78
+       name: pearson
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Mistral-portuguese-luana-7b-Mathematics
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: FaQuAD NLI
+       type: ruanchaves/faquad-nli
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 74.87
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Mistral-portuguese-luana-7b-Mathematics
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HateBR Binary
+       type: ruanchaves/hatebr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 76.39
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Mistral-portuguese-luana-7b-Mathematics
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: PT Hate Speech Binary
+       type: hate_speech_portuguese
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 67.46
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Mistral-portuguese-luana-7b-Mathematics
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: tweetSentBR
+       type: eduagarcia/tweetsentbr_fewshot
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 49.03
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Mistral-portuguese-luana-7b-Mathematics
+       name: Open Portuguese LLM Leaderboard
  ---
 
  # Mistral-portuguese-luana-7b-Mathematics
 
  </a>
  <a href="https://github.com/rhaymisonbetini" target="_blank">
  <img src="https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white">
+ </a>
+
+ # Open Portuguese LLM Leaderboard Evaluation Results
+
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/rhaymison/Mistral-portuguese-luana-7b-Mathematics) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+
+ |          Metric          | Value  |
+ |--------------------------|--------|
+ |Average                   |**63.6**|
+ |ENEM Challenge (No Images)|   56.68|
+ |BLUEX (No Images)         |   45.90|
+ |OAB Exams                 |   37.90|
+ |Assin2 RTE                |   89.36|
+ |Assin2 STS                |   74.78|
+ |FaQuAD NLI                |   74.87|
+ |HateBR Binary             |   76.39|
+ |PT Hate Speech Binary     |   67.46|
+ |tweetSentBR               |   49.03|
+
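
The Average row in the table above can be sanity-checked with a few lines of Python. This is a minimal sketch that assumes the average is the unweighted mean of the nine task scores; the PR does not state how the leaderboard computes it, and the leaderboard may average unrounded values. The variable names are illustrative.

```python
# Scores copied from the results table added by this PR.
scores = {
    "ENEM Challenge (No Images)": 56.68,
    "BLUEX (No Images)": 45.90,
    "OAB Exams": 37.90,
    "Assin2 RTE": 89.36,
    "Assin2 STS": 74.78,
    "FaQuAD NLI": 74.87,
    "HateBR Binary": 76.39,
    "PT Hate Speech Binary": 67.46,
    "tweetSentBR": 49.03,
}

# Assumption: plain arithmetic mean over all nine tasks.
average = sum(scores.values()) / len(scores)
print(round(average, 1))  # 63.6, matching the Average row
```

Under that assumption the nine scores sum to 572.37, and 572.37 / 9 ≈ 63.597, which rounds to the reported 63.6.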