Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard) to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)

README.md +168 -2

@@ -1,5 +1,4 @@
 ---
-license: apache-2.0
 language:
 - de
 - en
@@ -10,8 +9,156 @@ language:
 - ru
 - ar
 - es
 tags:
 - spectrum
 ---
 ![SauerkrautLM-Nemo-12b-Instruct]( https://vago-solutions.ai/wp-content/uploads/2024/07/Sauerkraut-Nemo.png "SauerkrautLM-Nemo-12b-Instruct")
@@ -99,4 +246,23 @@ If you are interested in customized LLMs for business applications, please get i
 We are also keenly seeking support and investment for our startup, VAGO solutions where we continuously advance the development of robust language models designed to address a diverse range of purposes and requirements. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out to us at [VAGO solutions](https://vago-solutions.ai)
 ## Acknowledgement
-Many thanks to [Mistral AI](https://huggingface.co/mistralai) for providing such a valuable model to the Open-Source community.

 ---
 language:
 - de
 - en
 - ru
 - ar
 - es
+license: apache-2.0
 tags:
 - spectrum
+model-index:
+- name: SauerkrautLM-Nemo-12b-Instruct
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: ENEM Challenge (No Images)
+      type: eduagarcia/enem_challenge
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 70.05
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BLUEX (No Images)
+      type: eduagarcia-temp/BLUEX_without_images
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 58.41
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: OAB Exams
+      type: eduagarcia/oab_exams
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 52.53
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 RTE
+      type: assin2
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 92.65
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 STS
+      type: eduagarcia/portuguese_benchmark
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: pearson
+      value: 75.99
+      name: pearson
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: FaQuAD NLI
+      type: ruanchaves/faquad-nli
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 83.18
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HateBR Binary
+      type: ruanchaves/hatebr
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 81.98
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: PT Hate Speech Binary
+      type: hate_speech_portuguese
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 75.67
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: tweetSentBR
+      type: eduagarcia/tweetsentbr_fewshot
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 72.31
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct
+      name: Open Portuguese LLM Leaderboard
 ---
 ![SauerkrautLM-Nemo-12b-Instruct]( https://vago-solutions.ai/wp-content/uploads/2024/07/Sauerkraut-Nemo.png "SauerkrautLM-Nemo-12b-Instruct")
 We are also keenly seeking support and investment for our startup, VAGO solutions where we continuously advance the development of robust language models designed to address a diverse range of purposes and requirements. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out to us at [VAGO solutions](https://vago-solutions.ai)
 ## Acknowledgement
+Many thanks to [Mistral AI](https://huggingface.co/mistralai) for providing such a valuable model to the Open-Source community.
+# Open Portuguese LLM Leaderboard Evaluation Results
+Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/VAGOsolutions/SauerkrautLM-Nemo-12b-Instruct) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+|          Metric          |  Value  |
+|--------------------------|---------|
+|Average                   |**73.64**|
+|ENEM Challenge (No Images)|    70.05|
+|BLUEX (No Images)         |    58.41|
+|OAB Exams                 |    52.53|
+|Assin2 RTE                |    92.65|
+|Assin2 STS                |    75.99|
+|FaQuAD NLI                |    83.18|
+|HateBR Binary             |    81.98|
+|PT Hate Speech Binary     |    75.67|
+|tweetSentBR               |    72.31|
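
The `Average` row above can be sanity-checked against the nine per-task scores. The sketch below assumes the leaderboard reports an unweighted arithmetic mean of the raw scores. The PR does not state the formula, so treat that as an assumption; the helper name `unweighted_mean` is hypothetical and used only for illustration.

```python
# Per-task scores copied from the results table in this PR.
scores = {
    "ENEM Challenge (No Images)": 70.05,
    "BLUEX (No Images)": 58.41,
    "OAB Exams": 52.53,
    "Assin2 RTE": 92.65,
    "Assin2 STS": 75.99,
    "FaQuAD NLI": 83.18,
    "HateBR Binary": 81.98,
    "PT Hate Speech Binary": 75.67,
    "tweetSentBR": 72.31,
}


def unweighted_mean(values):
    """Arithmetic mean with equal weight per task (assumed, not confirmed by the PR)."""
    values = list(values)
    return sum(values) / len(values)


# Round to two decimals, the precision used in the table.
average = round(unweighted_mean(scores.values()), 2)
print(average)  # 73.64
```

Under that assumption the computed value agrees with the reported `Average` of 73.64, and the same nine scores appear as the `value` fields in the `model-index` block.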