Adding the Open Portuguese LLM Leaderboard Evaluation Results (#1)

f3e03b9 verified 9 months ago

5.93 kB

	---
	license: mit
	model-index:
	- name: phibode_1_5_ultraalpaca
	results:
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: ENEM Challenge (No Images)
	type: eduagarcia/enem_challenge
	split: train
	args:
	num_few_shot: 3
	metrics:
	- type: acc
	value: 23.58
	name: accuracy
	source:
	url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/phibode_1_5_ultraalpaca
	name: Open Portuguese LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: BLUEX (No Images)
	type: eduagarcia-temp/BLUEX_without_images
	split: train
	args:
	num_few_shot: 3
	metrics:
	- type: acc
	value: 20.72
	name: accuracy
	source:
	url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/phibode_1_5_ultraalpaca
	name: Open Portuguese LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: OAB Exams
	type: eduagarcia/oab_exams
	split: train
	args:
	num_few_shot: 3
	metrics:
	- type: acc
	value: 24.87
	name: accuracy
	source:
	url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/phibode_1_5_ultraalpaca
	name: Open Portuguese LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: Assin2 RTE
	type: assin2
	split: test
	args:
	num_few_shot: 15
	metrics:
	- type: f1_macro
	value: 69.07
	name: f1-macro
	source:
	url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/phibode_1_5_ultraalpaca
	name: Open Portuguese LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: Assin2 STS
	type: eduagarcia/portuguese_benchmark
	split: test
	args:
	num_few_shot: 15
	metrics:
	- type: pearson
	value: 4.94
	name: pearson
	source:
	url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/phibode_1_5_ultraalpaca
	name: Open Portuguese LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: FaQuAD NLI
	type: ruanchaves/faquad-nli
	split: test
	args:
	num_few_shot: 15
	metrics:
	- type: f1_macro
	value: 43.97
	name: f1-macro
	source:
	url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/phibode_1_5_ultraalpaca
	name: Open Portuguese LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: HateBR Binary
	type: ruanchaves/hatebr
	split: test
	args:
	num_few_shot: 25
	metrics:
	- type: f1_macro
	value: 34.94
	name: f1-macro
	source:
	url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/phibode_1_5_ultraalpaca
	name: Open Portuguese LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: PT Hate Speech Binary
	type: hate_speech_portuguese
	split: test
	args:
	num_few_shot: 25
	metrics:
	- type: f1_macro
	value: 41.23
	name: f1-macro
	source:
	url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/phibode_1_5_ultraalpaca
	name: Open Portuguese LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: tweetSentBR
	type: eduagarcia/tweetsentbr_fewshot
	split: test
	args:
	num_few_shot: 25
	metrics:
	- type: f1_macro
	value: 24.19
	name: f1-macro
	source:
	url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/phibode_1_5_ultraalpaca
	name: Open Portuguese LLM Leaderboard
	---
	# Phi-Bode


	<!--- PROJECT LOGO -->
	<p align="center">
	<img src="https://huggingface.co/recogna-nlp/Phi-Bode/resolve/main/phi-bode.jpg" alt="Phi-Bode Logo" width="400" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
	</p>

	Phi-Bode é um modelo de linguagem ajustado para o idioma português, desenvolvido a partir do modelo base Phi-1.5B fornecido pela [Microsoft](https://huggingface.co/microsoft/phi-1_5). Este modelo foi refinado através do processo de fine-tuning utilizando o dataset UltraAlpaca. O principal objetivo deste modelo é ser viável para pessoas
	que não possuem recursos computacionais disponíveis para o uso de LLMs (Large Language Models). Ressalta-se que este é um trabalho em andamento e o modelo ainda apresenta problemas na geração de texto em português.

	## Características Principais

	- Modelo Base: Phi-1.5B, criado pela Microsoft, com 1.3 bilhões de parâmetros.
	- Dataset para Fine-tuning: UltraAlpaca
	- Treinamento: O treinamento foi realizado a partir do fine-tuning completo do phi-1.5.


	# Open Portuguese LLM Leaderboard Evaluation Results

	Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/recogna-nlp/phibode_1_5_ultraalpaca) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)

	\| Metric \| Value \|
	\|--------------------------\|---------\|
	\|Average \|31.95\|
	\|ENEM Challenge (No Images)\| 23.58\|
	\|BLUEX (No Images) \| 20.72\|
	\|OAB Exams \| 24.87\|
	\|Assin2 RTE \| 69.07\|
	\|Assin2 STS \| 4.94\|
	\|FaQuAD NLI \| 43.97\|
	\|HateBR Binary \| 34.94\|
	\|PT Hate Speech Binary \| 41.23\|
	\|tweetSentBR \| 24.19\|