---
language:
- pt
- en
license: other
model-index:
- name: cabra13b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: ENEM Challenge (No Images)
      type: eduagarcia/enem_challenge
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 48.85
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/cabra13b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BLUEX (No Images)
      type: eduagarcia-temp/BLUEX_without_images
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 40.89
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/cabra13b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: OAB Exams
      type: eduagarcia/oab_exams
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 35.31
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/cabra13b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 RTE
      type: assin2
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 85.55
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/cabra13b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 STS
      type: eduagarcia/portuguese_benchmark
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: pearson
      value: 57.19
      name: pearson
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/cabra13b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: FaQuAD NLI
      type: ruanchaves/faquad-nli
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 45.45
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/cabra13b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HateBR Binary
      type: ruanchaves/hatebr
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 77.78
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/cabra13b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: PT Hate Speech Binary
      type: hate_speech_portuguese
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 64.13
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/cabra13b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: tweetSentBR
      type: eduagarcia-temp/tweetsentbr
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 53.37
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/cabra13b
      name: Open Portuguese LLM Leaderboard
---

Check out our other (much better) models: [Cabra](https://huggingface.co/collections/botbot-ai/models-6604c2069ceef04f834ba99b)

Cabra 13b is a QLoRA fine-tune of [LLaMA 2 13b Chat](https://huggingface.co/meta-llama/Llama-2-13b-chat) on the [PortugueseDolly](https://huggingface.co/datasets/nicolasdec/PortugueseDolly) dataset, a Portuguese translation of [Databricks Dolly 15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k).

*For demonstration and research only. Commercial use is prohibited.

The model needs more training and may generate false or inaccurate statements.


# [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/nicolasdec/cabra13b).

| Metric                     | Value |
|----------------------------|-------|
| Average                    | 56.5  |
| ENEM Challenge (No Images) | 48.85 |
| BLUEX (No Images)          | 40.89 |
| OAB Exams                  | 35.31 |
| Assin2 RTE                 | 85.55 |
| Assin2 STS                 | 57.19 |
| FaQuAD NLI                 | 45.45 |
| HateBR Binary              | 77.78 |
| PT Hate Speech Binary      | 64.13 |
| tweetSentBR                | 53.37 |
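
The Average row appears to be the unweighted mean of the nine task scores, although the card does not say so explicitly. The short check below recomputes it under that assumption. The names `scores` and `mean_score` are illustrative and are not part of any leaderboard tooling.

```python
# Per-task scores copied from the results table above.
scores = {
    "ENEM Challenge (No Images)": 48.85,
    "BLUEX (No Images)": 40.89,
    "OAB Exams": 35.31,
    "Assin2 RTE": 85.55,
    "Assin2 STS": 57.19,
    "FaQuAD NLI": 45.45,
    "HateBR Binary": 77.78,
    "PT Hate Speech Binary": 64.13,
    "tweetSentBR": 53.37,
}


def mean_score(values):
    """Return the unweighted arithmetic mean, rounded to one decimal place.

    Assumption: every task carries equal weight in the leaderboard average.
    """
    values = list(values)
    return round(sum(values) / len(values), 1)


average = mean_score(scores.values())
print(f"Average over {len(scores)} tasks: {average}")
```

Under the equal-weight assumption the result agrees with the 56.5 reported in the table.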