---
language:
- pt
license: llama2
library_name: transformers
tags:
- llama
- peft
- portuguese
- instruct
pipeline_tag: text-generation
model-index:
- name: boana-7b-instruct
  results:
  - task:
      type: text-generation
    dataset:
      name: XWinograd (pt)
      type: Muennighoff/xwinograd
      config: pt
      split: test
    metrics:
    - type: Accuracy
      value: 50.57
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: ENEM Challenge (No Images)
      type: eduagarcia/enem_challenge
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 21.62
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=lrds-code/boana-7b-instruct
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BLUEX (No Images)
      type: eduagarcia-temp/BLUEX_without_images
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 29.21
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=lrds-code/boana-7b-instruct
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: OAB Exams
      type: eduagarcia/oab_exams
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 27.15
      name: accuracy
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=lrds-code/boana-7b-instruct
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 RTE
      type: assin2
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 48.84
      name: f1-macro
    - type: pearson
      value: 37.56
      name: pearson
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=lrds-code/boana-7b-instruct
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: FaQuAD NLI
      type: ruanchaves/faquad-nli
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 43.97
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=lrds-code/boana-7b-instruct
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HateBR Binary
      type: eduagarcia/portuguese_benchmark
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 85.0
      name: f1-macro
    - type: f1_macro
      value: 67.43
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=lrds-code/boana-7b-instruct
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: tweetSentBR
      type: eduagarcia-temp/tweetsentbr
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 40.38
      name: f1-macro
    source:
      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=lrds-code/boana-7b-instruct
      name: Open Portuguese LLM Leaderboard
---
Boana-7B-Instruct is an LLM trained on Portuguese-language data. The model is based on [LLaMA2-7B](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), a 7B-parameter version of LLaMA-2. The Boana project aims to offer LLM options in Portuguese while keeping the model small enough that users with limited computing power can also make use of LLMs. In support of Portuguese-speaking countries.

### Model Description

- **Developed by:** [Leonardo Souza](https://huggingface.co/lrds-code)
- **Model type:** LLaMA-Based
- **License:** Academic Free License v3.0
- **Fine-tuned from model:** [LLaMA2-7B](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)

## How to Use

```python
import torch
from transformers import pipeline

# Load the text-generation pipeline in bfloat16 and let device_map place the weights
boana = pipeline('text-generation', model='lrds-code/boana-7b-instruct', torch_dtype=torch.bfloat16, device_map='auto')

# Chat-style input: an empty system prompt followed by the user's question
messages = [{'role': 'system', 'content': ''},
            {'role': 'user', 'content': 'Quantos planetas existem no sistema solar?'}]

# Render the messages into a single prompt string with the model's chat template
prompt = boana.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Greedy decoding: with do_sample=False the sampling options (temperature, top_k, top_p) have no effect, so they are omitted
outputs = boana(prompt, max_new_tokens=256, do_sample=False)
print(outputs[0]['generated_text'])
#