InfectA-Chat

To prevent the adverse effects of infectious diseases, clear and accessible communication and regular tracking of outbreaks are crucial. InfectA-Chat is a generative model designed to address this need. Built on the AceGPT-7B-Chat pre-trained model, InfectA-Chat is fine-tuned to track infectious disease outbreaks. This makes it a useful tool for communication in both Arabic and English, potentially bridging language barriers and fostering a deeper understanding of infectious diseases.

Model Details

In the fight against infectious diseases in the Middle East, clear and effective communication is paramount. We are releasing InfectA-Chat, a generative text model fine-tuned from AceGPT-7B-Chat. Designed for Arabic and English, InfectA-Chat follows instructions related to infectious disease topics. On Q&A tasks involving infectious disease instructions, our model outperforms existing Arabic and state-of-the-art LLMs while remaining competitive with GPT-4. This could improve communication and disease tracking efforts in the region.

  • Developed by: Korea Institute of Science and Technology
  • Language(s) (NLP): Arabic, English
  • License: Creative Commons Attribution 2.0
  • Finetuned from model: AceGPT-7B-Chat
  • Repository: KISTI-AI/InfectA-Chat
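
The card does not include usage code, so the following is only a minimal sketch of how the model might be loaded with the Hugging Face `transformers` library. It assumes the repository above loads through `AutoModelForCausalLM` (plausible for a fine-tune of AceGPT-7B-Chat, but not confirmed here). The `build_prompt` helper is hypothetical: the card documents no chat template, so it passes the question through as plain text.

```python
MODEL_ID = "KISTI-AI/InfectA-Chat"  # repository name taken from the model card


def build_prompt(question: str) -> str:
    """Hypothetical helper: the card documents no prompt template,
    so the question is passed through with surrounding whitespace removed."""
    return question.strip()


def generate_answer(question: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a reply (assumes transformers and torch are installed)."""
    # Imported here so that defining the function does not require the libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # float32 matches the tensor type listed for the published weights.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float32)
    model.eval()

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the full weights on first use):
# print(generate_answer("What are the common symptoms of MERS?"))
```

If the base model expects a specific conversation format, `build_prompt` is the place to apply it.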

Training Details

Training Data

InfectA-Chat was instruction fine-tuned on 55,400 infectious disease-related instruction-following examples.

Training Procedure

Training Hyperparameters

  • Training regime: fp32
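
As a rough illustration of what fp32 implies for memory, the sketch below estimates the size of the weights alone. It assumes about 7 billion parameters, as the base model's name suggests, and ignores gradients, optimizer states, and activations, which add substantially more during training. The function name is illustrative.

```python
def weight_memory_gib(n_params: int, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB (2**30 bytes)."""
    return n_params * bytes_per_param / 2**30


N_PARAMS = 7_000_000_000  # assumption: roughly 7B parameters

fp32_gib = weight_memory_gib(N_PARAMS, 4)  # fp32 stores 4 bytes per parameter
fp16_gib = weight_memory_gib(N_PARAMS, 2)  # half precision, for comparison

print(f"fp32 weights: {fp32_gib:.1f} GiB")
print(f"fp16 weights: {fp16_gib:.1f} GiB")
```

Under these assumptions the fp32 weights come to about 26 GiB, twice the half-precision figure.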

Evaluation

Evaluation Results on Infectious Diseases-related Instruction-Following Dataset

The experiments cover the infectious disease-related instruction-following data and the Arabic MMLU benchmark dataset. The ‘STEM’, ‘Humanities’, ‘Social Sciences’, and ‘Others’ categories belong to Arabic MMLU.

[Figures: evaluation results (three images)]

Evaluation Results on Arabic MMLU Benchmark Dataset

[Figure: evaluation results on the Arabic MMLU benchmark dataset]
