Description

This model was developed by Kundyz Maksutova, PhD Candidate, as part of research on question-answering systems for the Kazakh language. It is a fine-tuned version of FacebookAI/xlm-roberta-large, trained on the Kundyzka/informatics_kaz dataset to answer questions in the computer science domain.

Key Features:

  • Base Model: FacebookAI/xlm-roberta-large
  • Dataset: Kundyzka/informatics_kaz
  • Language: Kazakh (kk)
  • Task: Question Answering
  • Performance:
    • Before Training:
      • F1 Score: 26.950
      • Exact Match: 13.116
    • After Training:
      • F1 Score: 70.127
      • Exact Match: 49.740
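
The card does not state how these scores were computed. The sketch below assumes the common SQuAD-style definitions: Exact Match checks whether the normalized prediction equals the normalized reference, and F1 is the harmonic mean of token-level precision and recall. The function names and the normalization (lowercasing and stripping ASCII punctuation, with no article removal) are assumptions made for illustration, not the author's evaluation script.

```python
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, corpus-level scores on a 0 to 100 scale like the ones listed above would be the mean of the per-example values multiplied by 100.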

Dataset:

The Kundyzka/informatics_kaz dataset provides a diverse set of Kazakh question-answer pairs covering computer science topics. It is intended to help the model handle domain-specific queries and terminology.

Intended Use:

This model is intended for answering questions in the Kazakh language, with potential applications in:

  • Educational Platforms: Assisting students with computer science-related questions.
  • Research Projects: Supporting the study and development of Kazakh natural language processing tools.
  • AI Applications: Enhancing chatbots or intelligent systems requiring domain-specific question-answering capabilities.
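
For applications like those listed above, a minimal usage sketch with the Hugging Face transformers question-answering pipeline might look like the following. It assumes the repository Kundyzka/XLM-Roberta-large-informatics-kaz loads as a standard extractive QA checkpoint; the helper names and the Kazakh question and context are invented for illustration and are not taken from the dataset.

```python
MODEL_ID = "Kundyzka/XLM-Roberta-large-informatics-kaz"


def load_qa_pipeline(model_id: str = MODEL_ID):
    """Build a question-answering pipeline (downloads weights on first use)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    return pipeline("question-answering", model=model_id, tokenizer=model_id)


def answer(qa, question: str, context: str) -> dict:
    """Run extractive QA; the pipeline returns answer, score, start, and end."""
    return qa(question=question, context=context)


def main() -> None:
    qa = load_qa_pipeline()
    # Illustrative example: "What is an algorithm?"
    result = answer(
        qa,
        question="Алгоритм дегеніміз не?",
        context="Алгоритм дегеніміз есепті шешуге арналған "
        "әрекеттердің реттелген тізбегі.",
    )
    print(result["answer"], result["score"])
```

Calling `main()` runs the example end to end. The model is extractive, so the returned answer is always a span copied from the supplied context.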

Limitations and Ethical Considerations:

  • Domain-Specific Bias: The model performs best on computer science queries and may not generalize well to other domains.
  • Dataset Bias: The dataset may introduce biases that affect model predictions.
  • Language Support: The model is fine-tuned and evaluated on Kazakh only; its behavior on other languages has not been assessed.
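
Given these limitations, one possible mitigation is to abstain when the model is not confident, for example on out-of-domain questions. The sketch below assumes the result dictionary format returned by the transformers question-answering pipeline (keys `answer` and `score`). The function name `answer_or_abstain` and the default threshold of 0.3 are hypothetical; a suitable threshold would need to be tuned on held-out data.

```python
from typing import Optional


def answer_or_abstain(result: dict, min_score: float = 0.3) -> Optional[str]:
    """Return the predicted answer, or None if it looks unreliable.

    `result` is expected to carry the keys "answer" and "score".
    `min_score` is an arbitrary illustrative cutoff, not a tuned value.
    """
    answer = result.get("answer", "").strip()
    if not answer or result.get("score", 0.0) < min_score:
        return None
    return answer
```

A caller can then fall back to a message such as "no answer found" whenever the function returns `None`, instead of showing a low-confidence span.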

Tags:

  • computerscience
  • question-answering
  • Kazakh

This model contributes to natural language processing tooling for low-resource languages such as Kazakh. For further details or customization, refer to the model repository.

Model size: 559M params (Safetensors, tensor type F32)