Description
This model was developed by Kundyz Maksutova, PhD Candidate, as part of research on question-answering systems in the Kazakh language. It is a fine-tuned version of FacebookAI/xlm-roberta-large on the Kundyzka/informatics_kaz dataset, specifically optimized for handling questions in the domain of computer science.
Key Features:
- Base Model: FacebookAI/xlm-roberta-large
- Dataset: Kundyzka/informatics_kaz
- Language: Kazakh (kk)
- Task: Question Answering
- Performance:
  - Before Training:
    - F1 Score: 26.950
    - Exact Match: 13.116
  - After Training:
    - F1 Score: 70.127
    - Exact Match: 49.740
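The card does not say how F1 and Exact Match were computed. The sketch below assumes the common SQuAD-style convention for extractive question answering: both metrics are computed per example on normalized text and then averaged, with the average multiplied by 100 to give percentages like those reported above. The function names and the normalization rules (lowercasing, stripping ASCII punctuation) are illustrative assumptions, not the author's evaluation script.

```python
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace.

    Assumption: a simplified SQuAD-style normalization. English article
    removal is omitted because it does not apply to Kazakh.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this convention Exact Match is the stricter metric, which is consistent with the reported Exact Match scores being lower than the F1 scores both before and after training.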
Dataset:
The Kundyzka/informatics_kaz dataset provides a diverse set of questions and answers in Kazakh, specifically covering topics in computer science. Fine-tuning on this dataset is intended to help the model handle domain-specific queries and terminology.
Intended Use:
This model is intended for answering questions in the Kazakh language, with potential applications in:
- Educational Platforms: Assisting students with computer science-related questions.
- Research Projects: Supporting the study and development of Kazakh natural language processing tools.
- AI Applications: Enhancing chatbots or intelligent systems requiring domain-specific question-answering capabilities.
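For any of these applications, the model can presumably be queried through the Hugging Face `transformers` question-answering pipeline. The sketch below assumes the repository Kundyzka/XLM-Roberta-large-informatics-kaz hosts an extractive question-answering checkpoint that this pipeline can load; the helper names (`build_qa_pipeline`, `answer_question`, `demo`) and the Kazakh example text are hypothetical and not taken from the model card.

```python
MODEL_ID = "Kundyzka/XLM-Roberta-large-informatics-kaz"


def build_qa_pipeline(model_id: str = MODEL_ID):
    """Load a question-answering pipeline for the given checkpoint.

    transformers is imported lazily so that this module can be imported
    without the library installed. Loading downloads the model weights.
    """
    from transformers import pipeline

    return pipeline("question-answering", model=model_id, tokenizer=model_id)


def answer_question(qa, question: str, context: str) -> dict:
    """Run one extractive QA query.

    The pipeline returns a dict with the keys
    'answer', 'score', 'start', and 'end'.
    """
    return qa(question=question, context=context)


def demo() -> None:
    """Illustrative query; the context and question are made-up examples."""
    qa = build_qa_pipeline()
    context = (
        "Орталық процессор компьютердің негізгі есептеу құрылғысы "
        "болып табылады."
    )
    question = "Компьютердің негізгі есептеу құрылғысы қалай аталады?"
    result = answer_question(qa, question, context)
    print(result["answer"], result["score"])
```

Calling `demo()` requires `transformers` with a backend such as PyTorch and network access to download the weights. Because the model is extractive under this assumption, the answer is always a span copied from the supplied context, so the context must contain the answer.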
Limitations and Ethical Considerations:
- Domain-Specific Bias: The model performs best on computer science queries and may not generalize well to other domains.
- Dataset Bias: The dataset may introduce biases that affect model predictions.
- Language Support: The model is optimized for Kazakh and is not intended for questions in other languages.
Tags:
computerscience
question-answering
Kazakh
This model represents a significant contribution to improving natural language processing tools for low-resource languages like Kazakh. For further details or customization, refer to the model repository.
Model repository: Kundyzka/XLM-Roberta-large-informatics-kaz (base model: FacebookAI/xlm-roberta-large)