srbNLI: Serbian Natural Language Inference Model
Model Overview
srbNLI is a fine-tuned Natural Language Inference (NLI) model for Serbian, created by adapting the SciFact dataset to Serbian. The model is based on the DeBERTa-v3-large transformer architecture and is trained to recognize relationships between claims and evidence in Serbian text, with applications in scientific claim verification and potential expansion to broader claim-verification tasks.
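Below is a minimal inference sketch using the Hugging Face transformers library. The Serbian premise/hypothesis pair is illustrative only, and the label names are read from the checkpoint's own id2label mapping rather than hard-coded, since the exact label order depends on the published config.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "MilosKosRad/ScientificNLIsrb"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Illustrative Serbian evidence (premise) and claim (hypothesis).
premise = "Vakcine protiv gripa smanjuju rizik od hospitalizacije."
hypothesis = "Vakcinacija protiv gripa štiti od teških oblika bolesti."

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = logits.softmax(dim=-1).squeeze()

# Label names come from the checkpoint config
# (e.g. support / contradiction / neutral).
for idx, p in enumerate(probs.tolist()):
    print(f"{model.config.id2label[idx]}: {p:.3f}")
```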
Key Details
- Model Type: Transformer-based
- Language: Serbian
- Task: Natural Language Inference (NLI), Textual Entailment, Claim Verification
- Dataset: srbSciFact (automatically translated SciFact dataset)
- Fine-tuning: Trained on Serbian NLI data (support, contradiction, and neutral classes)
- Metrics: Accuracy, Precision, Recall, F1-score
Motivation
This model addresses the lack of NLI datasets and models for Serbian, a low-resource language. It provides a tool for textual entailment and claim verification, especially for scientific claims, with broader potential for misinformation detection and automated fact-checking.
Training
- Base Model: DeBERTa-v3-large (microsoft/deberta-v3-large)
- Training Data: Automatically translated SciFact dataset
- Fine-tuning: Conducted on a single NVIDIA A100 GPU (40 GB) in a DGX system
- Hyperparameters: Tuned learning rate, batch size, weight decay, number of epochs, and early stopping (illustrated in the sketch below)
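The sketch below reproduces this setup with the transformers Trainer API. The hyperparameter values are illustrative placeholders, since the card names the tuned dimensions but not the final values, and train_ds/dev_ds stand in for pre-tokenized srbSciFact splits that are assumed to be prepared separately.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import (AutoModelForSequenceClassification, EarlyStoppingCallback,
                          Trainer, TrainingArguments)

def compute_metrics(eval_pred):
    """Accuracy, macro precision/recall/F1 for the three NLI classes."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    p, r, f1, _ = precision_recall_fscore_support(labels, preds, average="macro")
    return {"accuracy": accuracy_score(labels, preds),
            "precision": p, "recall": r, "f1": f1}

model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-large", num_labels=3)

# Placeholder values; the tuned learning rate, batch size, weight decay,
# and epoch count are not published in the card.
args = TrainingArguments(
    output_dir="srbnli-deberta-v3-large",
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    num_train_epochs=10,
    weight_decay=0.01,
    eval_strategy="epoch",  # spelled "evaluation_strategy" in older versions
    save_strategy="epoch",
    load_best_model_at_end=True,  # required for early stopping
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,  # assumed: tokenized srbSciFact train split
    eval_dataset=dev_ds,     # assumed: tokenized srbSciFact dev split
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```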
Evaluation
The model was evaluated using standard NLI metrics (accuracy, precision, recall, and F1-score) and was also compared against GPT-4o to assess generalization.
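As an illustration, the sketch below computes those metrics from predictions on a held-out split with scikit-learn; the toy y_true/y_pred arrays and the label order in target_names are assumptions, not the published evaluation setup.

```python
from sklearn.metrics import accuracy_score, classification_report

# Toy integer labels standing in for a held-out srbSciFact split.
y_true = [0, 1, 2, 1, 0]
y_pred = [0, 1, 1, 1, 0]

print(f"Accuracy: {accuracy_score(y_true, y_pred):.2f}")
print(classification_report(
    y_true, y_pred,
    target_names=["support", "contradiction", "neutral"],  # assumed label order
    zero_division=0,
))
```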
Use Cases
- Claim Verification: Scientific claims and general-domain claims in Serbian (see the sketch after this list)
- Misinformation Detection: Identifying contradictions or support between claims and evidence
- Cross-lingual Applications: Potential for cross-lingual claim verification with multilingual models
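A hypothetical claim-verification helper along these lines is sketched below: it scores a Serbian claim against several evidence sentences and reports a per-sentence verdict with its confidence. The verify_claim name, the example sentences, and the per-sentence aggregation are assumptions for illustration, not part of the released model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "MilosKosRad/ScientificNLIsrb"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

def verify_claim(claim, evidence_sentences):
    """Hypothetical helper: label each evidence sentence w.r.t. the claim."""
    verdicts = []
    for evidence in evidence_sentences:
        inputs = tokenizer(evidence, claim, return_tensors="pt", truncation=True)
        with torch.no_grad():
            probs = model(**inputs).logits.softmax(dim=-1).squeeze()
        idx = int(probs.argmax())
        verdicts.append((evidence, model.config.id2label[idx], float(probs[idx])))
    return verdicts

for evidence, label, confidence in verify_claim(
        "Antibiotici leče virusne infekcije.",
        ["Antibiotici deluju samo na bakterije.",
         "Virusne infekcije se ne leče antibioticima."]):
    print(f"{label} ({confidence:.2f}): {evidence}")
```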
Future Work
- Improving accuracy with human-corrected translations and Serbian-specific datasets
- Expanding to general-domain claim verification
- Enhancing multilingual NLI capabilities
Results Comparison
The table below compares the fine-tuned models (DeBERTa-v3-large, RoBERTa-large, BERTić, and several multilingual baselines) and zero-shot GPT-4o on the srbSciFact dataset, focusing on key metrics: Accuracy (Acc), Precision (P), Recall (R), and F1-score (F1). The models were evaluated on their ability to classify relationships between claims and evidence in Serbian text.
| Model | Accuracy (Acc) | Precision (P) | Recall (R) | F1-score (F1) |
| --- | --- | --- | --- | --- |
| DeBERTa-v3-large | 0.70 | 0.86 | 0.82 | 0.84 |
| RoBERTa-large | 0.57 | 0.63 | 0.76 | 0.69 |
| BERTić (Serbian) | 0.56 | 0.56 | 0.37 | 0.44 |
| GPT-4o (English) | 0.66 | 0.70 | 0.77 | 0.78 |
| mDeBERTa-base | 0.63 | 0.92 | 0.75 | 0.83 |
| XLM-RoBERTa-large | 0.64 | 0.89 | 0.77 | 0.83 |
| mBERT-cased | 0.48 | 0.76 | 0.50 | 0.60 |
| mBERT-uncased | 0.57 | 0.45 | 0.61 | 0.52 |
Observations
- DeBERTa-v3-large performed best overall, with an accuracy of 0.70 and an F1-score of 0.84.
- RoBERTa-large and BERTić showed lower overall performance, with BERTić in particular struggling with recall (0.37), suggesting difficulty with complex linguistic inference in Serbian.
- GPT-4o performs strongly with English prompts (F1 = 0.78), outperforming several fine-tuned models, but DeBERTa-v3-large still leads; its advantage over GPT-4o is larger when GPT-4o is prompted in Serbian.
- mDeBERTa-base and XLM-RoBERTa-large exhibited strong cross-lingual performance, both reaching an F1-score of 0.83.
These results demonstrate the potential of adapting advanced transformer models to Serbian, while highlighting areas for future improvement such as refining translations and expanding domain-specific data.
Model: MilosKosRad/ScientificNLIsrb (base model: microsoft/deberta-v3-large)