srbNLI: Serbian Natural Language Inference Model

Model Overview

srbNLI is a Natural Language Inference (NLI) model for Serbian, created by fine-tuning DeBERTa-v3-large on srbSciFact, an automatically translated version of the SciFact dataset. It is trained to recognize relationships between claims and evidence in Serbian text, with applications in scientific claim verification and potential expansion to broader claim verification tasks.

Key Details

  • Model Type: Transformer-based (DeBERTa-v3-large)
  • Language: Serbian
  • Task: Natural Language Inference (NLI), Textual Entailment, Claim Verification
  • Dataset: srbSciFact (automatically translated SciFact dataset)
  • Fine-tuning: Serbian NLI data with support, contradiction, and neutral labels
  • Metrics: Accuracy, Precision, Recall, F1-score
  • Model Size: 435M parameters (F32, Safetensors)
  • Repository: MilosKosRad/ScientificNLIsrb
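
A minimal inference sketch with the Hugging Face transformers library is shown below. The claim/evidence input order and the label names are assumptions and should be verified against the model's config.json (id2label):

```python
# Minimal inference sketch. The claim/evidence input order and the label
# names are assumptions; check the model's config.json (id2label) to confirm.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "MilosKosRad/ScientificNLIsrb"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

claim = "Vakcinacija smanjuje rizik od hospitalizacije."                      # claim
evidence = "Studija je pokazala pad broja hospitalizacija kod vakcinisanih."  # evidence

inputs = tokenizer(claim, evidence, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print(model.config.id2label[pred])  # e.g. "support" / "neutral" / "contradiction"
```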

Motivation

This model addresses the lack of NLI datasets and models for Serbian, a low-resource language. It provides a tool for textual entailment and claim verification, especially for scientific claims, with broader potential for misinformation detection and automated fact-checking.

Training

  • Base Model: DeBERTa-v3-large (additional models were fine-tuned for comparison; see Results Comparison below)
  • Training Data: srbSciFact, the automatically translated SciFact dataset
  • Fine-tuning: Conducted on a single NVIDIA A100 GPU (40 GB) on a DGX system
  • Hyperparameters: Learning rate, batch size, weight decay, and number of epochs were tuned, with early stopping; a fine-tuning sketch follows this list
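
The exact hyperparameter values are not listed here, so the following is only a sketch of how such a run could be set up with the transformers Trainer. All numeric values are placeholders, and the two-example dataset merely stands in for the srbSciFact splits:

```python
# Fine-tuning sketch with early stopping. Hyperparameter values are
# placeholders, NOT the configuration actually used for srbNLI, and the
# tiny in-memory dataset only stands in for the srbSciFact splits.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification, AutoTokenizer,
    EarlyStoppingCallback, Trainer, TrainingArguments,
)

model_name = "microsoft/deberta-v3-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Stand-in data; label mapping 0=support, 1=neutral, 2=contradiction is assumed.
raw = Dataset.from_dict({
    "claim": ["Vakcinacija smanjuje rizik od hospitalizacije.",
              "Antibiotici leče virusne infekcije."],
    "evidence": ["Studija je pokazala pad broja hospitalizacija kod vakcinisanih.",
                 "Antibiotici ne deluju na viruse."],
    "label": [0, 2],
})

def tokenize(batch):
    return tokenizer(batch["claim"], batch["evidence"], truncation=True)

ds = raw.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="srbnli-deberta",
    learning_rate=1e-5,               # placeholder
    per_device_train_batch_size=8,    # placeholder
    weight_decay=0.01,                # placeholder
    num_train_epochs=10,              # placeholder; early stopping may end sooner
    eval_strategy="epoch",            # "evaluation_strategy" in older transformers
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=ds,
    eval_dataset=ds,                  # reuse of the stand-in split is for illustration only
    tokenizer=tokenizer,              # enables padding via the default data collator
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```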

Evaluation

The model was evaluated using standard NLI metrics (accuracy, precision, recall, and F1-score). It was also compared against the GPT-4o model to assess generalization capabilities.
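
As an illustration, these metrics can be computed with scikit-learn. This sketch assumes predictions and gold labels are available as integer class labels; the macro averaging scheme is an assumption, not necessarily the one used in the reported results:

```python
# Illustrative metric computation with scikit-learn. y_true / y_pred are toy
# integer labels over {support, neutral, contradiction}; macro averaging is
# an assumption about how the reported P/R/F1 were aggregated.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 1, 2, 0, 2]   # gold labels (toy example)
y_pred = [0, 1, 1, 0, 2]   # model predictions (toy example)

acc = accuracy_score(y_true, y_pred)
p, r, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"Acc={acc:.2f}  P={p:.2f}  R={r:.2f}  F1={f1:.2f}")
```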

Use Cases

  • Claim Verification: Scientific claims and general domain claims in Serbian
  • Misinformation Detection: Identifying contradictions or support between claims and evidence
  • Cross-lingual Applications: Potential for cross-lingual claim verification with multilingual models

Future Work

  • Improving accuracy with human-corrected translations and Serbian-specific datasets
  • Expanding to general-domain claim verification
  • Enhancing multilingual NLI capabilities

Results Comparison

The table below compares the fine-tuned models (DeBERTa-v3-large, RoBERTa-large, BERTić, and several multilingual baselines) and GPT-4o on the srbSciFact dataset, reporting the key metrics: Accuracy (Acc), Precision (P), Recall (R), and F1-score (F1). The models were evaluated on their ability to classify relationships between claims and evidence in Serbian text.

| Model | Accuracy (Acc) | Precision (P) | Recall (R) | F1-score (F1) |
|---|---|---|---|---|
| DeBERTa-v3-large | 0.70 | 0.86 | 0.82 | 0.84 |
| RoBERTa-large | 0.57 | 0.63 | 0.76 | 0.69 |
| BERTić (Serbian) | 0.56 | 0.56 | 0.37 | 0.44 |
| GPT-4o (English) | 0.66 | 0.70 | 0.77 | 0.78 |
| mDeBERTa-base | 0.63 | 0.92 | 0.75 | 0.83 |
| XLM-RoBERTa-large | 0.64 | 0.89 | 0.77 | 0.83 |
| mBERT-cased | 0.48 | 0.76 | 0.50 | 0.60 |
| mBERT-uncased | 0.57 | 0.45 | 0.61 | 0.52 |

Observations

  • DeBERTa-v3-large performed the best overall, with an accuracy of 0.70 and an F1-score of 0.84.
  • RoBERTa-large and BERTić showed lower performance, with BERTić's recall (0.37) particularly weak, suggesting challenges in handling complex linguistic inference in Serbian.
  • GPT-4o performed better when prompted in English than in Serbian, but the fine-tuned DeBERTa-v3-large model still achieved the highest F1-score overall.
  • mDeBERTa-base and XLM-RoBERTa-large exhibited strong cross-lingual performance, both reaching an F1-score of 0.83.

These results demonstrate the potential of adapting advanced transformer models to Serbian, while highlighting areas for future improvement, such as refining translations and expanding domain-specific data.
