indo-sbert-nli-classifier-step-3

A BERT-based model fine-tuned for Natural Language Inference with a classification head.

Model Details

This model is a fine-tuned version of firqaaa/indo-sentence-bert-base for Natural Language Inference (NLI) in Indonesian. Rather than scoring sentence similarity directly, it encodes the premise and hypothesis separately, concatenates the two mean-pooled embeddings with their element-wise absolute difference, and passes the result through a linear classification head that labels the pair as entailment, neutral, or contradiction.
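As a minimal sketch of that head (a restatement of the Usage code below; the module name is illustrative and not part of the repository):

import torch
import torch.nn as nn

class NLIClassificationHead(nn.Module):
    """Linear head over [u; v; |u - v|]; illustrative name, not shipped in the repo."""
    def __init__(self, hidden_size: int, num_labels: int = 3):
        super().__init__()
        self.linear = nn.Linear(hidden_size * 3, num_labels)

    def forward(self, u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
        # u, v: mean-pooled premise / hypothesis embeddings, shape (batch, hidden_size)
        features = torch.cat([u, v, torch.abs(u - v)], dim=1)
        return self.linear(features)  # logits for entailment / neutral / contradiction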

Training Data

The model was fine-tuned on the afaji/indonli dataset, which contains Indonesian premise-hypothesis pairs labeled with entailment, neutral, or contradiction.
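For reference, the dataset can be loaded with the datasets library; the split and column names below follow the IndoNLI release on the Hub and should be checked against the dataset card:

from datasets import load_dataset

# Load the IndoNLI dataset used for fine-tuning
indonli = load_dataset("afaji/indonli")
print(indonli)  # expected splits: train, validation, test_lay, test_expert

example = indonli["train"][0]
print(example["premise"], example["hypothesis"], example["label"])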

Evaluation Results

  • Validation: loss 0.8371, accuracy 0.6331
  • Test Lay: loss 0.8810, accuracy 0.6215
  • Test Expert: loss 1.0823, accuracy 0.5137

Usage

from transformers import AutoModel, AutoTokenizer
import torch
import torch.nn as nn
import torch.nn.functional as F

# Load model and tokenizer
bert = AutoModel.from_pretrained("fabhiansan/indo-sbert-nli-classifier-step-3")
tokenizer = AutoTokenizer.from_pretrained("fabhiansan/indo-sbert-nli-classifier-step-3")

# Load classifier weights
classifier_path = "classifier.pt"  # This file is included in the model repository
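# If classifier.pt is not present locally, it can be fetched from the Hub
# (assuming the file is hosted in this model repository):
# from huggingface_hub import hf_hub_download
# classifier_path = hf_hub_download("fabhiansan/indo-sbert-nli-classifier-step-3", "classifier.pt")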
classifier = nn.Linear(bert.config.hidden_size * 3, 3)  # For entailment, neutral, contradiction
classifier.load_state_dict(torch.load(classifier_path, map_location=torch.device("cpu")))

# Function for mean pooling
def mean_pooling(token_embeddings, attention_mask):
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

# Example NLI inputs
premise = "Keindahan alam yang terdapat di Gunung Batu Jonggol ini dapat Anda manfaatkan sebagai objek fotografi yang cantik."
hypothesis = "Keindahan alam tidak dapat difoto."

# Encode inputs
encoded_premise = tokenizer(premise, padding=True, truncation=True, return_tensors="pt")
encoded_hypothesis = tokenizer(hypothesis, padding=True, truncation=True, return_tensors="pt")

# Get embeddings
with torch.no_grad():
    outputs_premise = bert(**encoded_premise)
    outputs_hypothesis = bert(**encoded_hypothesis)
    
    # Mean pooling
    embedding_premise = mean_pooling(outputs_premise.last_hidden_state, encoded_premise["attention_mask"])
    embedding_hypothesis = mean_pooling(outputs_hypothesis.last_hidden_state, encoded_hypothesis["attention_mask"])
    
    # Concatenate embeddings with element-wise difference
    diff = torch.abs(embedding_premise - embedding_hypothesis)
    concatenated = torch.cat([embedding_premise, embedding_hypothesis, diff], dim=1)
    
    # Get logits and predictions
    logits = classifier(concatenated)
    predictions = F.softmax(logits, dim=1)

# Map predictions to labels
id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}
predicted_class_id = predictions.argmax().item()
predicted_label = id2label[predicted_class_id]

print(f"Premise: {premise}")
print(f"Hypothesis: {hypothesis}")
print(f"Prediction: {predicted_label}")
print(f"Probabilities: {predictions[0].tolist()}")

Limitations and Biases

  • The model is specifically trained for Indonesian language and may not perform well on other languages or code-switched text.
  • Performance may vary on domain-specific texts that differ significantly from the training data.
  • Like all language models, this model may reflect biases present in the training data.

Citation

If you use this model in your research, please cite:

@misc{fabhiansan2025indonli,
  author = {Fabhiansan},
  title = {Fine-tuned SBERT for Indonesian Natural Language Inference},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/fabhiansan/indo-sbert-nli-classifier-step-3}}
}

And also cite the original SBERT and Indo-SBERT works:

@inproceedings{reimers-2019-sentence-bert,
  title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
  author = "Reimers, Nils and Gurevych, Iryna",
  booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
  month = "11",
  year = "2019",
  publisher = "Association for Computational Linguistics",
  url = "https://arxiv.org/abs/1908.10084",
}

@misc{arasyi2022indo,
  author = {Arasyi, Firqa},
  title = {indo-sentence-bert: Sentence Transformer for Bahasa Indonesia with Multiple Negative Ranking Loss},
  year = {2022},
  month = {9},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/firqaaa/indo-sentence-bert-base}}
}