indo-sbert-nli-classifier-step-3

A BERT-based model fine-tuned for Natural Language Inference with a classification head.

Model Details

This model is a fine-tuned version of firqaaa/indo-sentence-bert-base for Natural Language Inference (NLI) in Indonesian. Rather than scoring sentence similarity directly, it encodes the premise and hypothesis separately, concatenates the two mean-pooled embeddings with their element-wise absolute difference, and passes the result through a linear classification head that labels the pair as entailment, neutral, or contradiction.
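As a minimal sketch of that head (a restatement of the Usage code below; the module name is illustrative and not part of the repository):

import torch
import torch.nn as nn

class NLIClassificationHead(nn.Module):
    """Linear head over [u; v; |u - v|]; illustrative name, not shipped in the repo."""
    def __init__(self, hidden_size: int, num_labels: int = 3):
        super().__init__()
        self.linear = nn.Linear(hidden_size * 3, num_labels)

    def forward(self, u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
        # u, v: mean-pooled premise / hypothesis embeddings, shape (batch, hidden_size)
        features = torch.cat([u, v, torch.abs(u - v)], dim=1)
        return self.linear(features)  # logits for entailment / neutral / contradiction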

Training Data

The model was fine-tuned on the afaji/indonli dataset, which contains Indonesian premise-hypothesis pairs labeled with entailment, neutral, or contradiction.
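For reference, the dataset can be loaded with the datasets library; the split and column names below follow the IndoNLI release on the Hub and should be checked against the dataset card:

from datasets import load_dataset

# Load the IndoNLI dataset used for fine-tuning
indonli = load_dataset("afaji/indonli")
print(indonli)  # expected splits: train, validation, test_lay, test_expert

example = indonli["train"][0]
print(example["premise"], example["hypothesis"], example["label"])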

Evaluation Results

  • Validation: loss 0.8371, accuracy 0.6331
  • Test Lay: loss 0.8810, accuracy 0.6215
  • Test Expert: loss 1.0823, accuracy 0.5137

Usage

from transformers import AutoModel, AutoTokenizer
import torch
import torch.nn as nn
import torch.nn.functional as F

# Load model and tokenizer
bert = AutoModel.from_pretrained("fabhiansan/indo-sbert-nli-classifier-step-3")
tokenizer = AutoTokenizer.from_pretrained("fabhiansan/indo-sbert-nli-classifier-step-3")

# Load classifier weights
classifier_path = "classifier.pt"  # This file is included in the model repository
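# If classifier.pt is not present locally, it can be fetched from the Hub
# (assuming the file is hosted in this model repository):
# from huggingface_hub import hf_hub_download
# classifier_path = hf_hub_download("fabhiansan/indo-sbert-nli-classifier-step-3", "classifier.pt")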
classifier = nn.Linear(bert.config.hidden_size * 3, 3)  # For entailment, neutral, contradiction
classifier.load_state_dict(torch.load(classifier_path, map_location=torch.device("cpu")))

# Function for mean pooling
def mean_pooling(token_embeddings, attention_mask):
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

# Example NLI inputs
premise = "Keindahan alam yang terdapat di Gunung Batu Jonggol ini dapat Anda manfaatkan sebagai objek fotografi yang cantik."
hypothesis = "Keindahan alam tidak dapat difoto."

# Encode inputs
encoded_premise = tokenizer(premise, padding=True, truncation=True, return_tensors="pt")
encoded_hypothesis = tokenizer(hypothesis, padding=True, truncation=True, return_tensors="pt")

# Get embeddings
with torch.no_grad():
    outputs_premise = bert(**encoded_premise)
    outputs_hypothesis = bert(**encoded_hypothesis)
    
    # Mean pooling
    embedding_premise = mean_pooling(outputs_premise.last_hidden_state, encoded_premise["attention_mask"])
    embedding_hypothesis = mean_pooling(outputs_hypothesis.last_hidden_state, encoded_hypothesis["attention_mask"])
    
    # Concatenate embeddings with element-wise difference
    diff = torch.abs(embedding_premise - embedding_hypothesis)
    concatenated = torch.cat([embedding_premise, embedding_hypothesis, diff], dim=1)
    
    # Get logits and predictions
    logits = classifier(concatenated)
    predictions = F.softmax(logits, dim=1)

# Map predictions to labels
id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}
predicted_class_id = predictions.argmax().item()
predicted_label = id2label[predicted_class_id]

print(f"Premise: {premise}")
print(f"Hypothesis: {hypothesis}")
print(f"Prediction: {predicted_label}")
print(f"Probabilities: {predictions[0].tolist()}")

Limitations and Biases

  • The model is specifically trained for Indonesian language and may not perform well on other languages or code-switched text.
  • Performance may vary on domain-specific texts that differ significantly from the training data.
  • Like all language models, this model may reflect biases present in the training data.

Citation

If you use this model in your research, please cite:

@misc{fabhiansan2025indonli,
  author = {Fabhiansan},
  title = {Fine-tuned SBERT for Indonesian Natural Language Inference},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/fabhiansan/indo-sbert-nli-classifier-step-3}}
}

And also cite the original SBERT and Indo-SBERT works:

@inproceedings{reimers-2019-sentence-bert,
  title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
  author = "Reimers, Nils and Gurevych, Iryna",
  booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
  month = "11",
  year = "2019",
  publisher = "Association for Computational Linguistics",
  url = "https://arxiv.org/abs/1908.10084",
}

@misc{arasyi2022indo,
  author = {Arasyi, Firqa},
  title = {indo-sentence-bert: Sentence Transformer for Bahasa Indonesia with Multiple Negative Ranking Loss},
  year = {2022},
  month = {9},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/firqaaa/indo-sentence-bert-base}}
}