license: other
license_name: govtech-singapore
license_link: LICENSE
datasets:
- gabrielchua/off-topic
language:
- en
metrics:
- roc_auc
- f1
- precision
- recall
base_model:
- cross-encoder/stsb-roberta-base
Off-Topic Classification Model
This model is a fine-tuned cross-encoder (stsb-roberta-base) for binary classification: it determines whether a user prompt is off-topic relative to the system's intended purpose, as defined by the system prompt.
Model Highlights
- Base Model: stsb-roberta-base
- Maximum Context Length: 514 tokens
- Task: Binary classification (on-topic/off-topic)
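As a rough illustration of the binary task, the sketch below turns a pair of raw classifier scores into an on-topic/off-topic label. It assumes the classifier head emits two logits per (system prompt, user prompt) pair, that index 1 corresponds to off-topic, and that 0.5 is the decision threshold; none of these details are confirmed here, and the function names are hypothetical, so check the inference scripts for the actual output format.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.5, off_topic_index=1):
    """Return (label, off-topic probability) for one prompt pair.

    ASSUMPTION: two logits per pair, with `off_topic_index` marking
    the off-topic class. The real label order may differ.
    """
    prob_off_topic = softmax(logits)[off_topic_index]
    label = "off-topic" if prob_off_topic >= threshold else "on-topic"
    return label, prob_off_topic


# Hypothetical logits for a single prompt pair.
label, prob = classify([-1.2, 2.3])
print(label, round(prob, 3))
```

Raising the threshold trades recall for precision on the off-topic class, which may matter if false alarms are costly in your application.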
Performance
We evaluated our fine-tuned models on synthetic data that models system and user prompt pairs reflecting real-world enterprise use cases of LLMs. The dataset is available here.
Approach | Model | ROC-AUC | F1 | Precision | Recall |
---|---|---|---|---|---|
Fine-tuned bi-encoder classifier | jina-embeddings-v2-small-en | 0.99 | 0.97 | 0.99 | 0.95 |
👉 Fine-tuned cross-encoder classifier | stsb-roberta-base | 0.99 | 0.99 | 0.99 | 0.99 |
Pre-trained cross-encoder | stsb-roberta-base | 0.73 | 0.68 | 0.53 | 0.93 |
Prompt Engineering | GPT 4o (2024-08-06) | - | 0.95 | 0.94 | 0.97 |
Prompt Engineering | GPT 4o Mini (2024-07-18) | - | 0.91 | 0.85 | 0.91 |
Zero-shot Classification | GPT 4o Mini (2024-07-18) | 0.99 | 0.97 | 0.95 | 0.99 |
Further evaluation results on additional synthetic and external datasets (e.g., JailbreakBench, HarmBench, TrustLLM) are available in our technical report.
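To make the relationship between the table's columns concrete, the sketch below recomputes F1 as the harmonic mean of precision and recall for two rows. It assumes the reported F1 is the standard binary F1; because the table values are rounded to two decimals, recomputed figures for other rows may differ slightly from the reported ones.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) copied from the table above.
rows = {
    "Fine-tuned bi-encoder classifier": (0.99, 0.95),
    "Pre-trained cross-encoder": (0.53, 0.93),
}

for name, (precision, recall) in rows.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.2f}")
# Fine-tuned bi-encoder classifier: F1 = 0.97
# Pre-trained cross-encoder: F1 = 0.68
```

The pre-trained cross-encoder row shows why F1 is useful here: high recall (0.93) alone hides the fact that nearly half of its off-topic predictions are wrong (precision 0.53).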
Usage
Clone this repository and install the required dependencies:
pip install -r requirements.txt
You can run the model in one of two ways:
Option 1: Using inference_onnx.py with the ONNX model.
```
python inference_onnx.py '[
    ["System prompt example 1", "User prompt example 1"],
    ["System prompt example 2", "User prompt example 2"]
]'
```
Option 2: Using inference_safetensors.py with PyTorch and SafeTensors.
```
python inference_safetensors.py '[
    ["System prompt example 1", "User prompt example 1"],
    ["System prompt example 2", "User prompt example 2"]
]'
```
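Hand-writing the JSON argument is easy to get wrong (an unbalanced quote is enough to break parsing). The sketch below builds the argument with `json.dumps` instead. It assumes only what the commands above show, namely that each script takes a single JSON array of [system prompt, user prompt] pairs as its first argument; `build_inference_command` is a hypothetical helper, not part of the repository.

```python
import json
import shlex
import sys


def build_inference_command(script, pairs):
    """Build an argv list that passes prompt pairs as one JSON argument.

    `script` is the inference script name, e.g. "inference_onnx.py".
    `pairs` is an iterable of (system_prompt, user_prompt) tuples.
    """
    payload = []
    for pair in pairs:
        if len(pair) != 2:
            raise ValueError("each entry must be (system_prompt, user_prompt)")
        payload.append([str(pair[0]), str(pair[1])])
    return [sys.executable, script, json.dumps(payload)]


pairs = [
    ("System prompt example 1", "User prompt example 1"),
    ("System prompt example 2", "User prompt example 2"),
]
command = build_inference_command("inference_onnx.py", pairs)

# Show the shell-quoted command that would be executed.
print(shlex.join(command))

# To actually run it from the repository root, something like:
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing the command as a list keeps the JSON payload a single argument regardless of the quotes or spaces inside the prompts.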
Read more about this model in our technical report.