|
---
license: apache-2.0
datasets:
- cybersectony/PhishingEmailDetection
library_name: transformers
language:
- en
base_model:
- distilbert/distilbert-base-uncased
tags:
- Phishing
- Email
- URL
- Detection
---
|
|
|
**A DistilBERT-based Phishing Email Detection Model**
|
|
|
**Model Overview** |
|
|
|
This model is a DistilBERT model fine-tuned to detect phishing emails; fine-tuning was performed with the Hugging Face Trainer API.
|
|
|
**Key Specifications**

- __Base Architecture:__ DistilBERT
- __Task:__ Multilabel Classification
- __Fine-tuning Framework:__ Hugging Face Trainer API (see the sketch below)
- __Training Duration:__ 3 epochs
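
For readers who want to reproduce a comparable setup, here is a minimal fine-tuning sketch. Only the base model, the dataset, and the 3-epoch duration come from this card; the column names, label count, batch size, and output directory are assumptions for illustration, not the original training script.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Load the dataset used for fine-tuning (see Dataset Details below)
dataset = load_dataset("cybersectony/PhishingEmailDetection")

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")

def tokenize(batch):
    # Column name "text" is an assumption; adjust to the dataset's schema.
    # Truncate long emails to DistilBERT's 512-token limit.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True)

# num_labels=2 is an assumption; the label count should match
# the dataset's label schema.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=2
)

training_args = TrainingArguments(
    output_dir="phishing-distilbert",    # assumed
    num_train_epochs=3,                  # training duration listed above
    per_device_train_batch_size=16,      # assumed
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized.get("test"),  # split name assumed
    tokenizer=tokenizer,
)
trainer.train()
```

Because a tokenizer is passed to the Trainer, its default data collator pads each batch dynamically, so no fixed-length padding is applied during tokenization.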
|
|
|
**Performance Metrics**

- __F1-score:__ 97.717%
- __Accuracy:__ 97.716%
- __Precision:__ 97.736%
- __Recall:__ 97.717%
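
The card does not state how these metrics were computed. As an illustration, a `compute_metrics` helper like the one below is a common way to report them when evaluating with the Trainer; the weighted averaging is an assumption, not a documented choice.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    # eval_pred is the (logits, labels) pair the Trainer passes at evaluation time
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    # "weighted" averaging is an assumption; the card does not state the scheme
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Passed to the Trainer via Trainer(..., compute_metrics=compute_metrics)
```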
|
|
|
**Dataset Details** |
|
|
|
This model was trained on the [Phishing Email Detection Dataset](https://huggingface.co/datasets/cybersectony/PhishingEmailDetection).
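
The dataset can be pulled directly from the Hub with the `datasets` library; inspect its splits, columns, and label schema before training, since those names are not documented here.

```python
from datasets import load_dataset

# Download the dataset and print its splits, columns, and sizes
dataset = load_dataset("cybersectony/PhishingEmailDetection")
print(dataset)
```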
|
|
|
**Usage Guide** |
|
**Installation** |
|
|
|
```bash
pip install transformers
pip install torch
```
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("your-username/model-name")
model = AutoModelForSequenceClassification.from_pretrained("your-username/model-name")

def predict_phishing(email_text):
    # Tokenize, truncating long emails to DistilBERT's 512-token limit
    inputs = tokenizer(email_text, return_tensors="pt", truncation=True, max_length=512)

    # Run inference without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

    # Index 1 is assumed to be the phishing class; check the model's
    # id2label mapping to confirm the label order.
    return {
        "is_phishing": bool(predictions[0][1] > 0.5),
        "confidence": float(predictions[0][1]),
    }

# Example usage
email = "Your email text here..."
result = predict_phishing(email)
print(f"Is Phishing: {result['is_phishing']}")
print(f"Confidence: {result['confidence']:.2%}")
```
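
As an alternative to the manual loop above, the same checkpoint can be used through the `pipeline` API. The repo id below is the same placeholder as above, and `top_k=None` returns a score for every label, which is a quick way to see the model's label names.

```python
from transformers import pipeline

# Build a text-classification pipeline around the fine-tuned checkpoint
classifier = pipeline("text-classification", model="your-username/model-name")

# truncation=True caps long emails at the model's limit;
# top_k=None returns a score for every label instead of only the top one
scores = classifier("Your email text here...", truncation=True, top_k=None)
print(scores)
```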