---
license: apache-2.0
datasets:
- cybersectony/PhishingEmailDetection
library_name: transformers
language:
- en
base_model:
- distilbert/distilbert-base-uncased
tags:
- Phishing
- Email
- URL
- Detection
---
**A DistilBERT-Based Phishing Email Detection Model**
**Model Overview**
This model is a DistilBERT model fine-tuned with the Hugging Face Trainer API to detect phishing emails.
**Key Specifications**
- __Base Architecture:__ DistilBERT
- __Task:__ Multilabel Classification
- __Fine-tuning Framework:__ Hugging Face Trainer API
- __Training Duration:__ 3 epochs
**Performance Metrics**
- __F1-score:__ 97.717%
- __Accuracy:__ 97.716%
- __Precision:__ 97.736%
- __Recall:__ 97.717%
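
The card does not say how precision, recall, and F1 were averaged across classes. Recall and accuracy being nearly identical is consistent with support-weighted averaging, since weighted recall always equals accuracy, but that is an inference rather than a documented fact. The following is a minimal sketch of how such weighted metrics can be computed from label lists; the function name `weighted_metrics` and the toy labels are illustrative assumptions, not part of the original evaluation.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / total

    precision = recall = f1 = 0.0
    for label in sorted(support):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        p_l = tp / (tp + fp) if tp + fp else 0.0
        r_l = tp / (tp + fn) if tp + fn else 0.0
        f_l = 2 * p_l * r_l / (p_l + r_l) if p_l + r_l else 0.0
        # Weight each class by its share of the true labels.
        weight = support[label] / total
        precision += weight * p_l
        recall += weight * r_l
        f1 += weight * f_l

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (not taken from the dataset).
scores = weighted_metrics([0, 0, 1, 1], [0, 1, 1, 1])
print({name: round(value, 4) for name, value in scores.items()})
```
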
**Dataset Details**
This model was trained on the [Phishing Email Detection Dataset](https://huggingface.co/datasets/cybersectony/PhishingEmailDetection).
**Usage Guide**
**Installation**
```bash
pip install transformers
pip install torch
```
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("your-username/model-name")
model = AutoModelForSequenceClassification.from_pretrained("your-username/model-name")


def predict_phishing(email_text):
    # Preprocess and tokenize (inputs longer than 512 tokens are truncated)
    inputs = tokenizer(email_text, return_tensors="pt", truncation=True, max_length=512)

    # Get prediction without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

    # Index 1 is treated as the phishing class
    return {
        "is_phishing": bool(predictions[0][1] > 0.5),
        "confidence": float(predictions[0][1])
    }


# Example usage
email = "Your email text here..."
result = predict_phishing(email)
print(f"Is Phishing: {result['is_phishing']}")
print(f"Confidence: {result['confidence']:.2%}")
```
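
The example above hard-codes index 1 as the phishing class. A less brittle option is to look up the class name for the highest-scoring index instead. The following is a minimal, dependency-free sketch of that step: the helper names `softmax` and `top_label` and the two-class mapping are hypothetical, and the real mapping should be read from the loaded model's `model.config.id2label`.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return the most likely label, its probability, and all class scores."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "label": id2label[best],
        "confidence": probs[best],
        "probabilities": {id2label[i]: p for i, p in enumerate(probs)},
    }


# Hypothetical mapping for illustration; check model.config.id2label
# on the real model before relying on any index.
example_id2label = {0: "legitimate", 1: "phishing"}
result = top_label([0.0, 2.0], example_id2label)
print(f"{result['label']}: {result['confidence']:.2%}")
```

With the real model, the same helper could be called as `top_label(outputs.logits[0].tolist(), model.config.id2label)`, which avoids assuming which index corresponds to phishing.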