metadata
license: apache-2.0
datasets:
  - cybersectony/PhishingEmailDetection
library_name: transformers
language:
  - en
base_model:
  - distilbert/distilbert-base-uncased
tags:
  - Phishing
  - Email
  - URL
  - Detection

A DistilBERT-Based Phishing Email Detection Model

Model Overview

This model is a DistilBERT checkpoint fine-tuned with the Hugging Face Trainer API to detect phishing emails.

Key Specifications

  • Base Architecture: DistilBERT
  • Task: Multilabel Classification
  • Fine-tuning Framework: Hugging Face Trainer API
  • Training Duration: 3 epochs

Performance Metrics

  • F1-score: 97.717%
  • Accuracy: 97.716%
  • Precision: 97.736%
  • Recall: 97.717%
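The card does not say how these scores were averaged across classes. Recall and accuracy agreeing to within 0.001 would be consistent with support-weighted averaging, since weighted recall is mathematically equal to accuracy, but that is an assumption. The sketch below shows how such weighted scores could be computed in plain Python; the function name and the toy labels are made up for illustration.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / n
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n / total  # each class contributes in proportion to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / total
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (made up): 1 = phishing, 0 = legitimate
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

On these toy labels the weighted recall and the accuracy come out identical, which is the property noted above.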

Dataset Details

This model was trained on the cybersectony/PhishingEmailDetection dataset listed in the metadata.

Usage Guide

Installation

pip install transformers
pip install torch

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer (replace the placeholder with this model's Hub repository ID)
tokenizer = AutoTokenizer.from_pretrained("your-username/model-name")
model = AutoModelForSequenceClassification.from_pretrained("your-username/model-name")

def predict_phishing(email_text):
    # Preprocess and tokenize
    inputs = tokenizer(email_text, return_tensors="pt", truncation=True, max_length=512)
    
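    # Get class probabilities without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

    # Index 1 is treated as the phishing class; check model.config.id2label to confirm
    phishing_prob = predictions[0][1].item()
    return {
        "is_phishing": phishing_prob > 0.5,
        "confidence": phishing_prob
    }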

# Example usage
email = "Your email text here..."
result = predict_phishing(email)
print(f"Is Phishing: {result['is_phishing']}")
print(f"Confidence: {result['confidence']:.2%}")
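The softmax and threshold step inside `predict_phishing` can be illustrated without loading the model. The following is a minimal sketch in plain Python, assuming two output classes with index 1 as the phishing class, as the code above does; the helper names `softmax` and `decide` and the logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, phishing_index=1, threshold=0.5):
    """Apply the same rule as predict_phishing to a list of logits."""
    prob = softmax(logits)[phishing_index]
    return {"is_phishing": prob > threshold, "confidence": prob}


# Made-up logits where the second class scores higher
example = decide([-1.2, 2.3])
print(f"Is Phishing: {example['is_phishing']}")
print(f"Confidence: {example['confidence']:.2%}")
```

Because the comparison is a strict `>`, an email whose phishing probability is exactly 0.5 is reported as not phishing. Raising `threshold` trades recall for fewer false alarms.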