---
license: apache-2.0
datasets:
- cybersectony/PhishingEmailDetection
library_name: transformers
language:
- en
base_model:
- distilbert/distilbert-base-uncased
tags:
- Phishing
- Email
- URL
- Detection
---

**A DistilBERT-based Phishing Email Detection Model**

**Model Overview**

This is a DistilBERT model fine-tuned with the Hugging Face Trainer API to detect phishing emails.

**Key Specifications**

- __Base Architecture:__ DistilBERT
- __Task:__ Multilabel Classification
- __Fine-tuning Framework:__ Hugging Face Trainer API
- __Training Duration:__ 3 epochs

**Performance Metrics**

- __F1-score:__ 97.717
- __Accuracy:__ 97.716
- __Precision:__ 97.736
- __Recall:__ 97.717

**Dataset Details**

The model was trained on the [Phishing Email Detection Dataset](https://huggingface.co/datasets/cybersectony/PhishingEmailDetection).

**Usage Guide**

**Installation**

```bash
pip install transformers
pip install torch
```

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer (replace the placeholder with the model's repository ID)
tokenizer = AutoTokenizer.from_pretrained("your-username/model-name")
model = AutoModelForSequenceClassification.from_pretrained("your-username/model-name")
model.eval()  # inference mode: disables dropout


def predict_phishing(email_text):
    # Tokenize, truncating to the 512-token limit of DistilBERT
    inputs = tokenizer(email_text, return_tensors="pt", truncation=True, max_length=512)

    # Run the model without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

    # Class index 1 is treated as the phishing label here;
    # check model.config.id2label to confirm the mapping.
    phishing_prob = float(predictions[0][1])
    return {
        "is_phishing": phishing_prob > 0.5,
        "confidence": phishing_prob,
    }


# Example usage
email = "Your email text here..."
result = predict_phishing(email)
print(f"Is Phishing: {result['is_phishing']}")
print(f"Confidence: {result['confidence']:.2%}")
```
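
**Computing the Metrics (Sketch)**

The card reports F1-score, accuracy, precision, and recall but does not say how they were computed or averaged across classes. The snippet below is a minimal sketch of one common approach, support-weighted averaging, written in plain Python so it runs without extra dependencies. The function name `classification_metrics` and the choice of weighted averaging are assumptions made for illustration, not the author's documented evaluation code.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1.

    Hypothetical helper: the averaging method is an assumption,
    since the model card does not state which one was used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    support = Counter(y_true)    # number of true examples per label
    predicted = Counter(y_pred)  # number of predictions per label
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = correct[label]
        p = tp / predicted[label] if predicted[label] else 0.0
        r = tp / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n  # weight each label by its share of true examples
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    return {
        "accuracy": sum(correct.values()) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Example usage with toy labels (0 = legitimate, 1 = phishing, assumed mapping)
labels = [0, 0, 1, 1]
preds = [0, 1, 1, 1]
for name, value in classification_metrics(labels, preds).items():
    print(f"{name}: {value:.3f}")
```

With support-weighted averaging, recall always equals accuracy, so a gap between the two in reported numbers would point to a different averaging scheme or to rounding.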