jagan-raj/PhishMail

Text Classification


jagan-raj committed on Jan 11

Commit 0daa27d · verified · 1 Parent(s): 5bba431

Update README.md

Files changed (1)

README.md +58 -0

@@ -4,7 +4,15 @@ base_model:
 - google-bert/bert-base-uncased
 datasets:
 - zefang-liu/phishing-email-dataset
 ---
 # PhishMail - BERT Model for Phishing Detection
 This repository features a fine-tuned BERT model designed to detect phishing emails.
@@ -31,4 +39,54 @@ The model is trained to classify emails as either phishing or legitimate by anal
 ```bash
 !pip install transformers torch
 ```

 - google-bert/bert-base-uncased
 datasets:
 - zefang-liu/phishing-email-dataset
+language:
+- en
+metrics:
+- accuracy
+tags:
+- security
 ---
 # PhishMail - BERT Model for Phishing Detection
 This repository features a fine-tuned BERT model designed to detect phishing emails.
 ```bash
 !pip install transformers torch
+```
+**Step 2:** Loading the Model:
+```python
+from transformers import BertForSequenceClassification, BertTokenizer
+import torch
+# Specify the Hugging Face model repository name
+model_name = 'jagan-raj/PhishMail'
+# Load the fine-tuned BERT model for phishing detection
+model = BertForSequenceClassification.from_pretrained(model_name)
+# Load the corresponding tokenizer for the fine-tuned model
+tokenizer = BertTokenizer.from_pretrained(model_name)
+# Set the model to evaluation mode for inference
+model.eval()
+```
+**Step 3:** Using the Model for Predictions:
+```python
+# Input the email text for classification
+email_text = "Your email content here"
+# Tokenize and preprocess the input text
+# Converts the email text into token IDs, applies truncation/padding, and creates a tensor
+inputs = tokenizer(
+    email_text,
+    return_tensors="pt",        # Output tensors in PyTorch format
+    truncation=True,            # Truncate the text if it exceeds the max_length
+    padding='max_length'        # Pad the text to the maximum sequence length
+)
+# Make a prediction using the model
+with torch.no_grad():           # Disable gradient calculations for faster inference
+    outputs = model(**inputs)   # Get model outputs
+    logits = outputs.logits     # Extract raw prediction scores (logits)
+    predictions = torch.argmax(logits, dim=-1)  # Determine the predicted class (0 or 1)
+# Interpret the prediction result
+# Map the prediction to its corresponding label: 1 for "Phishing", 0 for "Legitimate"
+result = "This is a phishing email." if predictions.item() == 1 else "This is a legitimate email."
+# Print the prediction result
+print(f"Prediction: {result}")
 ```
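
Review note: Step 3 reduces the logits to a hard 0/1 label with `argmax`, which hides how confident the model is. The following is a minimal sketch of turning the two logits into a label plus a softmax probability. It assumes the README's mapping (0 = legitimate, 1 = phishing); the helper name `interpret_logits` and the example logit values are hypothetical and not part of this commit.

```python
import math

# Label mapping taken from the README comment: 0 = legitimate, 1 = phishing.
LABELS = {0: "Legitimate", 1: "Phishing"}


def interpret_logits(logits):
    """Convert the raw logits for one email into (label, probability) via softmax."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the index with the highest probability (same result as argmax on logits).
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits, standing in for outputs.logits[0].tolist() from Step 3.
label, confidence = interpret_logits([-1.2, 2.3])
print(f"{label} ({confidence:.1%})")
```

With the model from Step 2 loaded, the same helper could be called as `interpret_logits(outputs.logits[0].tolist())`. A probability close to 0.5 suggests the email may deserve manual review rather than an automatic verdict.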