---
datasets:
  - sarahwei/cyber_MITRE_tactic_CTI_dataset_v16
language:
  - en
metrics:
  - accuracy
base_model:
  - bencyc1129/mitre-bert-base-cased
pipeline_tag: text-classification
library_name: transformers
---

# MITRE-v16-tactic-bert-case-based

This model is a fine-tuned version of [mitre-bert-base-cased](https://huggingface.co/bencyc1129/mitre-bert-base-cased) on the MITRE ATT&CK version 16 procedure dataset.

## Intended uses & limitations

You can use the fine-tuned model for text classification: it identifies the tactic(s) in the MITRE ATT&CK framework that a sentence describes. A single sentence or attack may fall under several tactics, so this is a multi-label task.

Note that this model is fine-tuned specifically for text classification in the cybersecurity domain. It may not perform well on sentences that are unrelated to attacks.

## How to use

You can use the model with PyTorch.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import numpy as np
import torch

model_id = "sarahwei/MITRE-v16-tactic-bert-case-based"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)

question = "An attacker performs a SQL injection."
inputs = tokenizer(question, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# Multi-label decoding: apply a sigmoid to each logit and keep every
# tactic whose probability reaches the 0.5 threshold.
sigmoid = torch.nn.Sigmoid()
probs = sigmoid(logits.squeeze().cpu()).float().numpy()
predictions = np.zeros(probs.shape)
predictions[np.where(probs >= 0.5)] = 1
predicted_labels = [
    model.config.id2label[idx] for idx, label in enumerate(predictions) if label == 1.0
]
print(predicted_labels)
```
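Alternatively, the same model can be called through the high-level `pipeline` API. This is a minimal sketch, assuming the model config sets `problem_type="multi_label_classification"` so the pipeline applies a sigmoid rather than a softmax; `top_k=None` returns a score for every tactic label:

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="sarahwei/MITRE-v16-tactic-bert-case-based",
    top_k=None,  # return a score for every label, not just the top one
)

# Passing a list yields one list of {label, score} dicts per input sentence.
results = classifier(["An attacker performs a SQL injection."])
predicted = [r["label"] for r in results[0] if r["score"] >= 0.5]
print(predicted)
```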

## Training procedure

### Training parameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 2e-5
- train_batch_size: 32
- eval_batch_size: 32
- seed: 0
- num_epochs: 5
- warmup_ratio: 0.01
- weight_decay: 0.001
- optim: adamw_8bit
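
These values map directly onto `transformers.TrainingArguments`. A minimal sketch, assuming single-device training (so the per-device batch sizes equal the listed batch sizes) and an illustrative `output_dir`:

```python
from transformers import TrainingArguments

# Sketch only: output_dir is hypothetical; every other value comes
# from the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="mitre-v16-tactic-bert",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=0,
    num_train_epochs=5,
    warmup_ratio=0.01,
    weight_decay=0.001,
    optim="adamw_8bit",  # 8-bit AdamW (bitsandbytes)
)
```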

### Training results

- global_step: 1755
- train_runtime: 315.2685
- train_samples_per_second: 177.722
- train_steps_per_second: 5.567
- total_flos: 7371850396784640.0
- train_loss: 0.06630994546787013
| Step | Training Loss | Validation Loss | Accuracy |
|-----:|--------------:|----------------:|---------:|
| 500  | 0.149800      | 0.061355        | 0.986081 |
| 1000 | 0.043700      | 0.046901        | 0.988223 |
| 1500 | 0.027700      | 0.043031        | 0.988707 |
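
The card does not specify how accuracy is computed for multi-label outputs; one common convention is element-wise accuracy over the thresholded label matrix. A sketch of that convention, offered as an assumption rather than the card's documented metric:

```python
import numpy as np

def multilabel_accuracy(logits: np.ndarray, labels: np.ndarray, threshold: float = 0.5) -> float:
    """Element-wise accuracy: fraction of (sample, tactic) slots predicted correctly.

    NOTE: this is an assumed convention; the original evaluation code is not published.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))     # sigmoid per tactic
    preds = (probs >= threshold).astype(int)  # independent 0/1 decision per tactic
    return float((preds == labels).mean())
```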