Amirhossein Nazeri committed on
Commit ba273ab · verified · 1 Parent(s): ebe7a2d

Update README.md

Files changed (1)
  1. README.md +45 -6
README.md CHANGED
@@ -14,16 +14,55 @@ model-index:
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- # spam_not_spam

- This model is a fine-tuned version of [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0414
- - Accuracy: 0.9839

  ## Model description

- More information needed

  ## Intended uses & limitations

 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

+ # RoBERTa-PEFT-ForSequenceClassification

+ This model is a fine-tuned version of [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base) on the `spam_not_spam` dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.0414
+ - Accuracy: 0.9839

  ## Model description

+ ### Performing Parameter-Efficient Fine-Tuning
+ We used low-rank adaptation (LoRA) from the Hugging Face PEFT library.
+ The base model is fine-tuned using the LoRA config below:
+ `peft_config = LoraConfig(task_type=TaskType.SEQ_CLS, inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1)`
+
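The card stops at the config itself; as a rough illustration (not part of the original card), this is how such a config is typically applied with PEFT's `get_peft_model`, assuming a two-label spam/not-spam head (`num_labels=2` is an assumption):

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

# Base checkpoint with a sequence-classification head.
# num_labels=2 is an assumption for the binary spam / not-spam task.
base_model = AutoModelForSequenceClassification.from_pretrained(
    "FacebookAI/roberta-base", num_labels=2
)

# LoRA configuration as quoted in the card above.
peft_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)

# Wrap the base model so that only the LoRA adapters (plus the new
# classification head) remain trainable.
lora_model = get_peft_model(base_model, peft_config)
lora_model.print_trainable_parameters()
```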
+ ### Training
+ Use the script below for model fine-tuning:
+
+ ```python
+ import numpy as np
+ from transformers import DataCollatorWithPadding, Trainer, TrainingArguments
+
+ def compute_metrics(eval_pred):
+     predictions, labels = eval_pred
+     predictions = np.argmax(predictions, axis=1)
+     return {"accuracy": (predictions == labels).mean()}
+
+ trainer = Trainer(
+     model=lora_model,
+     args=TrainingArguments(
+         output_dir="./data/spam_not_spam",
+         # Set the learning rate
+         learning_rate=2e-5,
+         # Set the per-device train and eval batch sizes
+         per_device_train_batch_size=16,
+         per_device_eval_batch_size=64,
+         # Evaluate and save the model after each epoch
+         evaluation_strategy="epoch",
+         save_strategy="epoch",
+         num_train_epochs=5,
+         weight_decay=0.01,
+         load_best_model_at_end=True,
+     ),
+     train_dataset=tokenized_dataset_train,
+     eval_dataset=tokenized_dataset_test,
+     tokenizer=tokenizer,
+     data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
+     compute_metrics=compute_metrics,
+ )
+
+ # Run fine-tuning
+ trainer.train()
+ ```
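The script above assumes that `tokenized_dataset_train`, `tokenized_dataset_test`, and `tokenizer` already exist; the card does not show that step. A minimal sketch of one possible preprocessing pipeline (the dataset identifier and column names below are placeholders, not taken from the card):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Tokenizer of the base checkpoint.
tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base")

def tokenize(batch):
    # Assumes the dataset exposes a "text" column and an integer "label" column.
    return tokenizer(batch["text"], truncation=True)

# "your-spam-dataset" is a placeholder; the card only refers to the data as
# `spam_not_spam`, so the actual loading call is unknown.
raw = load_dataset("your-spam-dataset")
tokenized_dataset_train = raw["train"].map(tokenize, batched=True)
tokenized_dataset_test = raw["test"].map(tokenize, batched=True)
```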
 
  ## Intended uses & limitations