End of training

Files changed (6)

README.md CHANGED

@@ -5,8 +5,6 @@ tags:
 - generated_from_trainer
 datasets:
 - glue
-metrics:
-- accuracy
 model-index:
 - name: deberta-v2-xxlarge-finetuned-sst2
   results: []
@@ -18,9 +16,6 @@ should probably proofread and complete it, then remove this comment. -->
 # deberta-v2-xxlarge-finetuned-sst2
 This model is a fine-tuned version of [microsoft/deberta-v2-xxlarge](https://huggingface.co/microsoft/deberta-v2-xxlarge) on the glue dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.6968
-- Accuracy: 0.5092
 ## Model description
@@ -39,21 +34,14 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0005
-- train_batch_size: 4
-- eval_batch_size: 4
 - seed: 0
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.06
-- num_epochs: 1
-### Training results
-| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
-|:-------------:|:-----:|:-----:|:---------------:|:--------:|
-| 0.679         | 1.0   | 16838 | 0.6968          | 0.5092   |
 ### Framework versions

 - generated_from_trainer
 datasets:
 - glue
 model-index:
 - name: deberta-v2-xxlarge-finetuned-sst2
   results: []
 # deberta-v2-xxlarge-finetuned-sst2
 This model is a fine-tuned version of [microsoft/deberta-v2-xxlarge](https://huggingface.co/microsoft/deberta-v2-xxlarge) on the glue dataset.
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 6e-05
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 0
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.06
+- training_steps: 50
 ### Framework versions
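
The updated hyperparameters pair `lr_scheduler_type: linear` and `lr_scheduler_warmup_ratio: 0.06` with `learning_rate: 6e-05` and `training_steps: 50`. As a rough illustration of what those values imply, the sketch below computes the per-step learning rate for a linear warmup followed by linear decay. It assumes the warmup length is `ceil(warmup_ratio * training_steps)` and that the rate decays to zero at the final step. The function names are hypothetical and are not taken from the training script.

```python
import math

# Values copied from the updated README hyperparameters.
BASE_LR = 6e-05
TRAINING_STEPS = 50
WARMUP_RATIO = 0.06


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Number of warmup steps, assuming the ratio is rounded up."""
    return math.ceil(total_steps * warmup_ratio)


def linear_lr(step: int, base_lr: float, total_steps: int, warmup: int) -> float:
    """Learning rate at `step` for linear warmup then linear decay to zero."""
    if step < warmup:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup)


if __name__ == "__main__":
    warmup = warmup_steps(TRAINING_STEPS, WARMUP_RATIO)
    for step in (0, 1, warmup, 25, TRAINING_STEPS):
        print(step, linear_lr(step, BASE_LR, TRAINING_STEPS, warmup))
```

Under these assumptions the warmup covers only the first three steps, so nearly all of the 50-step run is spent on the decay phase.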

adapter_config.json CHANGED

@@ -7,16 +7,15 @@
   "init_lora_weights": true,
   "layers_pattern": null,
   "layers_to_transform": null,
-  "lora_alpha": 16,
   "lora_dropout": 0.1,
   "modules_to_save": null,
   "peft_type": "LORA",
-  "r": 8,
   "revision": null,
   "target_modules": [
     "query_proj",
-    "value_proj",
-    "key_proj"
   ],
   "task_type": "SEQ_CLS"
 }

   "init_lora_weights": true,
   "layers_pattern": null,
   "layers_to_transform": null,
+  "lora_alpha": 32,
   "lora_dropout": 0.1,
   "modules_to_save": null,
   "peft_type": "LORA",
+  "r": 16,
   "revision": null,
   "target_modules": [
     "query_proj",
+    "value_proj"
   ],
   "task_type": "SEQ_CLS"
 }

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c8974ad25b0368536ed4c7163257b08af86e4bf6ab86a9fc03f6e13de6f4a3e7
-size 14277333

 version https://git-lfs.github.com/spec/v1
+oid sha256:e0c611e560fa76ee60c6b1eb4bb2b3016a02b1bfbed69de576e06662311ea5bb
+size 18959701
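
The growth of `adapter_model.bin` from 14277333 to 18959701 bytes is consistent with the `adapter_config.json` change (rank 8 on three projections versus rank 16 on two). The sketch below is a back-of-the-envelope estimate, not a description of how the file was produced. It assumes a hidden size of 1536 and 48 encoder layers for `microsoft/deberta-v2-xxlarge`, square projection matrices, and 4-byte (fp32) weights. The helper name is hypothetical.

```python
# Assumptions (not stated in the diff): model dimensions and weight dtype.
HIDDEN_SIZE = 1536    # assumed hidden size of deberta-v2-xxlarge
NUM_LAYERS = 48       # assumed number of encoder layers
BYTES_PER_PARAM = 4   # assumed fp32 storage


def lora_param_count(rank: int, num_target_modules: int,
                     hidden: int = HIDDEN_SIZE, layers: int = NUM_LAYERS) -> int:
    """Parameters added by LoRA: each target module gets A (r x d) and B (d x r)."""
    per_module = rank * (hidden + hidden)
    return per_module * num_target_modules * layers


# Old config: r=8 on query_proj, value_proj, key_proj.
old_params = lora_param_count(rank=8, num_target_modules=3)
# New config: r=16 on query_proj, value_proj.
new_params = lora_param_count(rank=16, num_target_modules=2)

old_bytes = old_params * BYTES_PER_PARAM
new_bytes = new_params * BYTES_PER_PARAM

if __name__ == "__main__":
    print("old:", old_params, "params,", old_bytes, "bytes")
    print("new:", new_params, "params,", new_bytes, "bytes")
```

Both estimates fall within about 1% of the sizes recorded in the LFS pointers. The remainder is presumably the classification head and serialization overhead, which this sketch does not model.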

tokenizer.json CHANGED

@@ -2,7 +2,7 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 512,
     "strategy": "LongestFirst",
     "stride": 0
   },

   "version": "1.0",
   "truncation": {
     "direction": "Right",
+    "max_length": 128,
     "strategy": "LongestFirst",
     "stride": 0
   },

tokenizer_config.json CHANGED

@@ -48,7 +48,7 @@
   "do_lower_case": false,
   "eos_token": "[SEP]",
   "mask_token": "[MASK]",
-  "model_max_length": 512,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "sp_model_kwargs": {},

   "do_lower_case": false,
   "eos_token": "[SEP]",
   "mask_token": "[MASK]",
+  "model_max_length": 128,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "sp_model_kwargs": {},

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:629b3d3bc6e861c6a20bc29a33f93a447d1041d2ecfed4bcfa2a7bb3966c568d
 size 4091

 version https://git-lfs.github.com/spec/v1
+oid sha256:8f6c0441a22785abe6b3e8322f3514ad43abd0811ceee57259be27856c8701ae
 size 4091