End of training

Files changed (8)

README.md +36 -22
config.json +5 -12
model.safetensors +2 -2
runs/Mar15_22-18-02_a974890be047/events.out.tfevents.1710541098.a974890be047.147.0 +3 -0
runs/Mar15_22-18-02_a974890be047/events.out.tfevents.1710545075.a974890be047.147.1 +3 -0
tokenizer_config.json +1 -1
training_args.bin +1 -1
vocab.txt +0 -0

README.md CHANGED

@@ -1,6 +1,6 @@
 ---
 license: apache-2.0
-base_model: bert-base-uncased
 tags:
 - generated_from_trainer
 metrics:
@@ -18,13 +18,13 @@ should probably proofread and complete it, then remove this comment. -->
 # trainer8
-This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.7360
-- Precision: 0.2889
-- Recall: 0.2738
-- F1: 0.2509
-- Accuracy: 0.2738
 ## Model description
@@ -43,35 +43,49 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 7
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
-| 2.0043        | 0.57  | 30   | 1.9531          | 0.0204    | 0.1429 | 0.0357 | 0.1429   |
-| 1.9325        | 1.13  | 60   | 1.9080          | 0.1615    | 0.2143 | 0.1460 | 0.2143   |
-| 1.8774        | 1.7   | 90   | 1.8684          | 0.3164    | 0.2143 | 0.1910 | 0.2143   |
-| 1.835         | 2.26  | 120  | 1.7834          | 0.1535    | 0.2619 | 0.1825 | 0.2619   |
-| 1.7079        | 2.83  | 150  | 1.7444          | 0.2040    | 0.2857 | 0.2180 | 0.2857   |
-| 1.6089        | 3.4   | 180  | 1.7426          | 0.3652    | 0.3214 | 0.2789 | 0.3214   |
-| 1.6118        | 3.96  | 210  | 1.7444          | 0.2285    | 0.25   | 0.1993 | 0.25     |
-| 1.5414        | 4.53  | 240  | 1.7426          | 0.2625    | 0.3095 | 0.2704 | 0.3095   |
-| 1.5095        | 5.09  | 270  | 1.7443          | 0.3820    | 0.2619 | 0.2186 | 0.2619   |
-| 1.4397        | 5.66  | 300  | 1.7368          | 0.2545    | 0.2976 | 0.2382 | 0.2976   |
-| 1.4319        | 6.23  | 330  | 1.7444          | 0.3844    | 0.2738 | 0.2566 | 0.2738   |
-| 1.4072        | 6.79  | 360  | 1.7384          | 0.3680    | 0.2857 | 0.2588 | 0.2857   |
 ### Framework versions
 - Transformers 4.38.2
-- Pytorch 2.1.0+cu121
 - Datasets 2.18.0
 - Tokenizers 0.15.2

 ---
 license: apache-2.0
+base_model: distilbert-base-cased
 tags:
 - generated_from_trainer
 metrics:
 # trainer8
+This model is a fine-tuned version of [distilbert-base-cased](https://huggingface.co/distilbert-base-cased) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.6219
+- Precision: 0.6754
+- Recall: 0.6190
+- F1: 0.6211
+- Accuracy: 0.6190
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 5e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 15
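
For orientation, the new schedule can be sketched numerically. The snippet below is a minimal sketch, not the Trainer's own code: it assumes no warmup (none is listed above), a single device, and no gradient accumulation, and it infers the steps per epoch from the step and epoch columns of the results table (step 780 at epoch 14.72). The helper name `linear_lr` is illustrative.

```python
# Sketch of a linear learning-rate decay under the listed hyperparameters.
# Assumptions (not stated in the README): warmup_steps = 0, one device,
# no gradient accumulation.

BASE_LR = 5e-05
NUM_EPOCHS = 15

# Step 780 is logged at epoch 14.72 in the results table.
steps_per_epoch = round(780 / 14.72)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step: int, total: int, base_lr: float = BASE_LR,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total - step)
    return base_lr * remaining / max(1, total - warmup_steps)


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("total steps:", total_steps)
    for step in (0, 120, 390, 780):
        print(step, linear_lr(step, total_steps))
```

With a train batch size of 8, that many steps per epoch would imply a training set of a little over 400 examples, again under the single-device assumption.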
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
+| 1.8672        | 0.57  | 30   | 1.7381          | 0.3395    | 0.3810 | 0.2691 | 0.3810   |
+| 1.5788        | 1.13  | 60   | 1.4116          | 0.3983    | 0.5    | 0.4344 | 0.5      |
+| 1.1325        | 1.7   | 90   | 1.1528          | 0.6029    | 0.6071 | 0.5755 | 0.6071   |
+| 0.7556        | 2.26  | 120  | 0.8986          | 0.6796    | 0.6310 | 0.6237 | 0.6310   |
+| 0.458         | 2.83  | 150  | 0.9989          | 0.6815    | 0.6071 | 0.5981 | 0.6071   |
+| 0.2407        | 3.4   | 180  | 1.2074          | 0.6018    | 0.5476 | 0.5200 | 0.5476   |
+| 0.2018        | 3.96  | 210  | 1.0334          | 0.7163    | 0.6786 | 0.6847 | 0.6786   |
+| 0.0545        | 4.53  | 240  | 1.2405          | 0.6544    | 0.5952 | 0.5899 | 0.5952   |
+| 0.0464        | 5.09  | 270  | 1.1513          | 0.7442    | 0.6905 | 0.6869 | 0.6905   |
+| 0.0105        | 5.66  | 300  | 1.5555          | 0.7304    | 0.6429 | 0.6344 | 0.6429   |
+| 0.025         | 6.23  | 330  | 1.3049          | 0.7119    | 0.6310 | 0.6343 | 0.6310   |
+| 0.0045        | 6.79  | 360  | 1.3200          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0036        | 7.36  | 390  | 1.4460          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0031        | 7.92  | 420  | 1.4770          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0028        | 8.49  | 450  | 1.4846          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0023        | 9.06  | 480  | 1.5149          | 0.6666    | 0.6071 | 0.6086 | 0.6071   |
+| 0.0022        | 9.62  | 510  | 1.5523          | 0.6666    | 0.6071 | 0.6086 | 0.6071   |
+| 0.002         | 10.19 | 540  | 1.5883          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0019        | 10.75 | 570  | 1.6123          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0016        | 11.32 | 600  | 1.6183          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0017        | 11.89 | 630  | 1.6112          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0016        | 12.45 | 660  | 1.6067          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0015        | 13.02 | 690  | 1.6122          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0014        | 13.58 | 720  | 1.6163          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0014        | 14.15 | 750  | 1.6194          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
+| 0.0015        | 14.72 | 780  | 1.6215          | 0.6754    | 0.6190 | 0.6211 | 0.6190   |
 ### Framework versions
 - Transformers 4.38.2
+- Pytorch 2.2.1+cu121
 - Datasets 2.18.0
 - Tokenizers 0.15.2
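
The new results table shows training loss falling to about 0.0014 while validation loss climbs from its minimum, which looks like overfitting after the first few epochs. As a minimal sketch (values transcribed from the table above; variable names are illustrative), the rows with the lowest validation loss and the highest F1 can be picked out like this:

```python
# (step, validation_loss, f1) transcribed from the new README results table.
rows = [
    (30, 1.7381, 0.2691), (60, 1.4116, 0.4344), (90, 1.1528, 0.5755),
    (120, 0.8986, 0.6237), (150, 0.9989, 0.5981), (180, 1.2074, 0.5200),
    (210, 1.0334, 0.6847), (240, 1.2405, 0.5899), (270, 1.1513, 0.6869),
    (300, 1.5555, 0.6344), (330, 1.3049, 0.6343), (360, 1.3200, 0.6211),
    (390, 1.4460, 0.6211), (420, 1.4770, 0.6211), (450, 1.4846, 0.6211),
    (480, 1.5149, 0.6086), (510, 1.5523, 0.6086), (540, 1.5883, 0.6211),
    (570, 1.6123, 0.6211), (600, 1.6183, 0.6211), (630, 1.6112, 0.6211),
    (660, 1.6067, 0.6211), (690, 1.6122, 0.6211), (720, 1.6163, 0.6211),
    (750, 1.6194, 0.6211), (780, 1.6215, 0.6211),
]

best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_by_f1 = max(rows, key=lambda r: r[2])    # highest F1
final = rows[-1]                              # last logged evaluation

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest F1:", best_by_f1)
    print("final row:", final)
```

The committed checkpoint corresponds to the end of training rather than to either of these rows. If keeping the best checkpoint is desired, `TrainingArguments` options such as `load_best_model_at_end` and `metric_for_best_model` may be worth a look; whether they were used here is not shown in this diff.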

config.json CHANGED

@@ -1,17 +1,13 @@
 {
-  "_name_or_path": "bert-base-uncased",
   "activation": "gelu",
   "architectures": [
     "DistilBertForSequenceClassification"
   ],
   "attention_dropout": 0.1,
-  "attention_probs_dropout_prob": 0.1,
   "dim": 768,
   "dropout": 0.1,
-  "gradient_checkpointing": false,
-  "hidden_act": "gelu",
   "hidden_dim": 3072,
-  "hidden_dropout_prob": 0.1,
   "id2label": {
     "0": "anger",
     "1": "fear",
@@ -22,7 +18,6 @@
     "6": "surprise"
   },
   "initializer_range": 0.02,
-  "intermediate_size": 3072,
   "label2id": {
     "LABEL_0": 0,
     "LABEL_1": 1,
@@ -32,20 +27,18 @@
     "LABEL_5": 5,
     "LABEL_6": 6
   },
-  "layer_norm_eps": 1e-12,
   "max_position_embeddings": 512,
   "model_type": "distilbert",
   "n_heads": 12,
-  "n_layers": 12,
   "pad_token_id": 0,
-  "position_embedding_type": "absolute",
   "problem_type": "single_label_classification",
   "qa_dropout": 0.1,
   "seq_classif_dropout": 0.2,
   "sinusoidal_pos_embds": false,
   "torch_dtype": "float32",
   "transformers_version": "4.38.2",
-  "type_vocab_size": 2,
-  "use_cache": true,
-  "vocab_size": 30522
 }

 {
+  "_name_or_path": "distilbert-base-cased",
   "activation": "gelu",
   "architectures": [
     "DistilBertForSequenceClassification"
   ],
   "attention_dropout": 0.1,
   "dim": 768,
   "dropout": 0.1,
   "hidden_dim": 3072,
   "id2label": {
     "0": "anger",
     "1": "fear",
     "6": "surprise"
   },
   "initializer_range": 0.02,
   "label2id": {
     "LABEL_0": 0,
     "LABEL_1": 1,
     "LABEL_5": 5,
     "LABEL_6": 6
   },
   "max_position_embeddings": 512,
   "model_type": "distilbert",
   "n_heads": 12,
+  "n_layers": 6,
+  "output_past": true,
   "pad_token_id": 0,
   "problem_type": "single_label_classification",
   "qa_dropout": 0.1,
   "seq_classif_dropout": 0.2,
   "sinusoidal_pos_embds": false,
+  "tie_weights_": true,
   "torch_dtype": "float32",
   "transformers_version": "4.38.2",
+  "vocab_size": 28996
 }
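
Note that `id2label` carries emotion names while `label2id` still uses the generic `LABEL_n` keys, so the two maps are not inverses of each other. If a consistent pair is wanted, one option is to derive `label2id` from `id2label`. The sketch below is an illustration only: the helper name is hypothetical, and it uses just the three entries visible in this diff, since the remaining labels are hidden by the hunk boundaries.

```python
# Sketch: derive label2id as the exact inverse of id2label.
# Only the entries visible in the diff are used; the others are elided there.

def invert_id2label(id2label: dict) -> dict:
    """Return {label_name: int_id} built from {str_id: label_name}."""
    label2id = {label: int(idx) for idx, label in id2label.items()}
    if len(label2id) != len(id2label):
        raise ValueError("duplicate label names in id2label")
    return label2id


visible_id2label = {"0": "anger", "1": "fear", "6": "surprise"}

if __name__ == "__main__":
    print(invert_id2label(visible_id2label))
```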

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:239de5ddc5a7f65549b2ed59583c44ecf6fae1c91e8f0fd6ab6a0522a5c469c5
-size 437968636

 version https://git-lfs.github.com/spec/v1
+oid sha256:9e1ed209bdadb683f8a13a1ef4b0cc12b9167ff7b882a2c9b044ee743b66296c
+size 263160068
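
The size change is consistent with the switch to a smaller model. As a rough back-of-the-envelope sketch, assuming every tensor is stored as float32 (per `torch_dtype` in config.json) and ignoring the small safetensors header, the file sizes translate to approximate parameter counts as follows:

```python
# Rough parameter-count estimate from the LFS pointer sizes in this diff.
# Assumption: all weights are float32 (4 bytes); the safetensors header is
# ignored, so the counts are slight overestimates.

BYTES_PER_FLOAT32 = 4

old_size = 437968636  # bytes, before this commit
new_size = 263160068  # bytes, after this commit


def approx_params(size_bytes: int) -> int:
    """Approximate number of float32 parameters in a file of this size."""
    return size_bytes // BYTES_PER_FLOAT32


ratio = new_size / old_size

if __name__ == "__main__":
    print("old:", approx_params(old_size))
    print("new:", approx_params(new_size))
    print("new/old size ratio:", round(ratio, 3))
```

This gives roughly 109 million parameters before and 66 million after, in line with `n_layers` dropping to 6.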

runs/Mar15_22-18-02_a974890be047/events.out.tfevents.1710541098.a974890be047.147.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a17ce72ada9bd102199b0da6363af0d2e32f59d5d8c936b63cbcec4add196807
+size 22805

runs/Mar15_22-18-02_a974890be047/events.out.tfevents.1710545075.a974890be047.147.1 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b6d04509f4fbcd6c4fbc048ce37ea89321510648573ed8645700ce828e14b86a
+size 560

tokenizer_config.json CHANGED

@@ -44,7 +44,7 @@
   "clean_up_tokenization_spaces": true,
   "cls_token": "[CLS]",
   "do_basic_tokenize": true,
-  "do_lower_case": true,
   "mask_token": "[MASK]",
   "model_max_length": 512,
   "never_split": null,

   "clean_up_tokenization_spaces": true,
   "cls_token": "[CLS]",
   "do_basic_tokenize": true,
+  "do_lower_case": false,
   "mask_token": "[MASK]",
   "model_max_length": 512,
   "never_split": null,

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a350d98d9c50022f698d1f446e02d7111b095a5de05b2ea056fe82e41fc3f8ad
 size 4856

 version https://git-lfs.github.com/spec/v1
+oid sha256:0b81dd5c7dc7d95c79dd1b21ce5ce5687837f9d0c7a1ff040444cc6802d45aaf
 size 4856

vocab.txt CHANGED

The diff for this file is too large to render. See raw diff