Xmm/led-large-16384-govreport

@@ -1,11 +1,27 @@
 ---
 tags:
 - generated_from_trainer
 datasets:
 - govreport-summarization
 model-index:
 - name: led-large-16384-govreport
-  results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -13,7 +29,13 @@ should probably proofread and complete it, then remove this comment. -->
 # led-large-16384-govreport
-This model is a fine-tuned version of [Xmm/led-large-16384-govreport](https://huggingface.co/Xmm/led-large-16384-govreport) on the govreport-summarization dataset.
 ## Model description
@@ -36,15 +58,23 @@ The following hyperparameters were used during training:
 - train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 5
 ### Framework versions
 - Transformers 4.30.2
-- Pytorch 2.0.1+cu118
-- Datasets 2.13.0
 - Tokenizers 0.13.3

 ---
+license: apache-2.0
 tags:
 - generated_from_trainer
 datasets:
 - govreport-summarization
+metrics:
+- rouge
 model-index:
 - name: led-large-16384-govreport
+  results:
+  - task:
+      name: Sequence-to-sequence Language Modeling
+      type: text2text-generation
+    dataset:
+      name: govreport-summarization
+      type: govreport-summarization
+      config: document
+      split: validation
+      args: document
+    metrics:
+    - name: Rouge1
+      type: rouge
+      value: 0.5194151586540673
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 # led-large-16384-govreport
+This model is a fine-tuned version of [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384) on the govreport-summarization dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.7624
+- Rouge1: 0.5194
+- Rouge2: 0.2107
+- Rougel: 0.2437
+- Rougelsum: 0.2437
 ## Model description
 - train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42
+- gradient_accumulation_steps: 64
+- total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 5
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
+| 1.8152        | 3.65  | 500  | 1.7956          | 0.5095 | 0.2040 | 0.2382 | 0.2381    |
+| 1.6981        | 3.66  | 1000 | 1.7624          | 0.5194 | 0.2107 | 0.2437 | 0.2437    |
 ### Framework versions
 - Transformers 4.30.2
+- Pytorch 1.10.0+cu102
+- Datasets 2.13.1
 - Tokenizers 0.13.3
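
The batch-size hyperparameters in both versions of the card are linked by simple arithmetic: the total train batch size is the per-device batch size times the gradient accumulation steps times the number of devices. A minimal sketch of that relationship follows. The function name and the `num_devices` parameter are hypothetical, and a single device is assumed because the card does not state the device count.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the old and new versions of the card.
# num_devices=1 is an assumption, not something the card reports.
old_total = effective_batch_size(1, 8)
new_total = effective_batch_size(1, 64)
print(old_total, new_total)
```

Under that assumption the result matches the reported `total_train_batch_size` on both sides of the diff, so the change amounts to accumulating gradients over eight times as many examples per update.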

generation_config.json

@@ -8,6 +8,5 @@
   "min_length": 100,
   "no_repeat_ngram_size": 3,
   "pad_token_id": 1,
-  "transformers_version": "4.30.2",
-  "use_cache": false
 }

   "min_length": 100,
   "no_repeat_ngram_size": 3,
   "pad_token_id": 1,
+  "transformers_version": "4.30.2"
 }
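
The `no_repeat_ngram_size: 3` setting kept in this config means that a generated summary should not contain the same three-token sequence twice. The sketch below only illustrates that constraint by checking a finished token list. It is not how the generation library enforces it during decoding, and the helper name `repeated_ngrams` is hypothetical.

```python
def repeated_ngrams(tokens, n):
    """Return the set of n-grams that occur more than once in tokens."""
    seen, repeats = set(), set()
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        if gram in seen:
            repeats.add(gram)
        seen.add(gram)
    return repeats


# Whitespace-split words stand in for model tokens in this illustration.
tokens = "the report finds that the report finds gaps".split()
violations = repeated_ngrams(tokens, 3)
print(violations)
```

An output that respects the setting would yield an empty set for `n=3`.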

pytorch_model.bin

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a5e3f08ae57252d5995e179bf72292d15cefb08808e2fb7a22a875b822d73068
 size 647678513

 version https://git-lfs.github.com/spec/v1
+oid sha256:4c87bfe49260a0c04737f255f95c67808e168551376db12181fea52d5064079a
 size 647678513
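
Because only the `oid` changed while `size` stayed the same, the pointer diff shows new weights with an identical byte count. A downloaded file can be checked against such a pointer by comparing its size and SHA-256 digest. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and any local path passed to `matches_pointer` is an assumption about where the file was saved.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer: dict) -> bool:
    """Check a local file's size and digest against a parsed pointer."""
    algo, _, expected = pointer["oid"].partition(":")
    digest = hashlib.new(algo)
    size = 0
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return size == int(pointer["size"]) and digest.hexdigest() == expected


# New-side pointer from the diff above.
pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:4c87bfe49260a0c04737f255f95c67808e168551376db12181fea52d5064079a
size 647678513
""")
```

Calling `matches_pointer("pytorch_model.bin", pointer)` on a local copy would then report whether that copy corresponds to the new revision.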