license: apache-2.0
tags:
- generated_from_trainer
- LLM
- FLAN
- NLP
metrics:
- rouge
model-index:
- name: output
results: []
language:
- en
library_name: transformers
pipeline_tag: text2text-generation
Legal Flan-T5-Base
This model is a fine-tuned version of google/flan-t5-base on the LegalBench dataset. It achieves the following results on the evaluation set:
- Loss: 1.1885
- Rouge1: 65.4762
- Rouge2: 0.0
- Rougel: 65.4762
- Rougelsum: 65.4762
- Gen Len: 2.1905
Model description
We fine-tune the Flan-T5-Base LLM on LegalBench.
Prompt
Format the prompt as the question followed by the clause, separated by a single space: {{Question}} {{Clause}}
Question: Does the clause grant one party an “enterprise,” “all you can eat” or unlimited usage license?
Clause: Except as the parties may otherwise agree in writing, Converge, to the extent it has the legal right to do so, hereby grants to Vert an irrevocable, perpetual, world-wide, non-exclusive right and license to use, load, store, transmit, execute, copy, market, distribute, in any medium or distribution technology whatsoever, known or unknown, display, perform and sublicense the Converge-Independent Materials and the Third-Party Materials, in both Source Code and Object Code formats, and to make unlimited instantiations thereof, for any and all purposes.
Prompt: Does the clause grant one party an “enterprise,” “all you can eat” or unlimited usage license? Except as the parties may otherwise agree in writing, Converge, to the extent it has the legal right to do so, hereby grants to Vert an irrevocable, perpetual, world-wide, non-exclusive right and license to use, load, store, transmit, execute, copy, market, distribute, in any medium or distribution technology whatsoever, known or unknown, display, perform and sublicense the Converge-Independent Materials and the Third-Party Materials, in both Source Code and Object Code formats, and to make unlimited instantiations thereof, for any and all purposes.
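The prompt format above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: `build_prompt` and `answer` are hypothetical helper names, the `model_id` argument is a placeholder you must fill in with wherever this checkpoint is hosted, and the `max_new_tokens=8` limit is an assumption based on the short generation lengths reported in this card.

```python
def build_prompt(question: str, clause: str) -> str:
    """Join question and clause with a single space: '{{Question}} {{Clause}}'."""
    return f"{question.strip()} {clause.strip()}"


def answer(question: str, clause: str, model_id: str) -> str:
    """Run one prompt through a seq2seq checkpoint.

    `model_id` is a placeholder: pass the Hub ID or local path of this
    fine-tuned model. The import is deferred so that build_prompt can be
    used without transformers installed.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(question, clause), return_tensors="pt")
    # Assumption: answers are very short, so a small generation budget suffices.
    output_ids = model.generate(**inputs, max_new_tokens=8)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Short stand-in texts, used only to show the joined format.
example = build_prompt("Does the clause grant a license?", "Converge grants to Vert a license.")
print(example)
```

Calling `answer(question, clause, model_id)` with the full question and clause from the example above should return the model's short answer, provided `model_id` points at this checkpoint.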
Intended uses & limitations
More information needed
Training and evaluation data
We used LegalBench for training and evaluation.
Training procedure
Tutorial: Finetune Flan-T5 with Baseten.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 20
- num_epochs: 50
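To make the learning-rate settings concrete, the sketch below computes the rate at a given optimizer step for a linear schedule with warmup. It assumes the `linear` scheduler type ramps up linearly over the 20 warmup steps and then decays linearly to zero, and it takes the total of 2100 steps from the last row of the training results table (42 steps per epoch over 50 epochs). `linear_lr` is a hypothetical helper name, not part of any library.

```python
def linear_lr(step: int, base_lr: float = 3e-5,
              warmup_steps: int = 20, total_steps: int = 2100) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: scale from 0 up to base_lr.
        factor = step / max(1, warmup_steps)
    else:
        # Decay: scale from base_lr down to 0 at total_steps.
        factor = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * factor


for step in (0, 10, 20, 1060, 2100):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at 3e-05 at step 20, is half of that at step 1060, and reaches zero at step 2100.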
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
1.2679 | 1.0 | 42 | 1.3033 | 48.8095 | 0.0 | 48.8095 | 48.8095 | 4.0119 |
1.0917 | 2.0 | 84 | 1.1075 | 48.8095 | 0.0 | 48.8095 | 48.8095 | 2.2738 |
0.8305 | 3.0 | 126 | 1.0366 | 45.2381 | 0.0 | 45.2381 | 45.2381 | 2.3095 |
0.6058 | 4.0 | 168 | 0.9865 | 48.8095 | 0.0 | 48.8095 | 48.8095 | 2.4524 |
0.5114 | 5.0 | 210 | 0.9289 | 55.9524 | 0.0 | 55.9524 | 55.9524 | 2.4048 |
0.6026 | 6.0 | 252 | 0.9373 | 53.5714 | 0.0 | 53.5714 | 53.5714 | 2.3214 |
0.6428 | 7.0 | 294 | 0.8762 | 53.5714 | 0.0 | 53.5714 | 53.5714 | 2.3095 |
0.5375 | 8.0 | 336 | 0.8908 | 54.7619 | 0.0 | 54.7619 | 54.7619 | 2.3333 |
0.4296 | 9.0 | 378 | 0.9172 | 50.0 | 0.0 | 50.0 | 50.0 | 2.3452 |
0.4644 | 10.0 | 420 | 0.8882 | 60.7143 | 0.0 | 60.7143 | 60.7143 | 2.3452 |
0.42 | 11.0 | 462 | 0.8917 | 54.7619 | 0.0 | 54.7619 | 54.7619 | 2.2619 |
0.3727 | 12.0 | 504 | 0.8710 | 55.9524 | 0.0 | 55.9524 | 55.9524 | 2.3571 |
0.4061 | 13.0 | 546 | 0.8817 | 54.7619 | 0.0 | 54.7619 | 54.7619 | 2.2857 |
0.3221 | 14.0 | 588 | 0.9284 | 57.1429 | 0.0 | 57.1429 | 57.1429 | 2.2857 |
0.3676 | 15.0 | 630 | 0.9313 | 57.1429 | 0.0 | 57.1429 | 57.1429 | 2.0476 |
0.264 | 16.0 | 672 | 0.9315 | 59.5238 | 0.0 | 59.5238 | 59.5238 | 2.0595 |
0.2933 | 17.0 | 714 | 0.9265 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.1310 |
0.2446 | 18.0 | 756 | 0.9254 | 61.9048 | 0.0 | 61.9048 | 61.9048 | 2.0714 |
0.2356 | 19.0 | 798 | 0.9390 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.0714 |
0.3102 | 20.0 | 840 | 0.9837 | 61.9048 | 0.0 | 61.9048 | 61.9048 | 2.1071 |
0.1539 | 21.0 | 882 | 0.9727 | 60.7143 | 0.0 | 60.7143 | 60.7143 | 2.0952 |
0.1674 | 22.0 | 924 | 1.0114 | 61.9048 | 0.0 | 61.9048 | 61.9048 | 2.0952 |
0.1831 | 23.0 | 966 | 0.9869 | 61.9048 | 0.0 | 61.9048 | 61.9048 | 2.0595 |
0.201 | 24.0 | 1008 | 0.9904 | 60.7143 | 0.0 | 60.7143 | 60.7143 | 2.0595 |
0.1602 | 25.0 | 1050 | 0.9883 | 60.7143 | 0.0 | 60.7143 | 60.7143 | 2.0595 |
0.158 | 26.0 | 1092 | 1.0057 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.1071 |
0.1468 | 27.0 | 1134 | 0.9998 | 67.8571 | 0.0 | 67.8571 | 67.8571 | 2.1429 |
0.109 | 28.0 | 1176 | 1.0052 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.3333 |
0.1397 | 29.0 | 1218 | 1.0137 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.3333 |
0.1204 | 30.0 | 1260 | 1.0482 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.3452 |
0.1577 | 31.0 | 1302 | 1.0787 | 66.6667 | 0.0 | 66.6667 | 66.6667 | 2.3452 |
0.1112 | 32.0 | 1344 | 1.0513 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.3452 |
0.0932 | 33.0 | 1386 | 1.0786 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.3452 |
0.0989 | 34.0 | 1428 | 1.1378 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.3452 |
0.0858 | 35.0 | 1470 | 1.1055 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.3452 |
0.1056 | 36.0 | 1512 | 1.1297 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.3571 |
0.14 | 37.0 | 1554 | 1.1604 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.3452 |
0.0592 | 38.0 | 1596 | 1.1213 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.3452 |
0.1121 | 39.0 | 1638 | 1.1489 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.3452 |
0.1917 | 40.0 | 1680 | 1.1544 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.3452 |
0.1178 | 41.0 | 1722 | 1.1561 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.3452 |
0.0761 | 42.0 | 1764 | 1.2013 | 63.0952 | 0.0 | 63.0952 | 63.0952 | 2.1905 |
0.0911 | 43.0 | 1806 | 1.2075 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.1548 |
0.1081 | 44.0 | 1848 | 1.2134 | 66.6667 | 0.0 | 66.6667 | 66.6667 | 2.1548 |
0.089 | 45.0 | 1890 | 1.1861 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.1905 |
0.0828 | 46.0 | 1932 | 1.1988 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.1905 |
0.0818 | 47.0 | 1974 | 1.1886 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.1905 |
0.0899 | 48.0 | 2016 | 1.1988 | 64.2857 | 0.0 | 64.2857 | 64.2857 | 2.1905 |
0.0923 | 49.0 | 2058 | 1.1968 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.1905 |
0.0859 | 50.0 | 2100 | 1.1885 | 65.4762 | 0.0 | 65.4762 | 65.4762 | 2.1905 |
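One possible reading of the table is that Rouge2 stays at 0.0 because the answers are very short. The sketch below illustrates this with a simplified n-gram F1, written from scratch rather than taken from the `rouge` metric used in training. It assumes single-word labels such as "Yes" and "No", which the generation length of about 2 tokens suggests but this card does not state. Under that assumption, a one-word answer has no bigrams, so the bigram score is always zero, and the unigram score reduces to exact-match accuracy. The predictions and references are made-up illustration data, not model outputs.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_f1(prediction: str, reference: str, n: int) -> float:
    """Simplified ROUGE-N style F1 on whitespace tokens (illustrative only)."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Made-up single-word answers, for illustration only.
predictions = ["Yes", "No", "Yes", "No"]
references = ["Yes", "Yes", "Yes", "No"]

pairs = list(zip(predictions, references))
unigram_score = 100 * sum(ngram_f1(p, r, 1) for p, r in pairs) / len(pairs)
bigram_score = 100 * sum(ngram_f1(p, r, 2) for p, r in pairs) / len(pairs)
exact_match = 100 * sum(p == r for p, r in pairs) / len(pairs)

print(unigram_score, bigram_score, exact_match)
```

If the labels are indeed single words, the Rouge1 column can be read roughly as accuracy on the evaluation set.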
Framework versions
- Transformers 4.26.1
- Pytorch 1.13.1+cu117
- Datasets 2.10.1
- Tokenizers 0.13.2