---
license: apache-2.0
tags:
- generated_from_trainer
- LLM
- FLAN
- NLP
metrics:
- rouge
model-index:
- name: output
  results: []
language:
- en
library_name: transformers
---

# output

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the [LegalBench](https://github.com/HazyResearch/legalbench) dataset.
It achieves the following results on the evaluation set:

- Loss: 1.1885
- Rouge1: 65.4762
- Rouge2: 0.0
- Rougel: 65.4762
- Rougelsum: 65.4762
- Gen Len: 2.1905

## Model description

We fine-tune the [Flan-T5-Base](https://huggingface.co/google/flan-t5-base) LLM on [LegalBench](https://github.com/HazyResearch/legalbench).

### Prompt

The prompt should be formatted as follows: {{Question}} {{Clause}}

Question: Does the clause grant one party an “enterprise,” “all you can eat” or unlimited usage license?

Clause: Except as the parties may otherwise agree in writing, Converge, to the extent it has the legal right to do so, hereby grants to Vert an irrevocable, perpetual, world-wide, non-exclusive right and license to use, load, store, transmit, execute, copy, market, distribute, in any medium or distribution technology whatsoever, known or unknown, display, perform and sublicense the Converge-Independent Materials and the Third-Party Materials, in both Source Code and Object Code formats, and to make unlimited instantiations thereof, for any and all purposes.

Prompt: Does the clause grant one party an “enterprise,” “all you can eat” or unlimited usage license? Except as the parties may otherwise agree in writing, Converge, to the extent it has the legal right to do so, hereby grants to Vert an irrevocable, perpetual, world-wide, non-exclusive right and license to use, load, store, transmit, execute, copy, market, distribute, in any medium or distribution technology whatsoever, known or unknown, display, perform and sublicense the Converge-Independent Materials and the Third-Party Materials, in both Source Code and Object Code formats, and to make unlimited instantiations thereof, for any and all purposes.

## Intended uses & limitations

More information needed

## Training and evaluation data

We used [LegalBench](https://github.com/HazyResearch/legalbench) for training and evaluation.

## Training procedure

Tutorial: Finetune [Flan-T5](https://docs.blueprint.baseten.co/finetuning/flan-t5/) with Baseten.

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 20
- num_epochs: 50

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
| 1.2679        | 1.0   | 42   | 1.3033          | 48.8095 | 0.0    | 48.8095 | 48.8095   | 4.0119  |
| 1.0917        | 2.0   | 84   | 1.1075          | 48.8095 | 0.0    | 48.8095 | 48.8095   | 2.2738  |
| 0.8305        | 3.0   | 126  | 1.0366          | 45.2381 | 0.0    | 45.2381 | 45.2381   | 2.3095  |
| 0.6058        | 4.0   | 168  | 0.9865          | 48.8095 | 0.0    | 48.8095 | 48.8095   | 2.4524  |
| 0.5114        | 5.0   | 210  | 0.9289          | 55.9524 | 0.0    | 55.9524 | 55.9524   | 2.4048  |
| 0.6026        | 6.0   | 252  | 0.9373          | 53.5714 | 0.0    | 53.5714 | 53.5714   | 2.3214  |
| 0.6428        | 7.0   | 294  | 0.8762          | 53.5714 | 0.0    | 53.5714 | 53.5714   | 2.3095  |
| 0.5375        | 8.0   | 336  | 0.8908          | 54.7619 | 0.0    | 54.7619 | 54.7619   | 2.3333  |
| 0.4296        | 9.0   | 378  | 0.9172          | 50.0    | 0.0    | 50.0    | 50.0      | 2.3452  |
| 0.4644        | 10.0  | 420  | 0.8882          | 60.7143 | 0.0    | 60.7143 | 60.7143   | 2.3452  |
| 0.42          | 11.0  | 462  | 0.8917          | 54.7619 | 0.0    | 54.7619 | 54.7619   | 2.2619  |
| 0.3727        | 12.0  | 504  | 0.8710          | 55.9524 | 0.0    | 55.9524 | 55.9524   | 2.3571  |
| 0.4061        | 13.0  | 546  | 0.8817          | 54.7619 | 0.0    | 54.7619 | 54.7619   | 2.2857  |
| 0.3221        | 14.0  | 588  | 0.9284          | 57.1429 | 0.0    | 57.1429 | 57.1429   | 2.2857  |
| 0.3676        | 15.0  | 630  | 0.9313          | 57.1429 | 0.0    | 57.1429 | 57.1429   | 2.0476  |
| 0.264         | 16.0  | 672  | 0.9315          | 59.5238 | 0.0    | 59.5238 | 59.5238   | 2.0595  |
| 0.2933        | 17.0  | 714  | 0.9265          | 64.2857 | 0.0    | 64.2857 | 64.2857   | 2.1310  |
| 0.2446        | 18.0  | 756  | 0.9254          | 61.9048 | 0.0    | 61.9048 | 61.9048   | 2.0714  |
| 0.2356        | 19.0  | 798  | 0.9390          | 63.0952 | 0.0    | 63.0952 | 63.0952   | 2.0714  |
| 0.3102        | 20.0  | 840  | 0.9837          | 61.9048 | 0.0    | 61.9048 | 61.9048   | 2.1071  |
| 0.1539        | 21.0  | 882  | 0.9727          | 60.7143 | 0.0    | 60.7143 | 60.7143   | 2.0952  |
| 0.1674        | 22.0  | 924  | 1.0114          | 61.9048 | 0.0    | 61.9048 | 61.9048   | 2.0952  |
| 0.1831        | 23.0  | 966  | 0.9869          | 61.9048 | 0.0    | 61.9048 | 61.9048   | 2.0595  |
| 0.201         | 24.0  | 1008 | 0.9904          | 60.7143 | 0.0    | 60.7143 | 60.7143   | 2.0595  |
| 0.1602        | 25.0  | 1050 | 0.9883          | 60.7143 | 0.0    | 60.7143 | 60.7143   | 2.0595  |
| 0.158         | 26.0  | 1092 | 1.0057          | 63.0952 | 0.0    | 63.0952 | 63.0952   | 2.1071  |
| 0.1468        | 27.0  | 1134 | 0.9998          | 67.8571 | 0.0    | 67.8571 | 67.8571   | 2.1429  |
| 0.109         | 28.0  | 1176 | 1.0052          | 63.0952 | 0.0    | 63.0952 | 63.0952   | 2.3333  |
| 0.1397        | 29.0  | 1218 | 1.0137          | 65.4762 | 0.0    | 65.4762 | 65.4762   | 2.3333  |
| 0.1204        | 30.0  | 1260 | 1.0482          | 63.0952 | 0.0    | 63.0952 | 63.0952   | 2.3452  |
| 0.1577        | 31.0  | 1302 | 1.0787          | 66.6667 | 0.0    | 66.6667 | 66.6667   | 2.3452  |
| 0.1112        | 32.0  | 1344 | 1.0513          | 63.0952 | 0.0    | 63.0952 | 63.0952   | 2.3452  |
| 0.0932        | 33.0  | 1386 | 1.0786          | 63.0952 | 0.0    | 63.0952 | 63.0952   | 2.3452  |
| 0.0989        | 34.0  | 1428 | 1.1378          | 63.0952 | 0.0    | 63.0952 | 63.0952   | 2.3452  |
| 0.0858        | 35.0  | 1470 | 1.1055          | 65.4762 | 0.0    | 65.4762 | 65.4762   | 2.3452  |
| 0.1056        | 36.0  | 1512 | 1.1297          | 64.2857 | 0.0    | 64.2857 | 64.2857   | 2.3571  |
| 0.14          | 37.0  | 1554 | 1.1604          | 64.2857 | 0.0    | 64.2857 | 64.2857   | 2.3452  |
| 0.0592        | 38.0  | 1596 | 1.1213          | 65.4762 | 0.0    | 65.4762 | 65.4762   | 2.3452  |
| 0.1121        | 39.0  | 1638 | 1.1489          | 65.4762 | 0.0    | 65.4762 | 65.4762   | 2.3452  |
| 0.1917        | 40.0  | 1680 | 1.1544          | 64.2857 | 0.0    | 64.2857 | 64.2857   | 2.3452  |
| 0.1178        | 41.0  | 1722 | 1.1561          | 64.2857 | 0.0    | 64.2857 | 64.2857   | 2.3452  |
| 0.0761        | 42.0  | 1764 | 1.2013          | 63.0952 | 0.0    | 63.0952 | 63.0952   | 2.1905  |
| 0.0911        | 43.0  | 1806 | 1.2075          | 64.2857 | 0.0    | 64.2857 | 64.2857   | 2.1548  |
| 0.1081        | 44.0  | 1848 | 1.2134          | 66.6667 | 0.0    | 66.6667 | 66.6667   | 2.1548  |
| 0.089         | 45.0  | 1890 | 1.1861          | 64.2857 | 0.0    | 64.2857 | 64.2857   | 2.1905  |
| 0.0828        | 46.0  | 1932 | 1.1988          | 65.4762 | 0.0    | 65.4762 | 65.4762   | 2.1905  |
| 0.0818        | 47.0  | 1974 | 1.1886          | 64.2857 | 0.0    | 64.2857 | 64.2857   | 2.1905  |
| 0.0899        | 48.0  | 2016 | 1.1988          | 64.2857 | 0.0    | 64.2857 | 64.2857   | 2.1905  |
| 0.0923        | 49.0  | 2058 | 1.1968          | 65.4762 | 0.0    | 65.4762 | 65.4762   | 2.1905  |
| 0.0859        | 50.0  | 2100 | 1.1885          | 65.4762 | 0.0    | 65.4762 | 65.4762   | 2.1905  |

### Framework versions

- Transformers 4.26.1
- Pytorch 1.13.1+cu117
- Datasets 2.10.1
- Tokenizers 0.13.2
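
## Usage sketch

The prompt format described under "Prompt" (question first, then clause) can be assembled and passed to the model roughly as follows. This is a minimal sketch, not the authors' inference code. Several things here are assumptions: `MODEL_ID` is a hypothetical placeholder, since this card does not state the published repository id; the helper names `build_prompt` and `answer` are illustrative; the single-space separator is a guess, because the card does not specify how the question and clause are joined; and `max_new_tokens=8` is chosen only because the reported Gen Len is about 2 tokens.

```python
# Hypothetical placeholder: replace with the actual checkpoint path or repo id.
MODEL_ID = "path/to/this-checkpoint"


def build_prompt(question: str, clause: str, sep: str = " ") -> str:
    """Join the question and the clause, question first.

    The separator is an assumption; the card does not specify it.
    """
    return f"{question.strip()}{sep}{clause.strip()}"


def answer(question: str, clause: str, model_id: str = MODEL_ID,
           max_new_tokens: int = 8) -> str:
    """Generate a short answer for one question/clause pair."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Long clauses are truncated to the tokenizer's maximum input length.
    inputs = tokenizer(build_prompt(question, clause),
                       return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `answer(question, clause)` with the example question and clause from the "Prompt" section would load the checkpoint and return the decoded generation. Loading the model on every call keeps the sketch short; for repeated queries, load the tokenizer and model once and reuse them.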