---
library_name: transformers
tags:
- legal
license: apache-2.0
datasets:
- byczong/pl-insurance-terms-struct
language:
- pl
base_model:
- naver-clova-ix/donut-base
pipeline_tag: image-text-to-text
---

# Model Card

Donut fine-tuned for full document structuring (parsing) on the [pl-insurance-terms-struct](https://huggingface.co/datasets/byczong/pl-insurance-terms-struct) dataset.

Trained for 10 epochs with `max_seq_len=7168`.
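
A minimal inference sketch, assuming the checkpoint loads with the standard `DonutProcessor` and `VisionEncoderDecoderModel` classes from `transformers`. The `model_id` and `task_prompt` arguments are placeholders: this card states neither the repository id nor the task start token used during fine-tuning, so both must be supplied by the caller. The default `max_length` mirrors the `max_seq_len=7168` setting above, on the assumption that the decoder was configured for that length.

```python
import re


def clean_sequence(sequence, eos_token, pad_token):
    """Drop EOS/PAD tokens and the first tag (the task start token)."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def parse_document(image_path, model_id, task_prompt, max_length=7168):
    """Run Donut on one page image and return the parsed structure.

    model_id and task_prompt are assumptions supplied by the caller;
    neither is specified in this model card.
    """
    # Heavy dependencies are imported lazily so the helper above
    # stays usable without them.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The decoder is primed with the task start token.
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=max_length,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            return_dict_in_generate=True,
        )

    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = clean_sequence(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    # token2json converts Donut's tag sequence into nested dicts/lists.
    return processor.token2json(sequence)
```

Generating up to 7168 tokens per page is slow on CPU; a smaller `max_length` may be enough for short documents, at the risk of truncated output.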

- Field-level F1 score: 0.57
- TED-based (tree edit distance) accuracy: 0.67
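
The card does not say how the field-level F1 score was computed. The sketch below shows one common definition, under that assumption: each document is flattened into (key path, value) pairs, and F1 is computed over the matched pairs across all documents. The function names and the example field names (`title`, `sections`, `heading`) are hypothetical and are not taken from the dataset schema.

```python
from collections import Counter


def flatten(node, path=()):
    """Yield (key_path, leaf_value) pairs from nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, path + (key,))
    elif isinstance(node, list):
        # List positions are ignored, so item order does not matter.
        for item in node:
            yield from flatten(item, path)
    else:
        yield (path, str(node).strip())


def field_f1(predictions, references):
    """Micro-averaged F1 over flattened fields of paired documents."""
    tp = fp = fn = 0
    for pred, ref in zip(predictions, references):
        pred_fields = Counter(flatten(pred))
        ref_fields = Counter(flatten(ref))
        # Multiset intersection counts each reference field at most once.
        matched = sum((pred_fields & ref_fields).values())
        tp += matched
        fp += sum(pred_fields.values()) - matched
        fn += sum(ref_fields.values()) - matched
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical example: two of three fields match.
reference = {"title": "OWU", "sections": [{"heading": "A"}, {"heading": "B"}]}
predicted = {"title": "OWU", "sections": [{"heading": "A"}, {"heading": "C"}]}
score = field_f1([predicted], [reference])
```

Under this definition a field counts as correct only on an exact string match, so a single wrong character in a long clause scores the same as a missing field. The TED-based accuracy is more forgiving because it measures how far the predicted tree is from the reference tree.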

Note: Neither this model nor its tokenizer was (pre-)trained for Polish.