---
language:
- fi
tags:
- simplification
- mBART
library_name: fairseq
---

This is an mBART model (https://github.com/facebookresearch/fairseq/tree/main/examples/mbart) fine-tuned for Finnish sentence simplification.
The checkpoint is in **fairseq** format. Persistent identifier (PID) on Kielipankki: http://urn.fi/urn:nbn:fi:lb-2024011801.

Paper: [Towards Automatic Finnish Text Simplification](https://aclanthology.org/2024.determit-1.4.pdf) (Dmitrieva & Tiedemann, DeTermIt-WS 2024).

The fine-tuning data is available at http://urn.fi/urn:nbn:fi:lb-2024011703. To replicate the results, use the training, validation, and test sentence-pair ids in the `splits.zip` archive in this repository. Each id has the form `{regular text id}__{simple text id}__{sentence pair number}`.
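
As a rough illustration of the id format, the sketch below splits one id into its three fields. It assumes that the fields are separated by a double underscore and that the text ids themselves never contain `__`; the helper `parse_pair_id`, the `PairId` type, and the sample id are hypothetical and not part of this repository.

```python
from typing import NamedTuple


class PairId(NamedTuple):
    """The three fields of a sentence-pair id (hypothetical helper type)."""
    regular_text_id: str
    simple_text_id: str
    pair_number: str  # kept as a string; its exact format is not assumed


def parse_pair_id(pair_id: str) -> PairId:
    """Split '{regular text id}__{simple text id}__{sentence pair number}'.

    Assumption: '__' occurs only as the field separator.
    """
    parts = pair_id.strip().split("__")
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields separated by '__', got {len(parts)}: {pair_id!r}")
    return PairId(*parts)


# Made-up id, for illustration only.
example = parse_pair_id("reg123__simp456__7")
print(example.regular_text_id, example.simple_text_id, example.pair_number)
```

The parsed fields can then be matched against the fine-tuning data to rebuild the training, validation, and test sets.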