---
language: fr
license: cc-by-4.0
---
# Cour de Cassation *titrage* prediction model (transformer-base)
Model for the automatic prediction of *titrages* (keyword sequences) from *sommaires* (summaries of legal cases). The models are described in [this paper](https://hal.inria.fr/hal-03663110/file/LREC_2022___CCass_Inria-camera-ready.pdf). If you use this model, please cite our research paper (see [below](#cite)).
## Model description
The model is a transformer-base model trained on parallel data (sommaires-titrages) provided by the Cour de Cassation. The model was initially trained using the Fairseq toolkit, converted to HuggingFace and then fine-tuned on the original training data to smooth out minor differences that arose during the conversion process. Tokenisation uses a SentencePiece model trained with the BPE strategy and a vocabulary size of 8,000.
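To give an intuition for the BPE strategy mentioned above, here is a toy, pure-Python sketch of how merge rules are learned and applied. It is not the SentencePiece implementation used for this model (which works on raw text and targets a vocabulary of 8,000 pieces); the function names `learn_bpe_merges`, `merge_pair` and `segment` are hypothetical and exist only for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` by one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Each distinct word is a tuple of symbols, weighted by its frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair across the corpus.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Apply the new merge to every word before the next round.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(symbols, best))] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Segment `word` by applying the learned merges in the order they were learned."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


if __name__ == "__main__":
    # Tiny illustrative corpus; the real model was trained on sommaires-titrages data.
    merges = learn_bpe_merges(["cassation", "cassation", "cession"], num_merges=5)
    print(merges)
    print(segment("cassation", merges))
```

A larger number of merges yields longer subword units and a larger vocabulary; the vocabulary size of 8,000 is the corresponding setting for the actual tokeniser.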
### Intended uses & limitations
### How to use
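This card does not yet give loading instructions, so the following is only a minimal sketch. It assumes the converted checkpoint can be loaded through the generic `AutoTokenizer` and `AutoModelForSeq2SeqLM` classes of the `transformers` library; `model_id` is a placeholder you must supply (a Hub name or local path), and the helper names `load_model` and `predict_titrage` as well as the generation settings are hypothetical.

```python
def load_model(model_id):
    """Load tokenizer and model. `model_id` is a placeholder supplied by the caller."""
    # Imported lazily so that predict_titrage can be reused without this dependency.
    # Assumption: the checkpoint is compatible with the Auto* seq2seq classes.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def predict_titrage(sommaire, tokenizer, model, max_new_tokens=64, num_beams=4):
    """Generate a titrage (keyword sequence) for one sommaire.

    The values of max_new_tokens and num_beams are illustrative defaults,
    not settings documented for this model.
    """
    inputs = tokenizer(sommaire, return_tensors="pt", truncation=True)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

With a valid identifier, usage would look like `tokenizer, model = load_model(model_id)` followed by `predict_titrage(text, tokenizer, model)`, where `text` is a sommaire in French.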
### Limitations and bias
## Training data
## Training procedure
### Preprocessing
### Training
### Evaluation results
Coming soon
## BibTeX entry and citation info
<a name="cite"></a>
If you use this work, please cite the following article:
Thibault Charmet, Inès Cherichi, Matthieu Allain, Urszula Czerwinska, Amaury Fouret, Benoît Sagot and Rachel Bawden, 2022. **Complex Labelling and Similarity Prediction in Legal Texts: Automatic Analysis of France’s Court of Cassation Rulings**. In Proceedings of the 13th Language Resources and Evaluation Conference, Marseille, France.
```
@inproceedings{charmet-et-al-2022-complex,
title = {Complex Labelling and Similarity Prediction in Legal Texts: Automatic Analysis of France’s Court of Cassation Rulings},
author = {Charmet, Thibault and Cherichi, Inès and Allain, Matthieu and Czerwinska, Urszula and Fouret, Amaury and Sagot, Benoît and Bawden, Rachel},
booktitle = {Proceedings of the 13th Language Resources and Evaluation Conference},
year = {2022},
address = {Marseille, France}
}
```