metadata

language: fr
license: cc-by-4.0

Cour de Cassation titrage prediction model (transformer-base)

This model automatically predicts titrages (keyword sequences) from sommaires (summaries of legal cases). The models are described in the research paper cited below; if you use this model, please cite that paper.

Model description

The model is a transformer-base model trained on parallel data (sommaires-titrages) provided by the Cour de Cassation. It was initially trained using the Fairseq toolkit, converted to HuggingFace, and then fine-tuned on the original training data to smooth out minor differences that arose during the conversion process. Tokenisation is performed using a SentencePiece model with the BPE strategy and a vocabulary size of 8000.
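The tokenisation setup described above can be sketched with the `sentencepiece` Python package. This is a minimal illustration under stated assumptions, not the authors' actual training script: the corpus path, the output prefix and the helper names `build_spm_args` and `train_tokeniser` are hypothetical, and whether a single tokeniser is shared between sommaires and titrages is not stated in this card. Only the BPE model type and the vocabulary size of 8000 come from the description.

```python
from pathlib import Path


def build_spm_args(corpus_path: str, model_prefix: str) -> dict:
    """Collect SentencePiece trainer options matching the description above."""
    return {
        "input": corpus_path,          # hypothetical path: plain text, one segment per line
        "model_prefix": model_prefix,  # hypothetical output name (<prefix>.model, <prefix>.vocab)
        "model_type": "bpe",           # BPE strategy, as stated in the model description
        "vocab_size": 8000,            # vocabulary size stated in the model description
    }


def train_tokeniser(corpus_path: str, model_prefix: str = "titrage_bpe") -> Path:
    """Train a SentencePiece BPE model and return the path to the .model file."""
    # Imported lazily so the options above can be inspected without the package;
    # training itself requires `pip install sentencepiece`.
    import sentencepiece as spm

    spm.SentencePieceTrainer.train(**build_spm_args(corpus_path, model_prefix))
    return Path(f"{model_prefix}.model")
```

Calling `train_tokeniser("corpus.txt")` on a sufficiently large text file would write `titrage_bpe.model` and `titrage_bpe.vocab` to the working directory; the resulting file can then be loaded with `sentencepiece.SentencePieceProcessor(model_file=...)`.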

Intended uses & limitations

How to use

Limitations and bias

Training data

Training procedure

Preprocessing

Training

Evaluation results

Coming soon

BibTeX entry and citation info

If you use this work, please cite the following article:

Thibault Charmet, Inès Cherichi, Matthieu Allain, Urszula Czerwinska, Amaury Fouret, Benoît Sagot and Rachel Bawden, 2022. Complex Labelling and Similarity Prediction in Legal Texts: Automatic Analysis of France’s Court of Cassation Rulings. In Proceedings of the 13th Language Resources and Evaluation Conference, Marseille, France.

@inproceedings{charmet-et-al-2022-complex,
  title = {Complex Labelling and Similarity Prediction in Legal Texts: Automatic Analysis of France’s Court of Cassation Rulings},
  author = {Charmet, Thibault and Cherichi, Inès and Allain, Matthieu and Czerwinska, Urszula and Fouret, Amaury and Sagot, Benoît and Bawden, Rachel},
  booktitle = {Proceedings of the 13th Language Resources and Evaluation Conference},
  year = {2022},
  address = {Marseille, France}
}