---
language: fr
license: cc-by-4.0
---

# Cour de Cassation *titrage* prediction model (transformer-base)

Model for the automatic prediction of *titrages* (keyword sequences) from *sommaires* (syntheses of legal cases). The models are described in [this paper](https://hal.inria.fr/hal-03663110/file/LREC_2022___CCass_Inria-camera-ready.pdf). If you use this model, please cite our research paper (see [below](#cite)).

## Model description

The model is a transformer-base model trained on parallel data (sommaires-titrages) provided by the Cour de Cassation. The model was initially trained using the Fairseq toolkit, converted to HuggingFace and then fine-tuned on the original training data to smooth out minor differences that arose during the conversion process. Tokenisation is performed using a SentencePiece model with the BPE strategy and a vocabulary size of 8000.

### Intended uses & limitations

### How to use

### Limitations and bias

## Training data

## Training procedure

### Preprocessing

### Training

### Evaluation results

Coming soon

## BibTex entry and citation info

If you use this work, please cite the following article:

Thibault Charmet, Inès Cherichi, Matthieu Allain, Urszula Czerwinska, Amaury Fouret, Benoît Sagot and Rachel Bawden, 2022. **Complex Labelling and Similarity Prediction in Legal Texts: Automatic Analysis of France’s Court of Cassation Rulings**. In Proceedings of the 13th Language Resources and Evaluation Conference, Marseille, France.

```
@inproceedings{charmet-et-al-2022-complex,
  title = {Complex Labelling and Similarity Prediction in Legal Texts: Automatic Analysis of France’s Court of Cassation Rulings},
  author = {Charmet, Thibault and Cherichi, Inès and Allain, Matthieu and Czerwinska, Urszula and Fouret, Amaury and Sagot, Benoît and Bawden, Rachel},
  booktitle = {Proceedings of the 13th Language Resources and Evaluation Conference},
  year = {2022},
  address = {Marseille, France}
}
```
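
## Tokenisation sketch

The model description mentions a BPE strategy for tokenisation. The snippet below is a minimal, pure-Python sketch of how BPE merges can be learned and applied, given only as an illustration of the general idea. It is not the SentencePiece implementation and not the tokeniser shipped with this model: the function names (`merge_pair`, `learn_bpe`, `apply_bpe`), the toy word list and the number of merges are all assumptions made for the example, and it works on isolated words rather than on raw text with a vocabulary of 8000.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` by one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_bpe(words, num_merges):
    """Learn at most `num_merges` merges from a list of words (toy version)."""
    # Each distinct word starts as a tuple of characters, weighted by its count.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs over the whole (weighted) word list.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for left, right in zip(symbols, symbols[1:]):
                pairs[(left, right)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties are broken lexicographically.
        best = max(pairs, key=lambda p: (pairs[p], p))
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(list(symbols), best))] += freq
        vocab = new_vocab
        merges.append(best)
    return merges


def apply_bpe(word, merges):
    """Segment `word` by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


if __name__ == "__main__":
    # Hypothetical toy corpus, not taken from the training data.
    toy_words = ["cassation", "cassation", "cession", "prescription"]
    toy_merges = learn_bpe(toy_words, num_merges=10)
    print(toy_merges)
    print(apply_bpe("cassation", toy_merges))
```

Frequent character sequences end up as single symbols, while rarer words stay split into smaller pieces, which is what keeps the vocabulary size bounded.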