---
license: agpl-3.0
language:
- it
task_categories:
- token-classification
datasets:
- mrovera/eventnet-ita
tags:
- Frame Parsing
- Event Extraction
---
# EventNet-ITA

This model is a full-text frame parser for events in Italian, trained on the [EventNet-ITA](https://huggingface.co/datasets/mrovera/eventnet-ita) dataset.
It can be used for _full-text_ Frame Parsing and Event Extraction.
Please refer to the [paper](https://aclanthology.org/2024.latechclfl-1.9) for a more detailed description.


## Model Details

### Model Description

In its current version, EventNet-ITA is able to recognize and classify 205 semantic frames and their (specific) frame elements. The unit of analysis is the sentence.


### Direct Use

Provided with an input sequence of tokens, the model labels each token with the corresponding frame and/or frame element label(s). 
```
La				B-ENTITY*BEING_LOCATED|B-THEME*CONQUERING
cittadina		I-ENTITY*BEING_LOCATED|I-THEME*CONQUERING
,				O
posta			B-BEING_LOCATED
a				B-RELATIVE_LOCATION*BEING_LOCATED
est				I-RELATIVE_LOCATION*BEING_LOCATED
del				I-RELATIVE_LOCATION*BEING_LOCATED
corso			I-RELATIVE_LOCATION*BEING_LOCATED
d'				I-RELATIVE_LOCATION*BEING_LOCATED
acqua			I-RELATIVE_LOCATION*BEING_LOCATED
,				O
venne			O
conquistata		B-CONQUERING
,				O
ma				O
il				B-EXPLOSIVE*DETONATE_EXPLOSIVE
ponte			I-EXPLOSIVE*DETONATE_EXPLOSIVE
sul				I-EXPLOSIVE*DETONATE_EXPLOSIVE
fiume			I-EXPLOSIVE*DETONATE_EXPLOSIVE
era				O
già				O
stato			O
fatto			B-DETONATE_EXPLOSIVE
saltare			I-DETONATE_EXPLOSIVE
regolarmente	O
dai				B-AGENT*DETONATE_EXPLOSIVE
genieri			I-AGENT*DETONATE_EXPLOSIVE
francesi		I-AGENT*DETONATE_EXPLOSIVE
.				O
```
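
The label format is not spelled out above, but judging from the example it appears to follow a BIO scheme in which `|` separates multiple labels on one token and `*` joins a frame element to its frame (a label without `*` marks the frame-evoking token itself). Under that assumption, a minimal sketch for decoding a predicted label could look like this; `Tag` and `parse_label` are hypothetical helper names, not part of the model or of MaChAmp.

```python
from typing import List, NamedTuple, Optional


class Tag(NamedTuple):
    bio: str                 # "B" (begin) or "I" (inside)
    frame: str               # e.g. "BEING_LOCATED"
    element: Optional[str]   # e.g. "ENTITY"; None for a frame-evoking token


def parse_label(label: str) -> List[Tag]:
    """Split one predicted label into its (BIO, frame, element) parts.

    Assumed format, inferred from the example above:
    "O" means no label, "|" separates multiple labels,
    and "ELEMENT*FRAME" attaches a frame element to a frame.
    """
    if label == "O":
        return []
    tags = []
    for part in label.split("|"):
        bio, _, rest = part.partition("-")
        if "*" in rest:
            element, frame = rest.split("*", 1)
        else:
            element, frame = None, rest
        tags.append(Tag(bio, frame, element))
    return tags


if __name__ == "__main__":
    print(parse_label("B-ENTITY*BEING_LOCATED|B-THEME*CONQUERING"))
    print(parse_label("B-CONQUERING"))
```

In the example sentence, "La cittadina" would decode to two tags: the `ENTITY` of `BEING_LOCATED` and the `THEME` of `CONQUERING`.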


## Training Details

The model has been trained using [MaChAmp](https://github.com/machamp-nlp/machamp), a Python toolkit supporting a variety of NLP tasks, by fine-tuning [this Italian BERT pretrained model](https://huggingface.co/dbmdz/bert-base-italian-xxl-cased).
Training hyperparameters:
- Batch size: 64
- Learning rate: 1.5e-3

All other hyperparameters have been left unchanged w.r.t. the default MaChAmp configuration for the multi-sequential token classification task.



### Training Data

Please refer to the [dataset repo](https://huggingface.co/datasets/mrovera/eventnet-ita).


### Model Re-training

In order to re-train the model, download the [dataset](https://huggingface.co/datasets/mrovera/eventnet-ita) and follow the instructions for training a [multiseq task](https://github.com/machamp-nlp/machamp/blob/master/docs/multiseq.md) in MaChAmp.
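
As a rough starting point, MaChAmp reads a JSON dataset configuration that names the data files, the word column and the tasks. The sketch below builds such a configuration for a single `multiseq` task. The file paths, the dataset key `EVENTNET_ITA`, the task name `frames` and the column indices are all assumptions chosen for illustration; check them against the actual dataset files and the MaChAmp documentation before training.

```python
import json


def build_dataset_config(train_path: str, dev_path: str) -> dict:
    """Return a MaChAmp-style dataset config for one multiseq task.

    Assumptions (not confirmed by this model card): tokens are in
    column 0 and the frame labels are in column 1 of the data files.
    """
    return {
        "EVENTNET_ITA": {                      # hypothetical dataset key
            "train_data_path": train_path,
            "dev_data_path": dev_path,
            "word_idx": 0,                     # assumed token column
            "tasks": {
                "frames": {                    # hypothetical task name
                    "task_type": "multiseq",
                    "column_idx": 1,           # assumed label column
                }
            },
        }
    }


if __name__ == "__main__":
    # Hypothetical paths; replace with the location of the downloaded data.
    config = build_dataset_config("data/train.tsv", "data/dev.tsv")
    print(json.dumps(config, indent=4))
```

The printed JSON can be saved to a file and passed to MaChAmp's training script as described in its documentation.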


### Inference

EventNet-ITA's model can be used for Frame Parsing on new texts by following these steps:
1. Clone the GitHub repo: `git clone https://github.com/machamp-nlp/machamp.git`
2. Download EventNet-ITA's model from this repo (450 MB) and move it into the `machamp` folder (the exact location is up to you; by default, MaChAmp saves trained models in the `logs` folder).
3. Save the data you want to use for prediction in a two-column TSV file, one token per line, with the token in the first column and a placeholder (`_`) in the second. Separate sentences with a blank line (without placeholder), like this:
```
This	_
is	_
the	_
first	_
sentence	_
.	_

This	_
is	_
the	_
second	_
one	_
.	_
```
4. Follow the instructions for predicting with [MaChAmp](https://github.com/machamp-nlp/machamp) (see section "Prediction") using a fine-tuned model.
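
The input file from step 3 can be produced with a few lines of Python. This is a minimal sketch that assumes the text is already split into sentences and tokens (note that the example output above treats `d'` and `acqua` as separate tokens, so the tokenization should match what the model saw in training). The function name `to_machamp_input` and the output file name are hypothetical.

```python
from typing import Iterable, List


def to_machamp_input(sentences: Iterable[List[str]], placeholder: str = "_") -> str:
    """Render pre-tokenized sentences as a two-column TSV string.

    Each line holds a token, a tab and a placeholder; sentences are
    separated by one blank line. Empty sentences are skipped.
    """
    blocks = []
    for tokens in sentences:
        if tokens:
            blocks.append("\n".join(f"{tok}\t{placeholder}" for tok in tokens))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sentences = [
        ["This", "is", "the", "first", "sentence", "."],
        ["This", "is", "the", "second", "one", "."],
    ]
    # Hypothetical output path; pass this file to MaChAmp for prediction.
    with open("input.tsv", "w", encoding="utf-8") as handle:
        handle.write(to_machamp_input(sentences))
```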

## Evaluation

The model has been evaluated on three folds, each with a stratified 80/10/10 train/dev/test split of the dataset. Please see the paper for further details. The tables below report the summary values obtained by averaging the Precision, Recall and F1-score values over the three splits.

**Token-based** (**_relaxed_**) performance:
|                            |    P   |    R    |   F1    |
|----------------------------|--------|---------|---------|
|Frames                      |  0.904 |  0.914  |  **0.907**  |
|Frames (weighted)           |  0.909 |  0.919  |  0.913  |
|Frame Elements              |  0.841 |  0.724  |  **0.761**  |
|Frame Elements (weighted)   |  0.850 |  0.779  |  0.804  |


**Span-based** (**_strict_**) performance:
|                            |    P   |    R    |   F1   |
|----------------------------|--------|---------|--------|
|Frames                      |  0.906 |  0.899  |  **0.901** |
|Frames (weighted)           |  0.909 |  0.903  |  0.905 |
|Frame Elements              |  0.829 |  0.666  |  **0.724** |
|Frame Elements (weighted)   |  0.853 |  0.711  |  0.768 |
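
To illustrate why the two settings can diverge, the following sketch contrasts a token-based score with a span-based one on a single BIO sequence. It is only an illustration under stated assumptions: "relaxed" is taken to mean per-token label matches ignoring the `B-`/`I-` prefix, "strict" is taken to mean exact span boundaries plus label, and each token carries a single label. The evaluation procedure used in the paper may differ in its details, and the toy tags below are not model output.

```python
from typing import List, Set, Tuple


def bio_spans(tags: List[str]) -> Set[Tuple[int, int, str]]:
    """Collect (start, end_exclusive, label) spans from single-label BIO tags."""
    spans = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):      # sentinel closes the last span
        continues = tag.startswith("I-") and tag[2:] == label
        if not continues:
            if label is not None:
                spans.add((start, i, label))
            if tag == "O":
                start, label = None, None
            else:
                start, label = i, tag[2:]       # B- (or a stray I-) opens a span
    return spans


def token_items(tags: List[str]) -> Set[Tuple[int, str]]:
    """Collect (position, label) pairs, ignoring the B-/I- prefix."""
    return {(i, t[2:]) for i, t in enumerate(tags) if t != "O"}


def prf(gold: set, pred: set) -> Tuple[float, float, float]:
    """Precision, recall and F1 over two sets of items."""
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


if __name__ == "__main__":
    gold = ["B-AGENT", "I-AGENT", "I-AGENT", "O"]
    pred = ["B-AGENT", "I-AGENT", "O", "O"]     # span cut one token short
    print("token-based:", prf(token_items(gold), token_items(pred)))
    print("span-based: ", prf(bio_spans(gold), bio_spans(pred)))
```

In this toy case the prediction gets partial credit at the token level but none at the span level, which is consistent with the span-based Frame Element scores being lower than the token-based ones.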



### Citation Information

If you use EventNet-ITA, please cite the following paper:

```
@inproceedings{rovera-2024-eventnet,
    title = "{E}vent{N}et-{ITA}: {I}talian Frame Parsing for Events",
    author = "Rovera, Marco",
    editor = "Bizzoni, Yuri  and
      Degaetano-Ortlieb, Stefania  and
      Kazantseva, Anna  and
      Szpakowicz, Stan",
    booktitle = "Proceedings of the 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2024)",
    year = "2024",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.latechclfl-1.9",
    pages = "77--90",
}
```