---
library_name: transformers
metrics:
- f1
base_model:
- allenai/specter2_base
model-index:
- name: specter2-review-applicant
  results:
  - task:
      type: text-classification
    dataset:
      name: validation
      type: validation
    metrics:
    - name: macro-average F1-score
      type: macro-average F1-score
      value: 0.91
---

# Model: specter2-review-applicant

The model `snsf-data/specter2-review-applicant` is based on the `allenai/specter2_base` model and **fine-tuned for a binary classification** task. In particular, the model is fine-tuned to classify whether a sentence from an SNSF grant peer review report addresses the following aspect:

***Does the sentence address the applicant(s)/team or their qualifications, without mentioning quantitative indicators?***

The model was fine-tuned on a training set of 2,500 sentences from SNSF grant peer review reports, which were manually annotated by multiple human annotators, with the final label determined by majority rule. The fine-tuning was performed locally without access to the internet to prevent any potential data leakage or network interference. The following setup was used for the fine-tuning (an illustrative code sketch follows the list):

- Loss function: cross-entropy loss
- Optimizer: AdamW
- Weight decay: 0.01
- Learning rate: 2e-5
- Epochs: 3
- Batch size: 10
- GPU: NVIDIA RTX A2000
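Since the annotated data cannot be shared (see below), the snippet below is only a minimal sketch of how this setup maps onto the `transformers` `Trainer` API, which uses cross-entropy loss and the AdamW optimizer by default; the open-sourced fine-tuning code on GitHub (linked below) is the authoritative reference. The two example sentences and their labels are hypothetical placeholders, not actual training data.

```python
# minimal, illustrative fine-tuning sketch - not the authors' exact script
import torch
import transformers

# base model and tokenizer; two labels for the binary classification task
tokenizer = transformers.AutoTokenizer.from_pretrained("allenai/specter2_base")
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "allenai/specter2_base", num_labels=2
)

# hypothetical placeholder data; the real training set holds 2,500 annotated sentences
texts = ["The applicants have an outstanding track record.",
         "The topic of the proposal is timely."]
labels = [1, 0]

# wrap tokenized sentences and labels as a torch dataset for the Trainer
class ReviewSentences(torch.utils.data.Dataset):
    def __init__(self, texts, labels):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

# hyperparameters as reported above; AdamW and cross-entropy loss are the
# Trainer defaults for sequence classification
training_args = transformers.TrainingArguments(
    output_dir="specter2-review-applicant",
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=3,
    per_device_train_batch_size=10,
)

trainer = transformers.Trainer(
    model=model,
    args=training_args,
    train_dataset=ReviewSentences(texts, labels),
)
trainer.train()
```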
The model was then evaluated on a validation set of 500 sentences, which were also manually annotated by multiple human annotators, with the final label determined by majority rule. The model achieved a macro-average **F1 score of 0.91** on the validation set. The share of the positive outcome label amounts to 19.7%.

The fine-tuning code is open-sourced on GitHub: https://github.com/snsf-data/ml-peer-review-analysis . Due to data privacy laws, the data used for the fine-tuning cannot be publicly shared. For a detailed description of data protection, please refer to the data management plan underlying this work: https://doi.org/10.46446/DMP-peer-review-assessment-ML. The annotation codebook is available online: https://doi.org/10.46446/Codebook-peer-review-assessment-ML.

For more details, see the following preprint: **A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports** by [Gabriel Okasa](https://orcid.org/0000-0002-3573-7227), [Alberto de León](https://orcid.org/0009-0002-0401-2618), [Michaela Strinzel](https://orcid.org/0000-0003-3181-0623), [Anne Jorstad](https://orcid.org/0000-0002-6438-1979), [Katrin Milzow](https://orcid.org/0009-0002-8959-2534), [Matthias Egger](https://orcid.org/0000-0001-7462-5132), and [Stefan Müller](https://orcid.org/0000-0002-6315-4125), available on arXiv: https://arxiv.org/abs/2411.16662 .

## How to Get Started with the Model

The model can be used to classify whether sentences from grant peer review reports address the applicant(s)/team or their qualifications, without mentioning quantitative indicators. Use the code below to get started with the model.

```python
# import transformers library
import transformers

# load tokenizer from specter2_base - the base model
tokenizer = transformers.AutoTokenizer.from_pretrained("allenai/specter2_base")

# load the SNSF fine-tuned model for classification of the applicant in review texts
model = transformers.AutoModelForSequenceClassification.from_pretrained("snsf-data/specter2-review-applicant")

# set up the classification pipeline returning scores for both labels
# (in recent transformers versions, top_k=None replaces the deprecated return_all_scores=True)
classification_pipeline = transformers.TextClassificationPipeline(
    model=model,
    tokenizer=tokenizer,
    return_all_scores=True
)

# prediction for an example review sentence addressing the applicant
classification_pipeline("Therefore, they have ability to carry out the proposed project based on their strong expertise in this special research area and their outstanding track record.")

# prediction for an example review sentence not addressing the applicant
classification_pipeline("There are currently several activities on an international level that have identified the issue and activities are underway.")
```

## Model Limitations

- *Human Assessment Required*: This model should not be used for automatic classification of grant peer review reports without human oversight.
- *Limited Training Data*: The model was fine-tuned on a limited sample of 2,500 annotated sentences. Therefore, its classification accuracy should be critically evaluated before deployment.
- *Specific Training Data*: The training data consists of a random sample of SNSF grant peer review reports. As such, the model's external validity to other datasets may be limited.

## Citation

**BibTeX:**

```bibtex
@article{okasa2024supervised,
  title={A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports},
  author={Okasa, Gabriel and de Le{\'o}n, Alberto and Strinzel, Michaela and Jorstad, Anne and Milzow, Katrin and Egger, Matthias and M{\"u}ller, Stefan},
  journal={arXiv preprint arXiv:2411.16662},
  year={2024}
}
```

**APA:**

Okasa, G., de León, A., Strinzel, M., Jorstad, A., Milzow, K., Egger, M., & Müller, S. (2024). A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports. arXiv preprint arXiv:2411.16662.

## Model Card Authors

[Gabriel Okasa](https://orcid.org/0000-0002-3573-7227), [Alberto de León](https://orcid.org/0009-0002-0401-2618), [Michaela Strinzel](https://orcid.org/0000-0003-3181-0623), [Anne Jorstad](https://orcid.org/0000-0002-6438-1979), [Katrin Milzow](https://orcid.org/0009-0002-8959-2534), [Matthias Egger](https://orcid.org/0000-0001-7462-5132), and [Stefan Müller](https://orcid.org/0000-0002-6315-4125)

## Model Card Contact

gabriel.okasa@snf.ch