|
--- |
|
library_name: transformers |
|
metrics: |
|
- f1 |
|
base_model: |
|
- allenai/specter2_base |
|
model-index:

- name: specter2-review-applicant-quantity

  results:

  - task:

      type: text-classification

    dataset:

      name: validation

      type: validation

    metrics:

    - name: macro-average F1-score

      type: f1

      value: 0.93
|
--- |
|
|
|
# Model: specter2-review-applicant-quantity |
|
|
|
The model `snsf-data/specter2-review-applicant-quantity` is based on the `allenai/specter2_base` model and **fine-tuned for a binary classification task**.

In particular, the model is fine-tuned to classify whether a sentence from an SNSF grant peer review report addresses the following aspect:
|
|
|
***Does the sentence use quantitative indicators to describe the applicant(s) or team?*** |
|
|
|
The model was fine-tuned on a training set of 2,500 sentences from SNSF grant peer review reports, which were manually annotated by multiple human annotators, with the final label assigned by majority rule.
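As an illustration, the majority rule amounts to the following small sketch (the annotator labels below are hypothetical):

```python
from collections import Counter

# hypothetical labels from three annotators for one sentence
# (1 = uses quantitative indicators about the applicant(s), 0 = does not)
annotations = [1, 1, 0]

# majority rule: the label chosen by most annotators becomes the gold label
gold_label = Counter(annotations).most_common(1)[0][0]
print(gold_label)  # prints 1
```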
|
The fine-tuning was performed locally without access to the internet to prevent any potential data leakage or network interference. |
|
The following setup was used for the fine-tuning (an illustrative training sketch follows the list):
|
|
|
- Loss function: cross-entropy loss |
|
- Optimizer: AdamW |
|
- Weight decay: 0.01 |
|
- Learning rate: 2e-5 |
|
- Epochs: 3 |
|
- Batch size: 10 |
|
- GPU: NVIDIA RTX A2000 |
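The complete fine-tuning code is available in the GitHub repository linked below. As a minimal illustrative sketch of this configuration with the `transformers` `Trainer`: the toy dataset here is hypothetical, since the annotated SNSF sentences cannot be shared, and cross-entropy loss with the AdamW optimizer are the `Trainer` defaults for single-label sequence classification, matching the setup above.

```python
import transformers
from datasets import Dataset

# hypothetical toy examples standing in for the (non-shareable) annotated sentences
train_data = Dataset.from_dict({
    "text": ["She has published 25 papers in the last 5 years.",
             "The proposed topic is timely and relevant."],
    "label": [1, 0],
})

# tokenize with the base model's tokenizer
tokenizer = transformers.AutoTokenizer.from_pretrained("allenai/specter2_base")
train_data = train_data.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

# binary classification head on top of specter2_base
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "allenai/specter2_base", num_labels=2
)

# hyperparameters as listed above; cross-entropy loss and AdamW are the Trainer defaults
training_args = transformers.TrainingArguments(
    output_dir="specter2-review-applicant-quantity",
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=3,
    per_device_train_batch_size=10,
)

trainer = transformers.Trainer(model=model, args=training_args, train_dataset=train_data)
trainer.train()
```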
|
|
|
The model was then evaluated on a validation set of 500 sentences, also manually annotated by multiple human annotators with labels assigned by majority rule.

The model achieved a macro-average **F1 score of 0.93** on the validation set. The share of the positive outcome label amounts to 1.6% of the sentences.
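Because the positive class is rare, the macro average is informative: it weights both classes equally rather than by frequency. A minimal sketch of how this metric can be computed with `scikit-learn`, using hypothetical label vectors:

```python
from sklearn.metrics import f1_score

# hypothetical gold labels and predictions; 1 marks sentences using
# quantitative indicators about the applicant(s)
y_true = [0, 0, 0, 1, 0, 1, 0, 0]
y_pred = [0, 0, 0, 1, 0, 0, 0, 0]

# macro average: F1 is computed per class and then averaged, so the rare
# positive class counts as much as the majority class
print(f1_score(y_true, y_pred, average="macro"))
```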
|
|
|
The fine-tuning code is open-sourced on GitHub: https://github.com/snsf-data/ml-peer-review-analysis .
|
|
|
Due to data privacy laws, the data used for the fine-tuning cannot be publicly shared.
|
For a detailed description of data protection please refer to the data management plan underlying this work: https://doi.org/10.46446/DMP-peer-review-assessment-ML. |
|
The annotation codebook is available online: https://doi.org/10.46446/Codebook-peer-review-assessment-ML. |
|
|
|
For more details, see the following preprint:
|
|
|
**A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports** |
|
|
|
by [Gabriel Okasa](https://orcid.org/0000-0002-3573-7227), |
|
[Alberto de León](https://orcid.org/0009-0002-0401-2618),
|
[Michaela Strinzel](https://orcid.org/0000-0003-3181-0623), |
|
[Anne Jorstad](https://orcid.org/0000-0002-6438-1979), |
|
[Katrin Milzow](https://orcid.org/0009-0002-8959-2534), |
|
[Matthias Egger](https://orcid.org/0000-0001-7462-5132), and |
|
[Stefan Müller](https://orcid.org/0000-0002-6315-4125), available on arXiv: https://arxiv.org/abs/2411.16662 .
|
|
|
## How to Get Started with the Model |
|
|
|
The model can be used to classify whether sentences from grant peer review reports use quantitative indicators to describe the applicant(s) or team.
|
|
|
Use the code below to get started with the model. |
|
|
|
```python
# import transformers library
import transformers

# load tokenizer from specter2_base - the base model
tokenizer = transformers.AutoTokenizer.from_pretrained("allenai/specter2_base")

# load the SNSF fine-tuned model for classification of quantitative indicators related to the applicant
model = transformers.AutoModelForSequenceClassification.from_pretrained("snsf-data/specter2-review-applicant-quantity")

# setup the classification pipeline
classification_pipeline = transformers.TextClassificationPipeline(
    model=model,
    tokenizer=tokenizer,
    return_all_scores=True
)

# prediction for an example review sentence addressing quantitative indicators related to the applicant
classification_pipeline("He is first author of eight manuscripts and reviews and for four manuscript he is last author, and in addition he is joint last author on one additional manuscript.")

# prediction for an example review sentence not addressing quantitative indicators related to the applicant
classification_pipeline("There are currently several activities on an international level that have identified the issue and activities are underway.")
```
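With `return_all_scores=True`, the pipeline returns one list of label-score dictionaries per input sentence. A small helper, not part of the original code, can reduce these scores to a single prediction; the generic `LABEL_0`/`LABEL_1` names are the `transformers` defaults, and `LABEL_1` is assumed here to mark the positive class:

```python
# reuse classification_pipeline from the snippet above
def predict_label(sentence):
    scores = classification_pipeline(sentence)[0]  # list of label-score dicts
    best = max(scores, key=lambda s: s["score"])   # label with the highest score
    return best["label"], best["score"]

print(predict_label("She has secured three major grants over the last five years."))
```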
|
|
|
## Model Limitations |
|
- *Human Assessment Required*: This model should not be used for automatic classification of grant peer review reports without human oversight. |
|
- *Limited Training Data*: The model was fine-tuned on a limited sample of 2,500 annotated sentences. Therefore, its classification accuracy should be critically evaluated before deployment. |
|
- *Specific Training Data*: The training data consists of a random sample of SNSF grant peer review reports. As such, the model's external validity to other datasets may be limited. |
|
|
|
## Citation |
|
|
|
**BibTeX:** |
|
|
|
```bibtex |
|
@article{okasa2024supervised, |
|
title={A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports}, |
|
author={Okasa, Gabriel and de Le{\'o}n, Alberto and Strinzel, Michaela and Jorstad, Anne and Milzow, Katrin and Egger, Matthias and M{\"u}ller, Stefan}, |
|
journal={arXiv preprint arXiv:2411.16662}, |
|
year={2024} |
|
} |
|
``` |
|
|
|
**APA:** |
|
|
|
Okasa, G., de León, A., Strinzel, M., Jorstad, A., Milzow, K., Egger, M., & Müller, S. (2024). A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports. arXiv preprint arXiv:2411.16662.
|
|
|
## Model Card Authors |
|
|
|
[Gabriel Okasa](https://orcid.org/0000-0002-3573-7227), |
|
[Alberto de León](https://orcid.org/0009-0002-0401-2618),
|
[Michaela Strinzel](https://orcid.org/0000-0003-3181-0623), |
|
[Anne Jorstad](https://orcid.org/0000-0002-6438-1979), |
|
[Katrin Milzow](https://orcid.org/0009-0002-8959-2534), |
|
[Matthias Egger](https://orcid.org/0000-0001-7462-5132), and |
|
[Stefan Müller](https://orcid.org/0000-0002-6315-4125)
|
|
|
## Model Card Contact |
|
|
|
[email protected] |