library_name: transformers
metrics:
  - f1
base_model:
  - allenai/specter2_base
model-index:
  - name: specter2-review-applicant
    results:
      - task:
          type: text-classification
        dataset:
          name: validation
          type: validation
        metrics:
          - name: macro-average F1-score
            type: macro-average F1-score
            value: 0.91

Model: specter2-review-applicant

The model snsf-data/specter2-review-applicant is based on the allenai/specter2_base model and fine-tuned for a binary classification task. Specifically, it classifies whether a sentence from an SNSF grant peer review report addresses the following aspect:

Does the sentence address the applicant(s)/team or their qualifications, without mentioning quantitative indicators?

The model was fine-tuned on a training set of 2'500 sentences from SNSF grant peer review reports, each labeled by majority rule among multiple human annotators. The fine-tuning was performed locally without internet access to prevent any potential data leakage or network interference. The following setup was used for the fine-tuning:

  • Loss function: cross-entropy loss
  • Optimizer: AdamW
  • Weight decay: 0.01
  • Learning rate: 2e-5
  • Epochs: 3
  • Batch size: 10
  • GPU: NVIDIA RTX A2000
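
As a rough check of what this setup implies, the sketch below collects the listed hyperparameters in a plain Python dataclass and derives the number of optimizer steps. The class name `FineTuneConfig` and its field names are hypothetical, and the step count assumes that all 2'500 training sentences are used in every epoch with one optimizer step per batch (no gradient accumulation), which the model card does not state.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters listed in the model card (field names are illustrative)."""

    optimizer: str = "AdamW"
    loss: str = "cross-entropy"
    weight_decay: float = 0.01
    learning_rate: float = 2e-5
    epochs: int = 3
    batch_size: int = 10
    train_size: int = 2500  # training sentences reported in the model card

    def steps_per_epoch(self) -> int:
        # Assumption: one optimizer step per batch, a partial last batch is kept.
        return math.ceil(self.train_size / self.batch_size)

    def total_steps(self) -> int:
        # Assumption: every epoch iterates over the full training set.
        return self.steps_per_epoch() * self.epochs


config = FineTuneConfig()
print(config.steps_per_epoch(), config.total_steps())
```

Under these assumptions, the run amounts to 250 steps per epoch and 750 steps in total.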

The model was then evaluated on a validation set of 500 sentences, which were likewise labeled by majority rule among multiple human annotators. It achieved a macro-average F1 score of 0.91 on this validation set. The share of the outcome label amounts to 19.7%.
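
For reference, the macro-average F1 score is the unweighted mean of the per-class F1 scores, so the less frequent class counts as much as the more frequent one. The sketch below computes it for binary labels in plain Python; the function names and the toy labels are illustrative and are not taken from the SNSF data or code.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denominator = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs.
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy example with an imbalanced positive class (made-up labels).
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
score = macro_f1(y_true, y_pred)
print(score)
```

In this toy example the positive class has an F1 of 0.5 and the negative class an F1 of 0.875, which is why the macro average sits well below plain accuracy (0.8).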

The fine-tuning code is open-sourced on GitHub: https://github.com/snsf-data/ml-peer-review-analysis.

Due to data privacy laws, no data used for the fine-tuning can be publicly shared. For a detailed description of data protection, please refer to the data management plan underlying this work: https://doi.org/10.46446/DMP-peer-review-assessment-ML. The annotation codebook is available online: https://doi.org/10.46446/Codebook-peer-review-assessment-ML.

For more details, see the following preprint:

A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports

by Gabriel Okasa, Alberto de León, Michaela Strinzel, Anne Jorstad, Katrin Milzow, Matthias Egger, and Stefan Müller, available on arXiv: https://arxiv.org/abs/2411.16662.

How to Get Started with the Model

The model can be used to classify whether sentences from grant peer review reports address the applicant(s)/team or their qualifications, without mentioning quantitative indicators.

Use the code below to get started with the model.

# import transformers library
import transformers

# load tokenizer from specter2_base - the base model
tokenizer = transformers.AutoTokenizer.from_pretrained("allenai/specter2_base")

# load the SNSF fine-tuned model for classification of the applicant in review texts
model = transformers.AutoModelForSequenceClassification.from_pretrained("snsf-data/specter2-review-applicant")

# setup the classification pipeline
classification_pipeline = transformers.TextClassificationPipeline(
    model=model,
    tokenizer=tokenizer,
    top_k=None  # return the scores for all labels (replaces the deprecated return_all_scores=True)
)

# prediction for an example review sentence addressing the applicant
classification_pipeline("Therefore, they have ability to carry out the proposed project based on their strong expertise in this special research area and their outstanding track record.")

# prediction for an example review sentence not addressing the applicant
classification_pipeline("There are currently several activities on an international level that have identified the issue and activities are underway.")
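
The pipeline returns a score for each label. A minimal sketch for reducing that output to a single predicted label follows. The helper name `top_prediction` is hypothetical, the label names `LABEL_0` and `LABEL_1` in the mock output are an assumption (the actual names depend on the model's configuration), and the nesting of the returned list can differ between transformers versions, so the helper accepts both shapes.

```python
def top_prediction(pipeline_output):
    """Return the highest-scoring {'label', 'score'} dict for one sentence."""
    scores = pipeline_output
    # Unwrap one level if the scores arrive nested as [[{...}, {...}]].
    if scores and isinstance(scores[0], list):
        scores = scores[0]
    return max(scores, key=lambda item: item["score"])


# Mock output with made-up scores, standing in for classification_pipeline("...").
mock_output = [[
    {"label": "LABEL_0", "score": 0.03},
    {"label": "LABEL_1", "score": 0.97},
]]

best = top_prediction(mock_output)
print(best["label"], best["score"])
```

Which of the two labels corresponds to "addresses the applicant" should be checked against the model's configuration rather than assumed from the label name.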

Citation

BibTeX:

@article{okasa2024supervised,
  title={A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports},
  author={Okasa, Gabriel and de Le{\'o}n, Alberto and Strinzel, Michaela and Jorstad, Anne and Milzow, Katrin and Egger, Matthias and M{\"u}ller, Stefan},
  journal={arXiv preprint arXiv:2411.16662},
  year={2024}
}

APA:

Okasa, G., de León, A., Strinzel, M., Jorstad, A., Milzow, K., Egger, M., & Müller, S. (2024). A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports. arXiv preprint arXiv:2411.16662.

Model Card Authors

Gabriel Okasa, Alberto de León, Michaela Strinzel, Anne Jorstad, Katrin Milzow, Matthias Egger, and Stefan Müller

Model Card Contact

[email protected]