---
library_name: transformers
metrics:
- f1
base_model:
- allenai/specter2_base
model-index:
- name: specter2-review-applicant
  results:
  - task:
      type: text-classification
    dataset:
      name: validation
      type: validation
    metrics:
    - name: macro-average F1-score
      type: macro-average F1-score
      value: 0.91
---

# Model: specter2-review-applicant

The model `snsf-data/specter2-review-applicant` is based on the `allenai/specter2_base` model and **fine-tuned for a binary classification** task. In particular, the model is fine-tuned to classify whether a sentence from an SNSF grant peer review report addresses the following aspect:

***Does the sentence address the applicant(s)/team or their qualifications, without mentioning quantitative indicators?***

The model was fine-tuned on a training set of 2,500 sentences from SNSF grant peer review reports, which were manually annotated by multiple human annotators, with the final label determined by majority rule. The fine-tuning was performed locally without access to the internet to prevent any potential data leakage or network interference. The following setup was used for the fine-tuning (an illustrative code sketch follows the list):

- Loss function: cross-entropy loss
- Optimizer: AdamW
- Weight decay: 0.01
- Learning rate: 2e-5
- Epochs: 3
- Batch size: 10
- GPU: NVIDIA RTX A2000
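Since the annotated data cannot be shared (see below), the snippet below is only a minimal sketch of how this setup maps onto the `transformers` `Trainer` API, which uses cross-entropy loss and the AdamW optimizer by default; the open-sourced fine-tuning code on GitHub (linked below) is the authoritative reference. The two example sentences and their labels are hypothetical placeholders, not actual training data.

```python
# minimal, illustrative fine-tuning sketch - not the authors' exact script
import torch
import transformers

# base model and tokenizer; two labels for the binary classification task
tokenizer = transformers.AutoTokenizer.from_pretrained("allenai/specter2_base")
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "allenai/specter2_base", num_labels=2
)

# hypothetical placeholder data; the real training set holds 2,500 annotated sentences
texts = ["The applicants have an outstanding track record.",
         "The topic of the proposal is timely."]
labels = [1, 0]

# wrap tokenized sentences and labels as a torch dataset for the Trainer
class ReviewSentences(torch.utils.data.Dataset):
    def __init__(self, texts, labels):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

# hyperparameters as reported above; AdamW and cross-entropy loss are the
# Trainer defaults for sequence classification
training_args = transformers.TrainingArguments(
    output_dir="specter2-review-applicant",
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=3,
    per_device_train_batch_size=10,
)

trainer = transformers.Trainer(
    model=model,
    args=training_args,
    train_dataset=ReviewSentences(texts, labels),
)
trainer.train()
```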
The model was then evaluated on a validation set of 500 sentences, which were also manually annotated by multiple human annotators, with the final label determined by majority rule. The model achieved a macro-average **F1 score of 0.91** on the validation set. The share of the positive outcome label amounts to 19.7%.

The fine-tuning code is open-sourced on GitHub: https://github.com/snsf-data/ml-peer-review-analysis . Due to data privacy laws, the data used for the fine-tuning cannot be publicly shared. For a detailed description of data protection, please refer to the data management plan underlying this work: https://doi.org/10.46446/DMP-peer-review-assessment-ML. The annotation codebook is available online: https://doi.org/10.46446/Codebook-peer-review-assessment-ML.

For more details, see the following preprint: **A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports** by [Gabriel Okasa](https://orcid.org/0000-0002-3573-7227), [Alberto de León](https://orcid.org/0009-0002-0401-2618), [Michaela Strinzel](https://orcid.org/0000-0003-3181-0623), [Anne Jorstad](https://orcid.org/0000-0002-6438-1979), [Katrin Milzow](https://orcid.org/0009-0002-8959-2534), [Matthias Egger](https://orcid.org/0000-0001-7462-5132), and [Stefan Müller](https://orcid.org/0000-0002-6315-4125), available on arXiv: https://arxiv.org/abs/2411.16662 .

## How to Get Started with the Model

The model can be used to classify whether sentences from grant peer review reports address the applicant(s)/team or their qualifications, without mentioning quantitative indicators. Use the code below to get started with the model.

```python
# import transformers library
import transformers

# load tokenizer from specter2_base - the base model
tokenizer = transformers.AutoTokenizer.from_pretrained("allenai/specter2_base")

# load the SNSF fine-tuned model for classification of the applicant in review texts
model = transformers.AutoModelForSequenceClassification.from_pretrained("snsf-data/specter2-review-applicant")

# set up the classification pipeline returning scores for both labels
# (in recent transformers versions, top_k=None replaces the deprecated return_all_scores=True)
classification_pipeline = transformers.TextClassificationPipeline(
    model=model,
    tokenizer=tokenizer,
    return_all_scores=True
)

# prediction for an example review sentence addressing the applicant
classification_pipeline("Therefore, they have ability to carry out the proposed project based on their strong expertise in this special research area and their outstanding track record.")

# prediction for an example review sentence not addressing the applicant
classification_pipeline("There are currently several activities on an international level that have identified the issue and activities are underway.")
```

## Model Limitations

- *Human Assessment Required*: This model should not be used for automatic classification of grant peer review reports without human oversight.
- *Limited Training Data*: The model was fine-tuned on a limited sample of 2,500 annotated sentences. Therefore, its classification accuracy should be critically evaluated before deployment.
- *Specific Training Data*: The training data consists of a random sample of SNSF grant peer review reports. As such, the model's external validity to other datasets may be limited.

## Citation

**BibTeX:**

```bibtex
@article{okasa2024supervised,
  title={A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports},
  author={Okasa, Gabriel and de Le{\'o}n, Alberto and Strinzel, Michaela and Jorstad, Anne and Milzow, Katrin and Egger, Matthias and M{\"u}ller, Stefan},
  journal={arXiv preprint arXiv:2411.16662},
  year={2024}
}
```

**APA:**

Okasa, G., de León, A., Strinzel, M., Jorstad, A., Milzow, K., Egger, M., & Müller, S. (2024). A Supervised Machine Learning Approach for Assessing Grant Peer Review Reports. arXiv preprint arXiv:2411.16662.

## Model Card Authors

[Gabriel Okasa](https://orcid.org/0000-0002-3573-7227), [Alberto de León](https://orcid.org/0009-0002-0401-2618), [Michaela Strinzel](https://orcid.org/0000-0003-3181-0623), [Anne Jorstad](https://orcid.org/0000-0002-6438-1979), [Katrin Milzow](https://orcid.org/0009-0002-8959-2534), [Matthias Egger](https://orcid.org/0000-0001-7462-5132), and [Stefan Müller](https://orcid.org/0000-0002-6315-4125)

## Model Card Contact

gabriel.okasa@snf.ch