modernbert-content

This model is a fine-tuned version of answerdotai/ModernBERT-base on a corpus of expert-graded summaries (described under Corpus below). It achieves the following results on the evaluation set:

  • Loss: 0.1729
  • MSE: 0.1729

Model description

This is a ModernBERT model with a regression head designed to predict the Content score of a summary.

The input should be the summary, followed by the tokenizer's separator token, followed by the source text: summary + [SEP] + source.

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("wesleymorris/modernbert-content", num_labels=1)
tokenizer = AutoTokenizer.from_pretrained("wesleymorris/modernbert-content")
model.eval()

def get_score(summary: str, source: str) -> float:
    # Concatenate the summary and source with the tokenizer's separator token
    text = summary + tokenizer.sep_token + source
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    # The regression head emits a single logit: the predicted content score
    with torch.no_grad():
        return float(model(**inputs).logits[0])

Corpus

It was trained on a corpus of 4,233 summaries of 101 sources compiled by Botarleanu et al. (2022). The summaries were graded by expert raters on six criteria: Details, Main Point, Cohesion, Paraphrasing, Objective Language, and Language Beyond the Text. A principal component analysis (PCA) was used to reduce the dimensionality of the outcome variables to two.

Content combines Details, Main Point, Paraphrasing, and Cohesion.
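The dimensionality reduction described above can be sketched as follows. This is an illustrative example with synthetic data, not the corpus or loadings from Botarleanu et al. (2022): six rubric scores per summary are centered and projected onto their top two principal components.

```python
import numpy as np

# Synthetic stand-in for the rubric data: 100 summaries x 6 criteria
rng = np.random.default_rng(42)
scores = rng.normal(size=(100, 6))

# PCA via SVD: center the data, then project onto the top-2 right
# singular vectors (the first two principal components)
centered = scores - scores.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
components = centered @ vt[:2].T  # shape (100, 2)

print(components.shape)  # (100, 2)
```

In the actual corpus, the two resulting components correspond to the Content and Wording outcome variables.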

Contact

This model was developed by the LEAR Lab at Vanderbilt University. For questions or comments about this model, please contact [email protected].

Intended uses & limitations

This model can be used to predict human scores of content for a summary. The scores are normalized such that 0 is the mean of the training data and 1 is one standard deviation above the mean.
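Because the output is a z-score relative to the training data, it can be mapped to an approximate percentile. The sketch below assumes the human scores are roughly normally distributed, which is an assumption about the data, not a property guaranteed by the model:

```python
import math

def zscore_to_percentile(z: float) -> float:
    # Standard normal CDF via the error function: Phi(z)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(zscore_to_percentile(0.0))             # 0.5: the training-data mean
print(round(zscore_to_percentile(1.0), 3))   # 0.841: one SD above the mean
```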

Training and evaluation data

Before the fine-tuning step, the model was pretrained on a very large synthetic dataset.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
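The hyperparameters above map onto a Hugging Face `TrainingArguments` configuration roughly as follows. This is a hedged sketch, not the authors' actual training script; the `output_dir` value is a placeholder:

```python
from transformers import TrainingArguments

# Configuration mirroring the hyperparameter list above
args = TrainingArguments(
    output_dir="modernbert-content",   # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=10,
)
```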

Training results

| Training Loss | Epoch | Step | Validation Loss | MSE    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| No log        | 1.0   | 411  | 0.3181          | 0.3181 |
| 0.5319        | 2.0   | 822  | 0.2884          | 0.2884 |
| 0.2343        | 3.0   | 1233 | 0.2395          | 0.2395 |
| 0.1366        | 4.0   | 1644 | 0.1885          | 0.1885 |
| 0.0688        | 5.0   | 2055 | 0.1896          | 0.1896 |
| 0.0688        | 6.0   | 2466 | 0.1854          | 0.1854 |
| 0.0417        | 7.0   | 2877 | 0.1738          | 0.1738 |
| 0.0201        | 8.0   | 3288 | 0.1759          | 0.1759 |
| 0.0086        | 9.0   | 3699 | 0.1800          | 0.1800 |
| 0.0037        | 10.0  | 4110 | 0.1729          | 0.1729 |

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0