News Category Classification for IPTC NewsCodes

This model is a fine-tuned version of KB/bert-base-swedish-cased on a private dataset.

Fine-tuned on a limited set of English, Swedish and Norwegian news titles to classify news content into 16 categories specified by the IPTC NewsCodes.

The training dataset is heavily skewed across categories; it was lightly augmented to reduce the imbalance.

Model description

The model is intended to categorize Norwegian, Swedish and English news content into the 16 categories above, but it is a test model for demonstration purposes. Several categories need more training data before it is fully reliable; even so, it outperforms Claude Haiku and GPT-3.5 on this use case.

Intended uses & limitations

Use it to categorize news texts. Only assign a category when the top label's score is at least 0.60; below that threshold, treat the prediction as uncertain (see the sketch below).
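A minimal inference sketch applying that threshold, assuming the repository id ilsilfverskiold/classify-news-category-iptc and the default transformers text-classification pipeline:

```python
# Minimal sketch: classify a news text and apply the 0.60 confidence
# threshold recommended above. The model id is taken from this repository;
# the pipeline defaults are an assumption.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="ilsilfverskiold/classify-news-category-iptc",
)

THRESHOLD = 0.60

def categorize(text: str) -> str | None:
    result = classifier(text)[0]  # e.g. {'label': 'politics', 'score': 0.93}
    return result["label"] if result["score"] >= THRESHOLD else None  # None = uncertain

print(categorize("Tre døde i kioskbrann i Tyskland"))
# expected: "disaster, accident, and emergency incident"
```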

Test examples

Input: Mann siktet for drapsforsøk på Slovakias statsministeren (in English: "Man charged with attempted murder of Slovakia's prime minister")

Output: politics

Input: Tre døde i kioskbrann i Tyskland (in English: "Three dead in kiosk fire in Germany")

Output: disaster, accident, and emergency incident

Input: Kultfilm får Netflix-oppfølger. Kultfilmen «Happy Gilmore» fra 1996 får en oppfølger på Netflix. Det røper strømmetjenesten selv på X, tidligere Twitter. –Happy Gilmore er tilbake! (in English: "Cult film gets a Netflix sequel. The 1996 cult film 'Happy Gilmore' is getting a sequel on Netflix. The streaming service itself revealed this on X, formerly Twitter. 'Happy Gilmore is back!'")

Output: arts, culture, entertainment and media

Performance

It achieves the following results on the evaluation set (a sketch of the metric computation follows the list):

  • Loss: 0.8030
  • Accuracy: 0.7431
  • F1: 0.7474
  • Precision: 0.7695
  • Recall: 0.7431
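For reference, a sketch of a compute_metrics function that would produce these four numbers with scikit-learn. The weighted averaging is an assumption (consistent with recall matching accuracy above), not something stated in this card:

```python
# Hypothetical compute_metrics for the Hugging Face Trainer; the weighted
# averaging is an assumption based on recall matching accuracy above.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```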

Per-label accuracy on the evaluation set is listed below (a computation sketch follows the list):

  • Arts, culture, entertainment and media: 0.6842
  • Conflict, war and peace: 0.7351
  • Crime, law and justice: 0.8918
  • Disaster, accident, and emergency incident: 0.8699
  • Economy, business, and finance: 0.6893
  • Environment: 0.4483
  • Health: 0.7222
  • Human interest: 0.3182
  • Labour: 0.5000
  • Lifestyle and leisure: 0.5556
  • Politics: 0.7909
  • Religion: 0.0000
  • Science and technology: 0.4583
  • Society: 0.3538
  • Sport: 0.9615
  • Weather: 1.0000
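These per-label figures can be reproduced from the evaluation predictions. A sketch, under the assumption that per-label accuracy here means per-class recall (correct predictions for a label divided by that label's support):

```python
# Sketch of the per-label "accuracy" above, computed as per-class recall
# from the confusion matrix diagonal. That this is the definition used
# here is an assumption.
import numpy as np
from sklearn.metrics import confusion_matrix

def per_label_accuracy(labels, preds, label_names):
    cm = confusion_matrix(labels, preds, labels=list(range(len(label_names))))
    per_class = cm.diagonal() / cm.sum(axis=1).clip(min=1)  # avoid div-by-zero
    return dict(zip(label_names, per_class.round(4)))
```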

Training and evaluation data

Trained with the Hugging Face Trainer using a learning rate of 2e-05 and a batch size of 16 for 3 epochs; the full hyperparameters are listed under Training procedure below.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
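A minimal sketch of the corresponding Trainer configuration, assuming the TrainingArguments API of Transformers 4.40. output_dir is a placeholder, and the 200-step evaluation interval is inferred from the training results table below:

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
# Dataset, model, and tokenizer setup are omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="classify-news-category-iptc",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,  # effective train batch size: 32
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_steps=500,
    seed=42,
    evaluation_strategy="steps",  # assumption, consistent with eval every 200 steps
    eval_steps=200,
)
```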

Training results

(Per-label columns use abbreviated names; the full label names appear in the Performance section above.)

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall | Arts | Conflict | Crime | Disaster | Economy | Environment | Health | Human int. | Labour | Lifestyle | Politics | Religion | Sci/tech | Society | Sport | Weather |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.9761 | 0.2907 | 200 | 1.4046 | 0.6462 | 0.6164 | 0.6057 | 0.6462 | 0.3158 | 0.8315 | 0.7629 | 0.7055 | 0.5437 | 0.0 | 0.5 | 0.0 | 0.0 | 0.3333 | 0.4843 | 0.0 | 0.0833 | 0.0 | 0.9615 | 0.0 |
| 1.2153 | 0.5814 | 400 | 1.0225 | 0.6894 | 0.6868 | 0.7652 | 0.6894 | 0.7895 | 0.6554 | 0.8196 | 0.8562 | 0.6408 | 0.2414 | 0.8333 | 0.1364 | 0.0 | 0.6667 | 0.8467 | 0.0 | 0.375 | 0.0154 | 0.9615 | 1.0 |
| 0.954 | 0.8721 | 600 | 0.8858 | 0.7231 | 0.7138 | 0.7309 | 0.7231 | 0.7368 | 0.7795 | 0.8918 | 0.8699 | 0.6214 | 0.3448 | 0.8889 | 0.1818 | 1.0 | 0.5556 | 0.6899 | 0.0 | 0.25 | 0.0462 | 0.9615 | 1.0 |
| 0.6662 | 1.1628 | 800 | 0.9381 | 0.6881 | 0.7009 | 0.7618 | 0.6881 | 0.7895 | 0.6126 | 0.8454 | 0.8630 | 0.6505 | 0.4483 | 0.7222 | 0.2273 | 1.0 | 0.4444 | 0.8293 | 0.0 | 0.5417 | 0.2308 | 0.9615 | 1.0 |
| 0.5554 | 1.4535 | 1000 | 0.8791 | 0.7025 | 0.7124 | 0.7628 | 0.7025 | 0.7368 | 0.6478 | 0.9021 | 0.8562 | 0.6602 | 0.3103 | 0.7778 | 0.3636 | 0.5 | 0.5556 | 0.8084 | 0.0 | 0.5 | 0.1846 | 0.9615 | 1.0 |
| 0.4396 | 1.7442 | 1200 | 0.8275 | 0.7175 | 0.7280 | 0.7686 | 0.7175 | 0.7895 | 0.6631 | 0.8196 | 0.8836 | 0.6893 | 0.3793 | 0.8333 | 0.4091 | 0.5 | 0.5556 | 0.8362 | 0.0 | 0.4167 | 0.3692 | 0.9615 | 1.0 |
| 0.383 | 2.0349 | 1400 | 0.7929 | 0.745 | 0.7501 | 0.7653 | 0.745 | 0.6842 | 0.7841 | 0.8866 | 0.8767 | 0.7087 | 0.4483 | 0.7778 | 0.4091 | 0.5 | 0.5556 | 0.6899 | 0.0 | 0.4167 | 0.2923 | 0.9615 | 0.0 |
| 0.3418 | 2.3256 | 1600 | 0.8042 | 0.7438 | 0.7440 | 0.7686 | 0.7438 | 0.7895 | 0.7351 | 0.9072 | 0.8493 | 0.7864 | 0.4483 | 0.7778 | 0.3182 | 0.5 | 0.5556 | 0.7909 | 0.0 | 0.4167 | 0.1846 | 0.9615 | 0.0 |
| 0.248 | 2.6163 | 1800 | 0.8387 | 0.7275 | 0.7325 | 0.7610 | 0.7275 | 0.6842 | 0.6891 | 0.8814 | 0.8699 | 0.7573 | 0.4138 | 0.8333 | 0.4091 | 0.5 | 0.5556 | 0.8014 | 0.0 | 0.4167 | 0.2769 | 0.9615 | 0.0 |
| 0.2525 | 2.9070 | 2000 | 0.8137 | 0.735 | 0.7413 | 0.7697 | 0.735 | 0.6842 | 0.7106 | 0.8763 | 0.8699 | 0.6796 | 0.4483 | 0.7222 | 0.3636 | 0.5 | 0.5556 | 0.8153 | 0.0 | 0.4583 | 0.3385 | 0.9615 | 0.0 |

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1