GIZ/TAPP-multilabel-bge

This model is a fine-tuned version of BAAI/bge-base-en-v1.5 on the Policy-Classification dataset.

The loss function, BCEWithLogitsLoss, is modified with pos_weight to favor recall, so the evaluation metrics, rather than the loss, are used to assess model performance during training. The model achieves the following results on the evaluation set:

  • Precision-micro: 0.7772
  • Precision-samples: 0.7644
  • Precision-weighted: 0.7756
  • Recall-micro: 0.8329
  • Recall-samples: 0.7920
  • Recall-weighted: 0.8329
  • F1-micro: 0.8041
  • F1-samples: 0.7609
  • F1-weighted: 0.8029
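
The recall-oriented weighting mentioned above can be illustrated with a plain-Python sketch of the per-element formula that PyTorch documents for BCEWithLogitsLoss. The pos_weight value of 3.0 and the example logits are hypothetical; the weights actually used for this model are not stated here.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_bce_with_logits(logit: float, target: float, pos_weight: float = 1.0) -> float:
    """Per-element loss: -[pos_weight * y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]."""
    log_sigmoid = -softplus(-logit)            # log(sigmoid(x))
    log_one_minus_sigmoid = -softplus(logit)   # log(1 - sigmoid(x))
    return -(pos_weight * target * log_sigmoid + (1.0 - target) * log_one_minus_sigmoid)


# A missed positive (true label 1, confident negative logit), without and with weighting.
missed_positive = weighted_bce_with_logits(-2.0, 1.0)
missed_positive_weighted = weighted_bce_with_logits(-2.0, 1.0, pos_weight=3.0)  # hypothetical weight

# A false alarm (true label 0, confident positive logit) is not affected by pos_weight.
false_alarm = weighted_bce_with_logits(2.0, 0.0, pos_weight=3.0)
```

With pos_weight above 1, a missed positive costs more than an equally confident false alarm, which pushes training toward higher recall. It also means the loss value is no longer directly comparable to an unweighted loss.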

Model description

This model predicts multiple labels simultaneously for a given input. Specifically, it predicts four labels: ActionLabel, PlansLabel, PolicyLabel, and TargetLabel. The categories are defined as follows:

  • Target: Targets are an intention to achieve a specific result, for example, to reduce GHG emissions to a specific level (a GHG target) or to increase energy efficiency or renewable energy to a specific level (a non-GHG target), typically by a certain date.
  • Action: Actions are an intention to implement specific means of achieving GHG reductions, usually in the form of concrete projects.
  • Policies: Policies are domestic planning documents such as policies, regulations or guidelines.
  • Plans: Plans are broader than specific policies or actions, such as a general intention to ‘improve efficiency’, ‘develop renewable energy’, etc.

The terms come from the World Bank's NDC platform and WRI's publication.
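
As a minimal sketch of how multi-label outputs of this kind are commonly decoded, each logit can be passed through a sigmoid independently and compared against a threshold. The label order, the 0.5 threshold, and the example logits below are assumptions for illustration; none of them is stated in this card.

```python
import math

# Assumed label order; the actual order of the model's output logits is not documented here.
LABELS = ["ActionLabel", "PlansLabel", "PolicyLabel", "TargetLabel"]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def decode_labels(logits, threshold: float = 0.5):
    """Return every label whose independent sigmoid probability reaches the threshold."""
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Made-up logits for a single paragraph, one value per label.
example_logits = [2.1, -1.3, -0.4, 0.7]
predicted = decode_labels(example_logits)
```

Because each label is scored independently, a paragraph can receive several labels or none, unlike a softmax classifier that always picks exactly one class.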

Intended uses & limitations

More information needed

Training and evaluation data

  • Training Dataset: 10031 examples

    Class   Positive Count of Class
    Action  5416
    Plans   2140
    Policy  1396
    Target  2911

  • Validation Dataset: 932 examples

    Class   Positive Count of Class
    Action  513
    Plans   198
    Policy  122
    Target  256
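
The class counts above show a clear imbalance between labels. One common heuristic for choosing pos_weight is the ratio of negative to positive examples per class; the sketch below applies it to the training counts. This is an assumption for illustration only, since the card does not say how the weights were actually derived.

```python
TRAIN_SIZE = 10031
POSITIVES = {"Action": 5416, "Plans": 2140, "Policy": 1396, "Target": 2911}


def neg_pos_ratio(n_total: int, n_positive: int) -> float:
    """Number of negative examples per positive example for one class."""
    return (n_total - n_positive) / n_positive


# Hypothetical per-class weights; rarer classes receive larger values.
pos_weight = {label: neg_pos_ratio(TRAIN_SIZE, count) for label, count in POSITIVES.items()}

for label, weight in pos_weight.items():
    share = POSITIVES[label] / TRAIN_SIZE
    print(f"{label}: positive share {share:.1%}, neg/pos ratio {weight:.3f}")
```

Under this heuristic the rarest class (Policy) gets the largest weight, while Action, which is positive in more than half of the examples, gets a weight below 1.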

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 7.4e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 200
  • num_epochs: 7
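
To make the schedule concrete, here is a small sketch of a cosine learning-rate schedule with linear warmup using the values listed above. It assumes the common half-cosine form (linear ramp to the peak, then decay to zero) and takes the total of 4389 steps from the Training results table (7 epochs of 627 steps); the exact schedule implementation used in training is not confirmed by this card.

```python
import math

PEAK_LR = 7.4e-05
WARMUP_STEPS = 200
TOTAL_STEPS = 4389  # 7 epochs x 627 steps per epoch


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (assumed half-cosine decay after warmup)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Fraction of the post-warmup phase completed, from 0.0 to 1.0.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 100, 200, 627, 2508, 4389):
    print(f"step {step:>4}: lr = {lr_at(step):.3e}")
```

The warmup covers only about a third of the first epoch, so nearly all of training happens on the decaying part of the curve.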

Training results

Training Loss  Epoch  Step  Validation Loss  Precision-micro  Precision-samples  Precision-weighted  Recall-micro  Recall-samples  Recall-weighted  F1-micro  F1-samples  F1-weighted
0.7161         1.0    627   0.6322           0.5931           0.6373             0.6274              0.8219        0.7833          0.8219           0.6890    0.6728      0.7000
0.4549         2.0    1254  0.5420           0.6639           0.6891             0.7049              0.8090        0.7684          0.8090           0.7293    0.7048      0.7409
0.2599         3.0    1881  0.6966           0.7354           0.7396             0.7346              0.8219        0.7845          0.8219           0.7762    0.7425      0.7713
0.1405         4.0    2508  0.7530           0.7569           0.7494             0.7569              0.8292        0.7899          0.8292           0.7914    0.7505      0.7905
0.0681         5.0    3135  0.8234           0.7596           0.7535             0.7599              0.8356        0.7945          0.8356           0.7958    0.7546      0.7953
0.0291         6.0    3762  0.8849           0.7773           0.7640             0.7776              0.8301        0.7890          0.8301           0.8028    0.7597      0.8027
0.0147         7.0    4389  0.9217           0.7772           0.7644             0.7756              0.8329        0.7920          0.8329           0.8041    0.7609      0.8029

label   precision  recall  f1-score  support
Action  0.826      0.883   0.853     513
Plans   0.653      0.646   0.649     198
Policy  0.726      0.803   0.762     122
Target  0.791      0.890   0.838     256
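
The weighted metrics can be cross-checked against the per-label report: a weighted average is the mean of the per-label scores, weighted by each label's support. The sketch below recomputes weighted precision and recall from the rounded per-label values above, so small differences from the reported 0.7756 and 0.8329 are expected.

```python
# (precision, recall, support) per label, copied from the per-label report.
REPORT = {
    "Action": (0.826, 0.883, 513),
    "Plans": (0.653, 0.646, 198),
    "Policy": (0.726, 0.803, 122),
    "Target": (0.791, 0.890, 256),
}

total_support = sum(support for _, _, support in REPORT.values())


def weighted_average(metric_index: int) -> float:
    """Support-weighted mean of one metric (0 = precision, 1 = recall)."""
    weighted_sum = sum(row[metric_index] * row[2] for row in REPORT.values())
    return weighted_sum / total_support


weighted_precision = weighted_average(0)
weighted_recall = weighted_average(1)

print(f"total support: {total_support}")
print(f"weighted precision: {weighted_precision:.4f}")
print(f"weighted recall: {weighted_recall:.4f}")
```

Support-weighted recall reduces to total true positives divided by total positives, which is the definition of micro recall. This is why Recall-micro and Recall-weighted are identical in every row of the training results.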

Environmental Impact

Carbon emissions were measured using CodeCarbon.

  • Carbon Emitted: 0.07145 kg of CO2
  • Hours Used: 1.36 hours

Training Hardware

  • On Cloud: yes
  • GPU Model: 1 x Tesla T4
  • CPU Model: Intel(R) Xeon(R) CPU @ 2.30GHz
  • RAM Size: 12.67 GB

Framework versions

  • Transformers 4.38.1
  • PyTorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2