---
license: mit
base_model: MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
tags:
- generated_from_trainer
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: mDeBERTa-v3-base-xnli-multilingual-nli-2mil7-energy
  results: []
---


# mDeBERTa-v3-base-xnli-multilingual-nli-2mil7-energy

This model is a fine-tuned version of [MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2328
- Accuracy: 0.9637
- Precision: 0.9637
- Recall: 0.9636
- F1: 0.9637
- Ratio: 0.4847
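
For readers who want to reproduce numbers of this kind, the following is a minimal sketch of how accuracy, precision, recall, and F1 can be computed from predictions. It assumes macro averaging over labels, which this card does not state; the function name `macro_metrics` and the toy labels are illustrative only. `Ratio` is not defined in this card, so the sketch leaves it out.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    ASSUMPTION: macro averaging; the card does not say which averaging
    was used for the reported scores.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but the true label was t
            fn[t] += 1  # missed an instance of t

    def safe_div(num, den):
        # Avoid ZeroDivisionError for labels that never occur.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        prec = safe_div(tp[label], tp[label] + fp[label])
        rec = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(safe_div(2 * prec * rec, prec + rec))

    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n),
        "recall": safe_div(sum(recalls), n),
        "f1": safe_div(sum(f1s), n),
    }


# Toy labels, purely illustrative (not taken from the evaluation set).
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```

With macro averaging, each label contributes equally to precision, recall, and F1 regardless of how many examples it has, which is why these scores can differ from accuracy on imbalanced data.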

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- lr_scheduler_warmup_steps: 3
- num_epochs: 5
- label_smoothing_factor: 0.01
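
The relationship between these values can be sketched in plain Python. This is an illustration under stated assumptions, not the Trainer's actual code: the helper names are hypothetical, a single device is assumed (consistent with 2 × 2 = 4), and the figure of roughly 930 optimizer steps per epoch is an estimate read off the results table, not something the card states. The sketch also reflects the documented Transformers behaviour that a positive `warmup_steps` takes precedence over `warmup_ratio`.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_RATIO = 0.06
WARMUP_STEPS = 3
NUM_EPOCHS = 5

# ASSUMPTION: about 930 optimizer steps per epoch, estimated from the
# results table (step 400 is logged at epoch 0.43). Not stated in the card.
STEPS_PER_EPOCH = 930
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def effective_batch_size(per_device, accumulation, num_devices=1):
    """Examples consumed per optimizer step (single device assumed)."""
    return per_device * accumulation * num_devices


def resolve_warmup_steps(warmup_steps, warmup_ratio, total_steps):
    """A positive explicit step count wins; otherwise derive from the ratio."""
    if warmup_steps > 0:
        return warmup_steps
    return math.ceil(total_steps * warmup_ratio)


def linear_lr(step, base_lr, warmup, total):
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


warmup = resolve_warmup_steps(WARMUP_STEPS, WARMUP_RATIO, TOTAL_STEPS)
print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
for step in (0, 1, warmup, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step, LEARNING_RATE, warmup, TOTAL_STEPS))
```

Under these assumptions the learning rate reaches its peak of 2e-05 after only 3 optimizer steps and then decays linearly to zero over the remaining steps.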

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1     | Ratio  |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|:------:|
| 0.5212        | 0.43  | 400  | 0.3449          | 0.8948   | 0.8964    | 0.8940 | 0.8945 | 0.4596 |
| 0.4083        | 0.86  | 800  | 0.3203          | 0.9224   | 0.9232    | 0.9218 | 0.9222 | 0.4684 |
| 0.2384        | 1.29  | 1200 | 0.3149          | 0.9361   | 0.9365    | 0.9358 | 0.9360 | 0.4759 |
| 0.213         | 1.72  | 1600 | 0.3024          | 0.9443   | 0.9442    | 0.9442 | 0.9442 | 0.4865 |
| 0.1686        | 2.15  | 2000 | 0.2742          | 0.9493   | 0.6332    | 0.6329 | 0.6330 | 0.4934 |
| 0.105         | 2.58  | 2400 | 0.2641          | 0.9518   | 0.9519    | 0.9522 | 0.9518 | 0.5041 |
| 0.116         | 3.01  | 2800 | 0.2515          | 0.9555   | 0.6374    | 0.6372 | 0.6372 | 0.4997 |
| 0.077         | 3.44  | 3200 | 0.2511          | 0.9580   | 0.9580    | 0.9583 | 0.9580 | 0.4966 |
| 0.0622        | 3.86  | 3600 | 0.2355          | 0.9643   | 0.9644    | 0.9642 | 0.9643 | 0.4828 |
| 0.0524        | 4.29  | 4000 | 0.2289          | 0.9637   | 0.9636    | 0.9637 | 0.9637 | 0.4884 |
| 0.0498        | 4.72  | 4400 | 0.2336          | 0.9643   | 0.9644    | 0.9642 | 0.9643 | 0.4840 |
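
As a small worked example, the logged evaluations can be scanned to see which step scored best on each criterion. The tuples below are copied from the table; the variable names are illustrative, and the card does not say which checkpoint was ultimately kept.

```python
# (step, validation_loss, accuracy) copied from the table above.
EVAL_LOG = [
    (400, 0.3449, 0.8948),
    (800, 0.3203, 0.9224),
    (1200, 0.3149, 0.9361),
    (1600, 0.3024, 0.9443),
    (2000, 0.2742, 0.9493),
    (2400, 0.2641, 0.9518),
    (2800, 0.2515, 0.9555),
    (3200, 0.2511, 0.9580),
    (3600, 0.2355, 0.9643),
    (4000, 0.2289, 0.9637),
    (4400, 0.2336, 0.9643),
]

# Lowest validation loss among the logged evaluations.
best_loss = min(EVAL_LOG, key=lambda row: row[1])

# Highest accuracy; on a tie, max() keeps the earliest logged step.
best_acc = max(EVAL_LOG, key=lambda row: row[2])

print("lowest validation loss:", best_loss)
print("highest accuracy:", best_acc)
```

The two criteria pick different steps here (4000 for validation loss, 3600 for accuracy), so the choice of selection metric matters when deciding which checkpoint to keep.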


### Framework versions

- Transformers 4.31.0
- PyTorch 2.1.0+cu121
- Datasets 2.14.7
- Tokenizers 0.13.3