---
license: mit
tags:
- generated_from_trainer
datasets: Amir13/conll2003-persian
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: xlm-roberta-base-conll2003
  results: []
---


# xlm-roberta-base-conll2003

This model is a fine-tuned version of [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) on the [conll2003-persian](https://huggingface.co/datasets/Amir13/conll2003-persian) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1579
- Precision: 0.8794
- Recall: 0.8745
- F1: 0.8769
- Accuracy: 0.9758
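
As a quick consistency check on the numbers above, the sketch below recomputes F1 from the reported precision and recall. It assumes the reported F1 is the usual harmonic mean of the two; the card does not state how the metrics were computed, and `f1_score` is a helper name introduced here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Final evaluation values reported in this card.
precision, recall = 0.8794, 0.8745
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.8769"
```

The recomputed value agrees with the reported F1 of 0.8769 to four decimal places.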

## Model description

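This is an XLM-RoBERTa base model fine-tuned for named entity recognition on the conll2003-persian dataset (see Citation). More information needed.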

## Intended uses & limitations

More information needed
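
Until this section is filled in, the following is a minimal sketch of how a fine-tuned token-classification checkpoint like this one could be loaded with the Transformers `pipeline` API. The repository id `Amir13/xlm-roberta-base-conll2003` is an assumption (the card gives only the model name, not the owner), and `load_ner_pipeline` and `format_entities` are hypothetical helper names.

```python
from typing import Dict, List, Tuple

# Assumed repository id: the card states the model name but not its owner.
MODEL_ID = "Amir13/xlm-roberta-base-conll2003"


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads weights on first use)."""
    from transformers import pipeline  # lazy import; requires `transformers`

    # aggregation_strategy="simple" merges sub-word pieces into entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def format_entities(entities: List[Dict]) -> List[Tuple[str, str]]:
    """Reduce aggregated pipeline output to (text, label) pairs."""
    return [(e["word"], e["entity_group"]) for e in entities]


# Usage (not executed here because it needs network access):
#   ner = load_ner_pipeline()
#   print(format_entities(ner("<a Persian sentence>")))
```

The label names returned depend on the checkpoint's configuration, which this card does not list.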

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 15
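
The values above map onto `transformers.TrainingArguments` roughly as sketched below. This is a reconstruction under assumptions, not the original training script: it assumes a single device (so the batch size of 32 is per device), no warmup (none is listed), and an output directory name chosen here for illustration. `linear_lr` is a hypothetical helper showing what the `linear` schedule implies over the 6450 steps recorded in the training results.

```python
# Hyperparameters from this card, keyed by TrainingArguments parameter names.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,  # assumes a single device
    "per_device_eval_batch_size": 32,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 15,
}

TOTAL_STEPS = 6450  # final step in the training results table


def build_training_args(output_dir: str = "xlm-roberta-base-conll2003"):
    """Create TrainingArguments from the card's values (output_dir is assumed)."""
    from transformers import TrainingArguments  # lazy import

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


def linear_lr(step: int, base_lr: float = 2e-5,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear decay to zero (warmup assumed 0)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Under these assumptions the learning rate starts at 2e-05, reaches half that value midway through training, and decays to zero at the final step.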

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|:--------:|
| No log        | 1.0   | 430  | 0.1374          | 0.8043    | 0.7966 | 0.8004 | 0.9613   |
| 0.2862        | 2.0   | 860  | 0.1093          | 0.8384    | 0.8482 | 0.8433 | 0.9695   |
| 0.1043        | 3.0   | 1290 | 0.1121          | 0.8448    | 0.8556 | 0.8502 | 0.9708   |
| 0.0689        | 4.0   | 1720 | 0.1094          | 0.8635    | 0.8650 | 0.8643 | 0.9737   |
| 0.0473        | 5.0   | 2150 | 0.1225          | 0.8665    | 0.8625 | 0.8645 | 0.9736   |
| 0.0342        | 6.0   | 2580 | 0.1186          | 0.8722    | 0.8730 | 0.8726 | 0.9745   |
| 0.0245        | 7.0   | 3010 | 0.1292          | 0.8802    | 0.8717 | 0.8759 | 0.9755   |
| 0.0245        | 8.0   | 3440 | 0.1309          | 0.8832    | 0.8689 | 0.8760 | 0.9749   |
| 0.0177        | 9.0   | 3870 | 0.1388          | 0.8712    | 0.8717 | 0.8715 | 0.9743   |
| 0.0135        | 10.0  | 4300 | 0.1466          | 0.8699    | 0.8728 | 0.8714 | 0.9752   |
| 0.0103        | 11.0  | 4730 | 0.1486          | 0.8716    | 0.8747 | 0.8731 | 0.9756   |
| 0.0081        | 12.0  | 5160 | 0.1521          | 0.8789    | 0.8736 | 0.8762 | 0.9759   |
| 0.007         | 13.0  | 5590 | 0.1546          | 0.8804    | 0.8734 | 0.8769 | 0.9756   |
| 0.0053        | 14.0  | 6020 | 0.1552          | 0.8750    | 0.8732 | 0.8741 | 0.9756   |
| 0.0053        | 15.0  | 6450 | 0.1579          | 0.8794    | 0.8745 | 0.8769 | 0.9758   |
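
Validation loss and F1 do not peak at the same epoch in this table, which matters when choosing a checkpoint. The sketch below, with values copied from the table, finds the epoch with the lowest validation loss and the epochs with the highest F1. The card does not say which selection criterion was used; the headline metrics match the final epoch.

```python
# (epoch, validation_loss, f1) copied from the training results table.
RESULTS = [
    (1, 0.1374, 0.8004), (2, 0.1093, 0.8433), (3, 0.1121, 0.8502),
    (4, 0.1094, 0.8643), (5, 0.1225, 0.8645), (6, 0.1186, 0.8726),
    (7, 0.1292, 0.8759), (8, 0.1309, 0.8760), (9, 0.1388, 0.8715),
    (10, 0.1466, 0.8714), (11, 0.1486, 0.8731), (12, 0.1521, 0.8762),
    (13, 0.1546, 0.8769), (14, 0.1552, 0.8741), (15, 0.1579, 0.8769),
]

# Epoch with the lowest validation loss.
best_loss_epoch = min(RESULTS, key=lambda row: row[1])[0]

# All epochs that tie for the highest F1.
best_f1 = max(row[2] for row in RESULTS)
best_f1_epochs = [row[0] for row in RESULTS if row[2] == best_f1]

print(best_loss_epoch)  # 2
print(best_f1_epochs)   # [13, 15]
```

Validation loss is lowest at epoch 2 and rises afterwards while training loss keeps falling, yet F1 continues to improve until it plateaus around 0.876 to 0.877 from epoch 7 onward.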


### Framework versions

- Transformers 4.27.0.dev0
- Pytorch 1.13.1+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2

### Citation
If you use the dataset or model from this repository, please cite the following paper:

```bibtex
@misc{https://doi.org/10.48550/arxiv.2302.09611,
  doi = {10.48550/ARXIV.2302.09611},
  url = {https://arxiv.org/abs/2302.09611},
  author = {Sartipi, Amir and Fatemi, Afsaneh},
  keywords = {Computation and Language (cs.CL), Artificial Intelligence (cs.AI), FOS: Computer and information sciences},
  title = {Exploring the Potential of Machine Translation for Generating Named Entity Datasets: A Case Study between Persian and English},
  publisher = {arXiv},
  year = {2023},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```