---
datasets:
- coscan-speech2
license: cc0-1.0
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: wav2vec2-large-voxrex-swedish-coscan-no-region
  results:
  - dataset:
      name: Coscan Speech
      type: NbAiLab/coscan-speech2
    metrics:
    - name: Test Accuracy on Coscan Speech
      type: accuracy
      value: 0.6155107552811807
    - name: Validation Accuracy on Coscan Speech
      type: accuracy
      value: 0.8773432861141742
    - name: Test F1 (micro) on Coscan Speech
      type: f1
      value: 0.6155107552811807
    - name: Validation F1 (micro) on Coscan Speech
      type: f1
      value: 0.8773432861141742
    task:
      name: Audio Classification
      type: audio-classification
tags:
- generated_from_trainer
---

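# wav2vec2-large-voxrex-swedish-coscan-no-region

This model is a fine-tuned version of [KBLab/wav2vec2-large-voxrex-swedish](https://huggingface.co/KBLab/wav2vec2-large-voxrex-swedish) on the coscan-speech2 dataset (`NbAiLab/coscan-speech2`).
It achieves the following results on the evaluation set:
- Loss: 1.0151
- Accuracy: 0.8773
- F1: 0.8773
- Precision: 0.8773
- Recall: 0.8773

These values correspond to epoch 4 (step 25872) in the training results table. F1, precision, and recall are identical to accuracy, which is consistent with micro-averaging in a single-label classification task. On the Coscan Speech test set, accuracy is considerably lower at 0.6155 (see the classification report).
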
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5

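
To make the `linear` scheduler and the 0.1 warmup ratio concrete, here is a minimal sketch of the learning-rate curve they imply. It assumes the 32340 total steps listed in the training results table and that warmup covers the ratio applied to that total. The helper `lr_at_step` is hypothetical and only mirrors the usual linear warmup and decay shape; it is not the Trainer's own code.

```python
# Illustrative sketch of a linear learning-rate schedule with warmup.
BASE_LR = 3e-05        # learning_rate from the hyperparameter list
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio
TOTAL_STEPS = 32340    # assumption: final step count in the training results table

# Assumption: warmup length is the warmup ratio applied to the total step count.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at_step(step: int) -> float:
    """Return the learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * (step / WARMUP_STEPS)
    # Linear decay from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * (remaining / (TOTAL_STEPS - WARMUP_STEPS))


if __name__ == "__main__":
    for step in (0, WARMUP_STEPS // 2, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {lr_at_step(step):.2e}")
```

Under these assumptions the peak learning rate is reached about a tenth of the way through training, that is, roughly halfway through the first epoch.
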
### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 0.1651        | 1.0   | 6468  | 0.5657          | 0.8650   | 0.8650 | 0.8650    | 0.8650 |
| 0.1217        | 2.0   | 12936 | 0.9411          | 0.8487   | 0.8487 | 0.8487    | 0.8487 |
| 0.0013        | 3.0   | 19404 | 0.9991          | 0.8617   | 0.8617 | 0.8617    | 0.8617 |
| 0.0652        | 4.0   | 25872 | 1.0151          | 0.8773   | 0.8773 | 0.8773    | 0.8773 |
| 0.0001        | 5.0   | 32340 | 1.1031          | 0.8700   | 0.8700 | 0.8700    | 0.8700 |

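
In the table, validation loss rises after epoch 1 while accuracy peaks at epoch 4, so which checkpoint counts as "best" depends on the criterion. The following sketch shows this with the rows copied from the table. The names `EPOCH_LOG` and `best_epoch` are hypothetical and are not part of the training script.

```python
# Rows copied from the training results table:
# (epoch, step, validation_loss, accuracy)
EPOCH_LOG = [
    (1.0, 6468, 0.5657, 0.8650),
    (2.0, 12936, 0.9411, 0.8487),
    (3.0, 19404, 0.9991, 0.8617),
    (4.0, 25872, 1.0151, 0.8773),
    (5.0, 32340, 1.1031, 0.8700),
]


def best_epoch(log, criterion):
    """Pick the row that is best under the given criterion (hypothetical helper)."""
    if criterion == "accuracy":
        # Higher accuracy is better.
        return max(log, key=lambda row: row[3])
    if criterion == "loss":
        # Lower validation loss is better.
        return min(log, key=lambda row: row[2])
    raise ValueError(f"unknown criterion: {criterion!r}")


if __name__ == "__main__":
    for criterion in ("accuracy", "loss"):
        epoch, step, val_loss, acc = best_epoch(EPOCH_LOG, criterion)
        print(f"best by {criterion}: epoch {epoch}, step {step}, "
              f"val_loss={val_loss}, accuracy={acc}")
```

The evaluation results reported at the top of this card match the accuracy criterion (epoch 4). The card does not state which checkpoint was actually saved.
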
### Classification report on Coscan Speech (test set)

```
                         precision    recall  f1-score   support

Bergen og Ytre Vestland       0.65      0.97      0.78      1809
     Hedmark og Oppland       0.12      0.06      0.08      2302
               Nordland       0.97      0.47      0.63      2195
           Oslo-området       0.78      0.42      0.55      6957
               Sunnmøre       0.94      0.71      0.81      2636
         Sør-Vestlandet       0.96      0.46      0.62      2860
              Sørlandet       0.62      0.81      0.70      2490
                  Troms       0.67      1.00      0.80      2867
              Trøndelag       0.52      0.94      0.67      2666
         Voss og omland       0.70      0.71      0.71      2641
         Ytre Oslofjord       0.20      0.49      0.29      1678

               accuracy                           0.62     31101
              macro avg       0.65      0.64      0.60     31101
           weighted avg       0.68      0.62      0.61     31101
```

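
The `macro avg` and `weighted avg` rows of the report can be recomputed from the per-class rows, which also shows how the two averages differ. This is a minimal sketch using the two-decimal values copied from the report. The names `REPORT`, `macro_avg`, and `weighted_avg` are hypothetical, and results can differ from the report in the last digit because the inputs are already rounded.

```python
# Per-class rows copied from the report: label -> (precision, recall, f1, support).
REPORT = {
    "Bergen og Ytre Vestland": (0.65, 0.97, 0.78, 1809),
    "Hedmark og Oppland": (0.12, 0.06, 0.08, 2302),
    "Nordland": (0.97, 0.47, 0.63, 2195),
    "Oslo-området": (0.78, 0.42, 0.55, 6957),
    "Sunnmøre": (0.94, 0.71, 0.81, 2636),
    "Sør-Vestlandet": (0.96, 0.46, 0.62, 2860),
    "Sørlandet": (0.62, 0.81, 0.70, 2490),
    "Troms": (0.67, 1.00, 0.80, 2867),
    "Trøndelag": (0.52, 0.94, 0.67, 2666),
    "Voss og omland": (0.70, 0.71, 0.71, 2641),
    "Ytre Oslofjord": (0.20, 0.49, 0.29, 1678),
}


def macro_avg(column: int) -> float:
    """Unweighted mean of one metric column across all classes."""
    values = [row[column] for row in REPORT.values()]
    return sum(values) / len(values)


def weighted_avg(column: int) -> float:
    """Support-weighted mean of one metric column across all classes."""
    total_support = sum(row[3] for row in REPORT.values())
    weighted_sum = sum(row[column] * row[3] for row in REPORT.values())
    return weighted_sum / total_support


if __name__ == "__main__":
    for name, column in (("precision", 0), ("recall", 1), ("f1-score", 2)):
        print(f"{name:>9}: macro={macro_avg(column):.2f} "
              f"weighted={weighted_avg(column):.2f}")
```

The support-weighted recall is the same quantity as overall accuracy, so recomputing it from the rounded rows lands close to the reported test accuracy of 0.6155. The macro average treats every region equally, which is why the weak `Hedmark og Oppland` and `Ytre Oslofjord` classes pull the macro F1 down.
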
### Framework versions

- Transformers 4.22.0.dev0
- PyTorch 1.10.1+cu102
- Datasets 2.4.1.dev0
- Tokenizers 0.12.1