This is a duplicate of the work of
wav2vec2-base-Speech_Emotion_Recognition.
Only minor changes were made so that it runs successfully on Google Colab.
My version of the metrics:
| Epoch | Training Loss | Validation Loss | Accuracy | Weighted F1 | Micro F1 | Macro F1 | Weighted Recall | Micro Recall | Macro Recall | Weighted Precision | Micro Precision | Macro Precision |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1.789200 | 1.548816 | 0.382590 | 0.287415 | 0.382590 | 0.289045 | 0.382590 | 0.382590 | 0.379768 | 0.473585 | 0.382590 | 0.467116 |
1 | 1.789200 | 1.302810 | 0.529823 | 0.511868 | 0.529823 | 0.511619 | 0.529823 | 0.529823 | 0.523766 | 0.552868 | 0.529823 | 0.560496 |
2 | 1.789200 | 1.029921 | 0.672757 | 0.668108 | 0.672757 | 0.669246 | 0.672757 | 0.672757 | 0.676383 | 0.674857 | 0.672757 | 0.673698 |
3 | 1.789200 | 0.968154 | 0.677055 | 0.671986 | 0.677055 | 0.674074 | 0.677055 | 0.677055 | 0.676891 | 0.701300 | 0.677055 | 0.705734 |
4 | 1.789200 | 0.850912 | 0.717894 | 0.714321 | 0.717894 | 0.716527 | 0.717894 | 0.717894 | 0.722476 | 0.716772 | 0.717894 | 0.716698 |
5 | 1.789200 | 0.870916 | 0.710371 | 0.706013 | 0.710371 | 0.708563 | 0.710371 | 0.710371 | 0.713853 | 0.710966 | 0.710371 | 0.712245 |
6 | 1.789200 | 0.827148 | 0.729178 | 0.725336 | 0.729178 | 0.726744 | 0.729178 | 0.729178 | 0.732127 | 0.735935 | 0.729178 | 0.736041 |
7 | 1.789200 | 0.798354 | 0.729715 | 0.727086 | 0.729715 | 0.728847 | 0.729715 | 0.729715 | 0.732476 | 0.729932 | 0.729715 | 0.730688 |
8 | 1.789200 | 0.799373 | 0.735626 | 0.732981 | 0.735626 | 0.735058 | 0.735626 | 0.735626 | 0.738147 | 0.741482 | 0.735626 | 0.742782 |
9 | 1.789200 | 0.810692 | 0.728103 | 0.724754 | 0.728103 | 0.726852 | 0.728103 | 0.728103 | 0.731083 | 0.731919 | 0.728103 | 0.732869 |
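The table reports three averages (weighted, micro, macro) for each metric. As a reference, here is a minimal pure-Python sketch of how they differ for single-label multiclass data (an illustration with made-up labels, not the project's training code):

```python
# Illustrative sketch (not the project's code): per-class F1 plus the
# micro/macro/weighted averages reported in the table above.
from collections import Counter

def f1_scores(y_true, y_pred):
    """Per-class F1 and micro/macro/weighted averages for single-label data."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    per_class = {}
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        per_class[c] = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    n = len(y_true)
    macro = sum(per_class.values()) / len(labels)          # classes weighted equally
    weighted = sum(per_class[c] * support[c] / n for c in labels)  # by class support
    # For single-label multiclass data, micro F1 collapses to plain accuracy.
    micro = sum(t == p for t, p in zip(y_true, y_pred)) / n
    return per_class, micro, macro, weighted

# Toy example (made-up labels, chosen only to show the three averages diverge):
y_true = ["angry", "angry", "happy", "sad", "sad", "sad"]
y_pred = ["angry", "happy", "happy", "sad", "sad", "angry"]
per_class, micro, macro, weighted = f1_scores(y_true, y_pred)
```

This is why the Accuracy, Micro F1, Micro Recall, and Micro Precision columns above are identical: micro averaging pools all decisions into one count, while macro treats every class equally and weighted scales each class by its support.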
Final evaluation: Num examples = 1861, Batch size = 32 [59/59 08:38]
{'eval_loss': 0.8106924891471863,
'eval_accuracy': 0.7281031703385277,
'eval_Weighted F1': 0.7247543780750472,
'eval_Micro F1': 0.7281031703385277,
'eval_Macro F1': 0.7268519957485492,
'eval_Weighted Recall': 0.7281031703385277,
'eval_Micro Recall': 0.7281031703385277,
'eval_Macro Recall': 0.7310833557439055,
'eval_Weighted Precision': 0.7319188411210771,
'eval_Micro Precision': 0.7281031703385277,
'eval_Macro Precision': 0.732869407033253,
'eval_runtime': 83.3066,
'eval_samples_per_second': 22.339,
'eval_steps_per_second': 0.708,
'epoch': 9.98}
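The throughput figures in the output above are internally consistent; a quick arithmetic check (the values are copied from the eval output, while the formulas are my assumption about how they relate, not Trainer internals):

```python
# Sanity check of the evaluation throughput figures reported above.
num_examples = 1861
batch_size = 32
eval_runtime = 83.3066  # seconds

steps = -(-num_examples // batch_size)            # ceil(1861 / 32) = 59, matching [59/59]
samples_per_second = num_examples / eval_runtime  # ~22.34, matching eval_samples_per_second
steps_per_second = steps / eval_runtime           # ~0.708, matching eval_steps_per_second
```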
Model description
This model predicts the emotion of the person speaking in the audio sample.
For more information on how it was created, check out the following link: https://github.com/DunnBC22/Vision_Audio_and_Multimodal_Projects/tree/main/Audio-Projects/Emotion%20Detection/Speech%20Emotion%20Detection
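The card does not include usage code. A minimal sketch using the standard transformers audio-classification pipeline; the model id and file name below are placeholders I made up, not values from this card:

```python
# Hedged sketch: load a fine-tuned checkpoint and classify one audio file.
# "user/wav2vec2-base-speech-emotion" and "sample.wav" are placeholders.
from transformers import pipeline

classifier = pipeline("audio-classification", model="user/wav2vec2-base-speech-emotion")
predictions = classifier("sample.wav")  # list of {"label": ..., "score": ...} dicts
print(predictions[0]["label"])          # highest-scoring emotion
```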
Training and evaluation data
Dataset Source: https://www.kaggle.com/datasets/dmitrybabko/speech-emotion-recognition-en