During the first lesson of the Practical Deep Learning for Coders course, Jeremy mentioned that, with a bit of creativity, a simple computer vision model can be turned into a state-of-the-art audio classifier by reusing the same image classification approach. I was curious how I could train a music classifier, as I had never worked with audio data before.
[You can find how I trained this music genre classifier using fast.ai](https://kurianbenoy.com/ml-blog/fastai/fastbook/2022/05/01/AudioCNNDemo.html).
## Dataset
1. [The competition data](https://www.kaggle.com/competitions/kaggle-pog-series-s01e02/data)
2. [Image data generated by converting the audio to mel spectrogram images](https://www.kaggle.com/datasets/dienhoa/music-genre-spectrogram-pogchamps)
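
For readers new to audio, the conversion behind the second dataset can be sketched roughly as follows. This is a minimal illustration, not the script that produced the linked dataset: it assumes `librosa` and `matplotlib` are installed, and the function names, the `n_mels=128` setting, and the colour map are hypothetical choices made for this example.

```python
from pathlib import Path


def spectrogram_path(audio_path, out_dir):
    """Map an audio file to the PNG path its spectrogram will be saved to."""
    return Path(out_dir) / (Path(audio_path).stem + ".png")


def audio_to_melspectrogram_image(audio_path, out_dir, n_mels=128):
    """Save a mel spectrogram of `audio_path` as a PNG and return its path.

    Hypothetical helper: parameters are illustrative, not the dataset's settings.
    """
    # Third-party imports are kept local so the path helper works without them.
    import librosa
    import matplotlib.pyplot as plt
    import numpy as np

    # Load the waveform at its native sampling rate.
    y, sr = librosa.load(audio_path, sr=None)
    # Power spectrogram on the mel frequency scale, shape (n_mels, frames).
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    # Convert power to decibels so quiet and loud regions are both visible.
    mel_db = librosa.power_to_db(mel, ref=np.max)

    out_path = spectrogram_path(audio_path, out_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # origin="lower" puts low frequencies at the bottom of the image.
    plt.imsave(out_path, mel_db, cmap="magma", origin="lower")
    return out_path
```

Once every clip is an image, the problem looks like ordinary image classification, which is what lets a standard vision model be reused unchanged.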
## Training
Fast.ai was used to train this classifier with a ResNet50 vision learner for 10 epochs.
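
A training setup along these lines might look like the sketch below. It is an approximation under stated assumptions, not the exact notebook code: the folder layout (one sub-folder per genre), the 20% validation split, the seed, and the 224-pixel resize are all hypothetical. Note that `fine_tune` first trains one epoch with the pretrained body frozen and then the requested number of epochs unfrozen, which would be consistent with the single-epoch log followed by the ten-epoch log shown here.

```python
def train_genre_classifier(image_dir, epochs=10, valid_pct=0.2, seed=42):
    """Fine-tune a ResNet50 on spectrogram images stored one folder per genre.

    Hypothetical sketch: the split, seed, and resize are assumptions.
    """
    # fastai is imported lazily so defining this function does not require it.
    from fastai.vision.all import (
        ImageDataLoaders,
        Resize,
        error_rate,
        resnet50,
        vision_learner,
    )

    # Labels are taken from the parent folder name of each image.
    dls = ImageDataLoaders.from_folder(
        image_dir, valid_pct=valid_pct, seed=seed, item_tfms=Resize(224)
    )
    # ImageNet-pretrained ResNet50 with a new classification head.
    learn = vision_learner(dls, resnet50, metrics=error_rate)
    # One frozen epoch, then `epochs` unfrozen epochs.
    learn.fine_tune(epochs)
    return learn
```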
| epoch | train_loss | valid_loss | error_rate | time |
|---|---|---|---|---|
| 0 | 2.869285 | 2.171426 | 0.616428 | 01:43 |

| epoch | train_loss | valid_loss | error_rate | time |
|---|---|---|---|---|
| 0 | 2.312176 | 1.843815 | 0.558654 | 02:07 |
| 1 | 2.102361 | 1.719162 | 0.539061 | 02:08 |
| 2 | 1.867139 | 1.623988 | 0.527003 | 02:08 |
| 3 | 1.710557 | 1.527913 | 0.507661 | 02:07 |
| 4 | 1.629478 | 1.456836 | 0.479779 | 02:05 |
| 5 | 1.519305 | 1.433036 | 0.474253 | 02:05 |
| 6 | 1.457465 | 1.379757 | 0.464456 | 02:05 |
| 7 | 1.396283 | 1.369344 | 0.457925 | 02:05 |
| 8 | 1.359388 | 1.367973 | 0.453655 | 02:05 |
| 9 | 1.364363 | 1.368887 | 0.456167 | 02:04 |
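
To read these numbers as accuracy: `error_rate` is the fraction of validation items that were misclassified, so accuracy is simply `1 - error_rate`. The short snippet below works that out from the ten-epoch log above; the variable names are just for illustration.

```python
# error_rate for epochs 0-9 of the ten-epoch log above.
error_rates = [
    0.558654, 0.539061, 0.527003, 0.507661, 0.479779,
    0.474253, 0.464456, 0.457925, 0.453655, 0.456167,
]

# Accuracy is the complement of the error rate.
accuracies = [1.0 - e for e in error_rates]

# Epoch with the lowest validation error.
best_epoch = min(range(len(error_rates)), key=lambda i: error_rates[i])

print(f"best epoch: {best_epoch}, accuracy: {accuracies[best_epoch]:.4f}")
print(f"final epoch accuracy: {accuracies[-1]:.4f}")
```

The last two epochs are nearly identical on both validation loss and error rate, which suggests the model had largely stopped improving by the end of the run.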
## Examples
The example images provided in the demo come from the validation portion of the Kaggle competition data, which was not used during training.