During the first lesson of the Practical Deep Learning for Coders course, Jeremy mentioned that, with a bit of creativity, a simple computer vision model can be used to build a state-of-the-art audio classifier: the same image classification model is applied to audio represented as images. I was curious how I could train a music classifier, as I had never worked with audio data before.


[You can find how I trained this music genre classifier using fast.ai](https://kurianbenoy.com/ml-blog/fastai/fastbook/2022/05/01/AudioCNNDemo.html).

## Dataset

1. [The competition data](https://www.kaggle.com/competitions/kaggle-pog-series-s01e02/data)
2. [Image data generated by converting the audio to mel spectrograms saved as images](https://www.kaggle.com/datasets/dienhoa/music-genre-spectrogram-pogchamps)
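
For readers new to audio data, here is a minimal sketch of how one audio clip could be turned into a mel spectrogram image. It is not the code used to build the dataset linked above: the function name `audio_to_mel_image`, the file paths, the `n_mels=128` setting, and the colormap are all assumptions chosen for illustration, and it relies on `librosa`, `numpy`, and `matplotlib` being installed.

```python
def audio_to_mel_image(audio_path, image_path, n_mels=128):
    """Save a mel spectrogram of one audio file as an image (illustrative sketch)."""
    # Imports live inside the function so the module loads even without these packages.
    import numpy as np
    import librosa
    import matplotlib
    matplotlib.use("Agg")  # render without a display
    import matplotlib.pyplot as plt

    # Load the waveform at its native sampling rate.
    y, sr = librosa.load(audio_path, sr=None)

    # Power mel spectrogram with shape (n_mels, frames).
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)

    # Convert power to decibels so quieter details remain visible.
    mel_db = librosa.power_to_db(mel, ref=np.max)

    # Write the array as an image, with low frequencies at the bottom.
    plt.imsave(image_path, mel_db, origin="lower", cmap="magma")
```

A call such as `audio_to_mel_image("clip.ogg", "clip.png")` (hypothetical file names) would produce one image per clip, which an image classifier can then consume.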


## Training

Fast.ai was used to train this classifier with a ResNet50 vision learner for 10 epochs.
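
The training setup can be sketched roughly as follows. This is an assumption-laden illustration, not the exact notebook code: the function name `train_genre_classifier`, the one-folder-per-genre layout, the 20% validation split, and the 224-pixel resize are all hypothetical. The log below, one epoch followed by ten, is consistent with `fine_tune(10)` and its default single frozen epoch, but that too is an inference.

```python
def train_genre_classifier(image_dir, epochs=10):
    """Fine-tune a ResNet50 on spectrogram images (illustrative sketch)."""
    # Imported lazily so this module loads even when fastai is not installed.
    from fastai.vision.all import (
        ImageDataLoaders, Resize, vision_learner, resnet50, error_rate,
    )

    # Assumed layout: image_dir/<genre_name>/<image files>.
    dls = ImageDataLoaders.from_folder(
        image_dir,
        valid_pct=0.2,          # hold out 20% of images for validation (assumption)
        seed=42,                # make the split reproducible
        item_tfms=Resize(224),  # resize each image to 224x224 (assumption)
    )

    # ImageNet-pretrained ResNet50 with a new classification head.
    learn = vision_learner(dls, resnet50, metrics=error_rate)

    # Trains the head for one frozen epoch, then all layers for `epochs` epochs.
    learn.fine_tune(epochs)
    return learn
```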

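| epoch | train_loss | valid_loss | error_rate | time |
|-------|------------|------------|------------|-------|
| 0 | 2.869285 | 2.171426 | 0.616428 | 01:43 |

| epoch | train_loss | valid_loss | error_rate | time |
|-------|------------|------------|------------|-------|
| 0 | 2.312176 | 1.843815 | 0.558654 | 02:07 |
| 1 | 2.102361 | 1.719162 | 0.539061 | 02:08 |
| 2 | 1.867139 | 1.623988 | 0.527003 | 02:08 |
| 3 | 1.710557 | 1.527913 | 0.507661 | 02:07 |
| 4 | 1.629478 | 1.456836 | 0.479779 | 02:05 |
| 5 | 1.519305 | 1.433036 | 0.474253 | 02:05 |
| 6 | 1.457465 | 1.379757 | 0.464456 | 02:05 |
| 7 | 1.396283 | 1.369344 | 0.457925 | 02:05 |
| 8 | 1.359388 | 1.367973 | 0.453655 | 02:05 |
| 9 | 1.364363 | 1.368887 | 0.456167 | 02:04 |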
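
Since `error_rate` is the fraction of misclassified validation images, accuracy is simply `1 - error_rate`. The short sketch below copies the figures from the 10-epoch log above and reports the best and final validation accuracy; the variable and function names are only illustrative.

```python
# (epoch, train_loss, valid_loss, error_rate) copied from the 10-epoch log
LOG = [
    (0, 2.312176, 1.843815, 0.558654),
    (1, 2.102361, 1.719162, 0.539061),
    (2, 1.867139, 1.623988, 0.527003),
    (3, 1.710557, 1.527913, 0.507661),
    (4, 1.629478, 1.456836, 0.479779),
    (5, 1.519305, 1.433036, 0.474253),
    (6, 1.457465, 1.379757, 0.464456),
    (7, 1.396283, 1.369344, 0.457925),
    (8, 1.359388, 1.367973, 0.453655),
    (9, 1.364363, 1.368887, 0.456167),
]


def accuracy(error_rate):
    """Validation accuracy is the complement of the error rate."""
    return 1.0 - error_rate


# Pick the epoch with the lowest validation error rate.
best_epoch, _, _, best_error = min(LOG, key=lambda row: row[3])
final_error = LOG[-1][3]

print(f"best epoch: {best_epoch}, accuracy: {accuracy(best_error):.4f}")   # best epoch: 8, accuracy: 0.5463
print(f"final epoch accuracy: {accuracy(final_error):.4f}")                # final epoch accuracy: 0.5438
```

The model therefore ends at roughly 54% validation accuracy, with the error rate levelling off over the last three epochs.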

## Examples

The example images provided in the demo come from the validation split of the Kaggle competition data, which was not used during training.