Python K-means Music Genre Classification

Music Genre Classification: automatically classify different musical genres.

In this tutorial we will learn how to automatically classify audio files into musical genres. We will classify the files using low-level features from the frequency and time domains.

For this project we need a dataset of audio tracks having similar size and similar frequency range.

The GTZAN genre collection is the most commonly recommended dataset for music genre classification; it was collected specifically for this task.

About the dataset:

The GTZAN genre collection dataset was collected in 2000-2001. It consists of 1000 audio files, each 30 seconds long. There are 10 classes (10 music genres), each containing 100 audio tracks. Each track is in .wav format. It contains audio files of the following 10 genres:

blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, and rock.


Music Genre Classification approach:

There are various methods to perform classification on this dataset. Some of these approaches are:

     1. Multiclass support vector machines

     2. K-means clustering

     3. K-nearest neighbors

     4. Convolutional neural networks

We will use the K-nearest neighbors algorithm because several studies report that it gives the best results for this problem.

K-Nearest Neighbors is a popular machine learning algorithm for regression and classification. It makes predictions for a data point based on a similarity measure, i.e. the distance between that point and the labelled training points: the K closest training points vote on the label.
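To make the idea concrete, here is a minimal sketch of K-nearest neighbors on made-up 2-D points. The function name knn_predict, the toy data and the use of plain Euclidean distance are assumptions for illustration only; the classifier built later in this tutorial uses a different distance on MFCC statistics.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Return the majority label among the k training points closest to query.

    train is a list of (point, label) pairs; points are equal-length tuples.
    """
    # Sort the training points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data, not real GTZAN features.
train = [
    ((1.0, 1.0), "jazz"),
    ((1.2, 0.8), "jazz"),
    ((0.9, 1.1), "jazz"),
    ((5.0, 5.0), "metal"),
    ((5.2, 4.9), "metal"),
    ((4.8, 5.1), "metal"),
]
prediction = knn_predict(train, (1.1, 1.0), k=3)
print(prediction)
```

The query point sits inside the first cluster, so all three of its nearest neighbors carry the same label.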

Feature Extraction:

The first step in a music genre classification project is to extract features and components from the audio files. This means identifying the informative content of the signal (the linguistic content, in speech applications) and discarding noise.

Mel Frequency Cepstral Coefficients:

These are state-of-the-art features used in automatic speech and speaker recognition studies. They are generated in the following steps:

     1. Since audio signals are constantly changing, first divide the signal into short frames. Each frame is around 20-40 ms long.

     2. Identify the different frequencies present in each frame.

     3. Separate the linguistic frequencies from the noise.

     4. To discard the noise, take the discrete cosine transform (DCT) of these frequencies. With the DCT we keep only the sequence of coefficients that has a high probability of carrying information.
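The steps above can be sketched with the standard library alone. This is a simplified illustration, not what python_speech_features does internally: it computes plain cepstral coefficients and leaves out the windowing and the mel filter bank that a real MFCC pipeline applies between steps 2 and 4. All function names and the synthetic sine-wave signal are assumptions made for the example.

```python
import cmath
import math

def frame_signal(signal, frame_len, step):
    """Step 1: split the signal into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, step)]

def power_spectrum(frame):
    """Step 2: naive DFT power spectrum of one frame (non-negative bins)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum

def dct2(values, num_coeffs):
    """Step 4: unnormalized DCT-II, keeping the first num_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(values))
            for k in range(num_coeffs)]

def cepstral_features(signal, frame_len, step, num_coeffs=13):
    """Return one list of cepstral coefficients per frame."""
    feats = []
    for frame in frame_signal(signal, frame_len, step):
        # The small constant avoids log(0) for empty frequency bins.
        log_spec = [math.log(p + 1e-10) for p in power_spectrum(frame)]
        feats.append(dct2(log_spec, num_coeffs))
    return feats

# Synthetic 440 Hz tone, 0.1 s at 8000 Hz (a stand-in for real audio).
rate = 8000
signal = [math.sin(2 * math.pi * 440 * t / rate) for t in range(800)]
# 200 samples = 25 ms frames, 80 samples = 10 ms step.
features = cepstral_features(signal, frame_len=200, step=80)
print(len(features), len(features[0]))
```

Each frame ends up as a short, fixed-length vector, which is what makes these features convenient inputs for a classifier.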

Steps to build Music Genre Classification:

Download the GTZAN dataset 

Create a new Python file and paste in the code described in the steps below:

1. Imports:

    from python_speech_features import mfcc
    import scipy.io.wavfile as wav
    import numpy as np
    from tempfile import TemporaryFile
    import os
    import pickle
    import random
    import operator
    import math

2. Define a function to get the distance between feature vectors and find neighbors:

    def getNeighbors(trainingSet, instance, k):
        # NOTE: distance() is not defined in this article and must be supplied separately.
        distances = []
        for x in range(len(trainingSet)):
            # Add both directions so the measure is symmetric
            dist = distance(trainingSet[x], instance, k) + distance(instance, trainingSet[x], k)
            # Store (class label, distance) for every training sample
            distances.append((trainingSet[x][2], dist))
        distances.sort(key=operator.itemgetter(1))
        neighbors = []
        for x in range(k):
            neighbors.append(distances[x][0])
        return neighbors
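getNeighbors calls a distance function that this article does not show. Since each instance is a (mean vector, covariance matrix, label) tuple, one plausible definition is a KL-divergence-style measure between two Gaussians. The sketch below is an assumption along those lines, not the author's confirmed code; in particular, subtracting k as a constant offset simply mirrors the distance(a, b, k) call signature and does not change the ranking of neighbors.

```python
import numpy as np

def distance(instance1, instance2, k):
    """KL-divergence-style distance between two (mean, covariance, label) tuples.

    Assumption: k is only a constant offset, kept to match the call in getNeighbors.
    """
    mm1, cm1 = instance1[0], instance1[1]
    mm2, cm2 = instance2[0], instance2[1]
    inv_cm2 = np.linalg.inv(cm2)
    diff = mm2 - mm1
    dist = np.trace(inv_cm2 @ cm1)      # how well the covariances match
    dist += diff @ inv_cm2 @ diff       # Mahalanobis-style gap between the means
    dist += np.log(np.linalg.det(cm2)) - np.log(np.linalg.det(cm1))
    dist -= k
    return dist

# Two hypothetical 2-D instances with identity covariance.
a = (np.array([0.0, 0.0]), np.eye(2), 1)
b = (np.array([3.0, 4.0]), np.eye(2), 2)
same = distance(a, a, 2)
apart = distance(a, b, 2)
print(same, apart)
```

With identity covariances the measure reduces to the squared Euclidean distance between the means, which makes it easy to sanity-check by hand.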

3. Identify the nearest neighbors:
    def nearestClass(neighbors):
        # Count how many neighbors vote for each class
        classVote = {}
        for x in range(len(neighbors)):
            response = neighbors[x]
            if response in classVote:
                classVote[response] += 1
            else:
                classVote[response] = 1
        # Sort classes by vote count, highest first, and return the winner
        sorter = sorted(classVote.items(), key=operator.itemgetter(1), reverse=True)
        return sorter[0][0]
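The same majority vote can be written more compactly with collections.Counter from the standard library. This is an alternative sketch, not the article's code; the name nearest_class and the sample labels are made up for the example.

```python
from collections import Counter

def nearest_class(neighbors):
    """Return the most frequent label in a list of neighbor labels."""
    return Counter(neighbors).most_common(1)[0][0]

# Hypothetical labels of the 5 nearest neighbors.
winner = nearest_class([3, 7, 3, 1, 3])
print(winner)
```

Both versions break ties in favor of the label that appears first among the neighbors.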

4. Define a function for model evaluation:

    def getAccuracy(testSet, predictions):
        # Fraction of test samples whose label (last tuple element) was predicted correctly
        correct = 0
        for x in range(len(testSet)):
            if testSet[x][-1] == predictions[x]:
                correct += 1
        return 1.0 * correct / len(testSet)

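5. Extract features from the dataset and dump these features into a binary .dat file "my.dat":

    directory = "__path_to_dataset__"
    # Store one (mean, covariance, label) tuple per track in a binary file
    with open("my.dat", "wb") as f:
        i = 0
        for folder in os.listdir(directory):
            i += 1          # integer label for the current genre folder
            if i == 11:     # stop after the 10 genre folders
                break
            for file in os.listdir(os.path.join(directory, folder)):
                (rate, sig) = wav.read(os.path.join(directory, folder, file))
                mfcc_feat = mfcc(sig, rate, winlen=0.020, appendEnergy=False)
                covariance = np.cov(np.matrix.transpose(mfcc_feat))
                mean_matrix = mfcc_feat.mean(0)
                feature = (mean_matrix, covariance, i)
                pickle.dump(feature, f)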
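It may help to see what the mean and covariance summary looks like. The sketch below uses random numbers in place of a real MFCC matrix (an assumption for illustration; 13 coefficients per frame is the python_speech_features default) and shows that a track with any number of frames is reduced to a fixed-size description.

```python
import numpy as np

# Stand-in for mfcc(sig, rate, ...): 100 frames x 13 coefficients of random data.
rng = np.random.default_rng(0)
mfcc_feat = rng.normal(size=(100, 13))

# Average each coefficient over all frames: one 13-element vector per track.
mean_matrix = mfcc_feat.mean(0)

# Covariance between coefficients: np.cov expects variables in rows, hence the transpose.
covariance = np.cov(mfcc_feat.T)

print(mean_matrix.shape, covariance.shape)
```

Whatever the track length, the result is a 13-element mean vector and a 13 x 13 covariance matrix, so every track can be compared with every other.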



6. Train and test split on the dataset:

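    dataset = []
    def loadDataset(filename, split, trSet, teSet):
        # Read pickled feature tuples until the end of the file
        with open(filename, 'rb') as f:
            while True:
                try:
                    dataset.append(pickle.load(f))
                except EOFError:
                    break
        # Randomly assign each sample to the training or test set
        for x in range(len(dataset)):
            if random.random() < split:
                trSet.append(dataset[x])
            else:
                teSet.append(dataset[x])

    trainingSet = []
    testSet = []
    loadDataset("my.dat", 0.66, trainingSet, testSet)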
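The loading loop relies on the fact that several pickle.dump calls on one file can be read back one object at a time until pickle.load raises EOFError. A minimal sketch of that pattern and of the random split, using an in-memory buffer and made-up records (the helper names load_all and split_dataset are assumptions for this example):

```python
import io
import pickle
import random

def load_all(stream):
    """Read pickled objects from a binary stream until it is exhausted."""
    items = []
    while True:
        try:
            items.append(pickle.load(stream))
        except EOFError:
            break
    return items

def split_dataset(items, split, rng):
    """Send each item to the training set with probability `split`."""
    train, test = [], []
    for item in items:
        (train if rng.random() < split else test).append(item)
    return train, test

# Write three records back to back, as the feature-extraction step does.
buffer = io.BytesIO()
for record in [("a", 1), ("b", 2), ("c", 3)]:
    pickle.dump(record, buffer)
buffer.seek(0)

records = load_all(buffer)
# Repeat the records to get a larger toy dataset, then split roughly 66/34.
train, test = split_dataset(records * 100, 0.66, random.Random(0))
print(len(records), len(train), len(test))
```

Because the split is random, the training set holds roughly 66% of the samples rather than exactly 66%. Passing a seeded random.Random makes the split reproducible.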

7. Make prediction using KNN and get the accuracy on test data:

    leng = len(testSet)
    predictions = []
    for x in range(leng):
        # Predict the genre of each test sample from its 5 nearest neighbors
        predictions.append(nearestClass(getNeighbors(trainingSet, testSet[x], 5)))

    accuracy1 = getAccuracy(testSet, predictions)
    print(accuracy1)


Test the classifier with a new audio file

Save the new audio file in the current directory. Create a new file and paste in the script above.

Now, run this script to get the prediction:



In this music genre classification project, we developed a classifier that predicts the genre of an audio file. We worked through the project on the GTZAN music genre classification dataset. The tutorial explains how to extract important features from audio files, and we implemented a K-nearest neighbors classifier with K set to 5.


Happy Pythoning...!!

Author: Aditi Kothiyal