Python AdaBoost Implementation of AdaBoost














































Python AdaBoost Implementation of AdaBoost




Implementation of AdaBoost Using Python


Step 1: Importing the Modules

As always, the first step in building our model is to import the necessary packages and modules.

In Python we have the AdaBoostClassifier and AdaBoostRegressor classes from the scikit-learn library. For our case we would import AdaBoostClassifier (since our example is a classification task). The train_test_split method is used to split our dataset into training and test sets. We also import datasets, from which we will use the the Iris Dataset.

from sklearn.ensemble import AdaBoostClassifier
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn import metrics

Step 2: Exploring the data

You can use any classification dataset, but here we'll use traditional Iris dataset for a multi-class classification problem. This dataset contains four features about different types of Iris flowers (sepal length, sepal width, petal length, petal width). The target is to predict the type of flower from three possibilities: Setosa, Versicolour, and Virginica. The dataset is available in the scikit-learn library, or you can also download it from the UCI Machine Learning Library.

Next, we make our data ready by loading it from the datasets package using the load_iris() method. We assign the data to the iris variable.

Further, we split our dataset into input variable X, which contains the features sepal length, sepal width, petal length, and petal width.

Y is our target variable, or the class that we have to predict: either Iris Setosa, Iris Versicolour, or Iris Virginica. Below is an example of what our data looks like.

iris = datasets.load_iris()
X = iris.data
y = iris.target
print(X)
print(Y)

Output:

[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5.8 4. 1.2 0.2]
[5.7 4.4 1.5 0.4]
. . . .
. . . .
]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2]

Step 3: Splitting the data

Splitting the dataset into training and testing datasets is a good idea to see if our model is classifying the data points correctly on unseen data.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3) 

Here we split our dataset into  70% training and 30% test which is a common scenario.

Step 4: Fitting the Model

Building the AdaBoost Model. AdaBoost takes Decision Tree as its learner model by default. We make an AdaBoostClassifier object and name it abc. Few important parameters of AdaBoost are :

  • base_estimator: It is a weak learner used to train the model.
  • n_estimators: Number of weak learners to train in each iteration.
  • learning_rate: It contributes to the weights of weak learners. It uses 1 as a default value.
abc = AdaBoostClassifier(n_estimators=50,
learning_rate=1)

We then go ahead and fit our object abc to our training dataset. We call it a model.

model = abc.fit(X_train, y_train)

Step 5: Making the Predictions

Our next step would be to see how good or bad our model is to predict our target values.

y_pred = model.predict(X_test)

In this step, we take a sample observation and make a prediction on unseen data. Further, we use the predict() method on the model to check for the class it belongs to.

Step 6: Evaluating the model

The Model accuracy will tell us how many times our model predicts the correct classes.

print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

Output:
Accuracy:0.8666666666666667

You get an accuracy of 86.66% -not bad. You can experiment with various other base learners like Support Vector Machine, Logistic Regression which might give you higher accuracy.

Happy Pythoning


More Articles of Aditi Kothiyal:

Name Views Likes
Python AdaBoost Mathematics Behind AdaBoost 421 1
Python PyCaret How to optimize the probability threshold % in binary classification 2069 0
Python K-means Predicting Iris Flower Species 1322 2
Python PyCaret How to ignore certain columns for model building 2624 0
Python PyCaret Experiment Logging 680 0
Python PyWin32 Open a File in Excel 941 0
Python Guppy GSL Introduction 219 2
Python Usage of Guppy With Example 1101 2
Python Naive Bayes Tutorial 552 2
Python Guppy Recent Memory Usage of a Program 892 2
Introduction to AdaBoost 289 1
Python AdaBoost Implementation of AdaBoost 513 1
Python AdaBoost Advantages and Disadvantages of AdaBoost 3713 1
Python K-Means Clustering Applications 332 2
Python Random Forest Algorithm Decision Trees 439 0
Python K-means Clustering PREDICTING IRIS FLOWER SPECIES 457 1
Python Random Forest Algorithm Bootstrap 476 0
Python PyCaret Util Functions 441 0
Python K-means Music Genre Classification 1763 1
Python PyWin Attach an Excel file to Outlook 1541 0
Python Guppy GSL Document and Test Example 248 2
Python Random Forest Algorithm Bagging 386 0
Python AdaBoost An Example of How AdaBoost Works 279 1
Python PyWin32 Getting Started PyWin32 602 0
Python Naive Bayes in Machine Learning 374 2
Python PyCaret How to improve results from hyperparameter tuning by increasing "n_iter" 1723 0
Python PyCaret Getting Started with PyCaret 2.0 356 1
Python PyCaret Tune Model 1325 1
Python PyCaret Create your own AutoML software 321 0
Python PyCaret Intoduction to PyCaret 296 1
Python PyCaret Compare Models 2696 1
Python PyWin Copying Data into Excel 1153 0
Python Guppy Error: expected function body after function declarator 413 2
Python Coding Random forest classifier using xgBoost 247 0
Python PyCaret How to tune "n parameter" in unsupervised experiments 658 0
Python PyCaret How to programmatically define data types in the setup function 1403 0
Python PyCaret Ensemble Model 805 1
Python Random forest algorithm Introduction 227 0
Python k-means Clustering Example 337 1
Python PyCaret Plot Model 1243 1
Python Hamming Distance 715 0
Python Understanding Random forest algorithm 311 0
Python PyCaret Sort a Dictionary by Keys 244 0
Python Coding Random forest classifier using sklearn 340 0
Python Guppy Introduction 368 2
Python How to use Guppy/Heapy for tracking down Memory Usage 1069 2
Python AdaBoost Summary and Conclusion 232 1
Python PyCaret Create Model 365 1
Python k -means Clusturing Introduction 325 2
Python k-means Clustering With Example 348 2

Comments