### C++ MLPACK :: LogisticRegression

• The previous article in this MLpack with C++ series covered the LinearRegression algorithm.
• This article covers another important machine-learning algorithm available in MLpack: LogisticRegression.

Introduction :

• Logistic Regression is the regression analysis to conduct when the dependent variable is binary, i.e., can belong to one class or the other.

• The following are two typical questions (applications) that Logistic Regression can answer :
• How does the probability of getting lung cancer (yes vs. no) change for every additional pound a person is overweight and for every pack of cigarettes smoked per day?
• Do body weight, calorie intake, fat intake, and age have an influence on the probability of having a heart attack (yes vs. no)?

• The final output of a Logistic Regression classifier is binary (0 or 1), indicating the class to which the given point belongs.
• Just as Linear Regression assumes that the data follows a linear function, Logistic Regression models the probability of the positive class using the sigmoid function.
• The sigmoid function maps any real-valued input to a value in the range (0, 1), which can be read as a probability.

• The mathematical formula for the sigmoid function is `sigmoid(z) = 1 / (1 + e^(-z))`. Its graph is an S-shaped curve that rises from 0 towards 1 and passes through 0.5 at z = 0.
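• As a minimal sketch, the sigmoid function can be written in standard C++ as shown here. The function name `sigmoid` is chosen for illustration only and is not part of the MLpack API.

```cpp
#include <cmath>

// Logistic (sigmoid) function: maps any real z into the open interval (0, 1).
// Illustrative helper, not an MLpack function.
double sigmoid(double z)
{
    return 1.0 / (1.0 + std::exp(-z));
}
```

• For instance, `sigmoid(0.0)` evaluates to 0.5, while large positive inputs approach 1 and large negative inputs approach 0.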

• In MLpack, the LogisticRegression class implements an L2-regularized logistic regression model and supports both training (with multiple optimizers) and classification.
• The class supports different observation types via the MatType template parameter; for instance, logistic regression can be performed on sparse datasets by specifying arma::sp_mat as the MatType parameter.

• LogisticRegression can be used for general classification tasks, but the class is restricted to support only two classes. For multiclass logistic regression, MLpack provides a separate class called SoftmaxRegression (which we will see in some other article).

Model Description :

• LogisticRegression is generally used as a classification technique, because its final output is a binary class label.
• Internally, the model computes a continuous value in the same manner as LinearRegression does. The sigmoid function converts this value into a probability, and comparing that probability against a decision boundary yields the discrete class.
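• The conversion from a continuous score to a discrete class can be sketched in plain C++ as follows. The helper names `predictProbability` and `classifyPoint`, and the use of `std::vector` for weights and points, are assumptions made for this illustration; MLpack itself works on Armadillo matrices.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: probability that `point` belongs to class 1.
// Assumes `point` has at least as many entries as `weights`.
double predictProbability(const std::vector<double>& weights,
                          double bias,
                          const std::vector<double>& point)
{
    // Continuous linear score, as in linear regression.
    double score = bias;
    for (std::size_t i = 0; i < weights.size(); ++i)
        score += weights[i] * point[i];

    // The sigmoid squashes the score into (0, 1).
    return 1.0 / (1.0 + std::exp(-score));
}

// Hypothetical helper: turn the probability into a class label (0 or 1).
std::size_t classifyPoint(const std::vector<double>& weights,
                          double bias,
                          const std::vector<double>& point,
                          double decisionBoundary = 0.5)
{
    return predictProbability(weights, bias, point) >= decisionBoundary ? 1 : 0;
}
```

• Raising the decision boundary makes the sketch more conservative about assigning class 1, while lowering it has the opposite effect.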

Class Description :

• Alongside the class, MLpack provides the mlpack_logistic_regression command-line program, which allows loading a logistic regression model (via the --input_model_file parameter), training a logistic regression model on given training data (specified with the --training_file parameter), or both at once.
• In addition, this program allows classification on a test dataset (specified with the --test_file parameter) and the classification results may be saved with the --predictions_file output parameter.
• The trained logistic regression model may be saved using the --output_model_file output parameter.
• An L2-regularization parameter (lambda), which helps prevent the model from over-fitting, can be specified with the --lambda option.

• As an example, to train a logistic regression model on the data 'data.csv' with labels 'labels.csv' with L2 regularization of 0.1, saving the model to 'lr_model.bin', the following command may be used:
`$ mlpack_logistic_regression --training_file data.csv --labels_file labels.csv --lambda 0.1 --output_model_file lr_model.bin`

• Then, to use that model to predict classes for the dataset 'test.csv', storing the output predictions in 'predictions.csv', the following command may be used:
`$ mlpack_logistic_regression --input_model_file lr_model.bin --test_file test.csv --predictions_file predictions.csv`
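
• To illustrate what the L2 regularization controlled by --lambda does, here is a minimal sketch of a regularized logistic loss in plain C++. It assumes the common textbook form (negative log-likelihood plus `0.5 * lambda * ||w||^2`, with the bias left unpenalized); the exact scaling used inside MLpack may differ, and the name `regularizedLoss` is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: L2-regularized logistic loss over a small dataset.
// Assumes every point has at least as many entries as `weights`
// and that labels are 0 or 1.
double regularizedLoss(const std::vector<std::vector<double>>& points,
                       const std::vector<std::size_t>& labels,
                       const std::vector<double>& weights,
                       double bias,
                       double lambda)
{
    // Negative log-likelihood of the labels under the model.
    double loss = 0.0;
    for (std::size_t n = 0; n < points.size(); ++n)
    {
        double score = bias;
        for (std::size_t i = 0; i < weights.size(); ++i)
            score += weights[i] * points[n][i];

        const double p = 1.0 / (1.0 + std::exp(-score));
        loss -= (labels[n] == 1) ? std::log(p) : std::log(1.0 - p);
    }

    // L2 penalty: grows with the squared size of the weights.
    double penalty = 0.0;
    for (double w : weights)
        penalty += w * w;

    return loss + 0.5 * lambda * penalty;
}
```

• With lambda set to 0 the penalty vanishes; a larger lambda makes large weights more expensive, which is how regularization discourages over-fitting.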

Major Function Description :

• The class provides four constructors; the most widely used one is given below :
 `LogisticRegression(const MatType& predictors, const arma::Row<size_t>& responses, const double lambda = 0)`

The parameter description for the constructor is as follows :
• predictors : Input training variables.
• responses : Outputs resulting from input training variables.
• lambda : L2-regularization parameter.

• Next comes the function that classifies new data-points (the test dataset). The most commonly used overload is :
 `void Classify(const MatType& dataset, arma::Row<size_t>& labels, const double decisionBoundary = 0.5) const`

Parameters :
• dataset : Set of points to classify.
• labels : Predicted labels for each point.
• decisionBoundary : Decision boundary (default 0.5).

• To calculate the accuracy of the model, we compare the actual outputs of the data-points with the outputs that the model predicts. For this purpose, the class provides the following function :
 `double ComputeAccuracy(const MatType& predictors, const arma::Row<size_t>& responses, const double decisionBoundary = 0.5)`

Parameters :
• predictors : Input predictors.
• responses : Vector of responses.
• decisionBoundary : Decision boundary (default 0.5).

• The accuracy is returned as a percentage, between 0 and 100. It returns the percentage of responses that are correctly predicted.
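
• The percentage described above can be sketched in plain C++ as follows. This is an illustrative stand-alone helper (the name `accuracyPercent` is hypothetical), not MLpack's implementation; it assumes the predicted and actual labels are already available as two vectors of equal length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: percentage (0 to 100) of labels predicted correctly.
// Returns 0 when the inputs are empty or their sizes do not match.
double accuracyPercent(const std::vector<std::size_t>& predicted,
                       const std::vector<std::size_t>& actual)
{
    if (predicted.empty() || predicted.size() != actual.size())
        return 0.0;

    std::size_t correct = 0;
    for (std::size_t i = 0; i < predicted.size(); ++i)
        if (predicted[i] == actual[i])
            ++correct;

    return 100.0 * static_cast<double>(correct)
                 / static_cast<double>(predicted.size());
}
```

• For example, if three out of four predicted labels match the actual labels, the helper returns 75.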

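• The required header files and namespaces are as follows :
`#include <mlpack/core.hpp>`
`#include <mlpack/methods/logistic_regression/logistic_regression.hpp>`
`#include <ensmallen.hpp>`
`#include <boost/test/unit_test.hpp>`
`#include "test_tools.hpp"`
`using namespace mlpack;`
`using namespace mlpack::regression;`
`using namespace mlpack::distribution;`
• The `<boost/test/unit_test.hpp>` and `"test_tools.hpp"` headers are needed only when writing unit tests inside MLpack's own test suite; an ordinary application can omit them.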

