In this post, you will learn how to train a classification model using a machine learning algorithm, Logistic Regression, as implemented in scikit-learn.
The code below fits a Logistic Regression model step by step. We will use the Iris data set for training the model.
Loading SkLearn Modules / Classes
First and foremost, we will load the appropriate packages, sklearn modules and classes.
# Importing basic packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Importing sklearn modules and classes
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn import metrics
from sklearn import datasets
from sklearn.model_selection import train_test_split
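As a next step, we will load the dataset and prepare the data. Only two of the four features are used: columns 0 and 2 of iris.data, which are sepal length and petal length.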
iris = datasets.load_iris()
X = iris.data[:, [0, 2]]
Y = iris.target
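To make the column selection less opaque, here is a small sketch that prints the feature names and the shape of the selected data. It only uses attributes of the object returned by load_iris(); the print statements are illustrative additions, not part of the original code.

```python
from sklearn import datasets

iris = datasets.load_iris()

# Column order of iris.data, as reported by scikit-learn
print(iris.feature_names)

# Keep columns 0 and 2 (sepal length and petal length), as in the post
X = iris.data[:, [0, 2]]
Y = iris.target

# Number of samples and number of selected features
print(X.shape)

# Class names corresponding to the integer labels 0, 1, 2 in Y
print(iris.target_names)
```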
Create Training / Test Data
The next step is to split the data into training and test sets. Note the stratify parameter: passing stratify=Y ensures that the class distribution in the training and test splits matches that of the full dataset. With test_size=0.3, 30% of the samples are held out for testing.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=1, stratify=Y)
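One way to check the effect of stratification is to count the class labels in each split. The sketch below repeats the split with the same parameters and uses np.bincount for the counts; the printing is an illustrative addition.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X = iris.data[:, [0, 2]]
Y = iris.target

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=1, stratify=Y
)

# np.bincount counts how many samples carry each class label (0, 1, 2)
print("All data:", np.bincount(Y))
print("Training:", np.bincount(Y_train))
print("Test:    ", np.bincount(Y_test))
```

Because the Iris data set has 50 samples per class, a stratified 70/30 split should leave the three classes equally represented in both parts.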
Perform Feature Scaling
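The next step is to perform feature scaling so that the features are on a comparable scale irrespective of their original values or units. StandardScaler standardizes each feature to zero mean and unit variance. The scaler is fit on the training data only, and the same transformation is then applied to both the training and test data.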
sc = StandardScaler()
sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)
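As a sanity check on what the scaler does, the following sketch compares the transform against the formula (x - mean) / std written out by hand, using the mean_ and scale_ attributes of the fitted scaler. The variable name manual is a hypothetical helper introduced for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris.data[:, [0, 2]]
Y = iris.target
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=1, stratify=Y
)

sc = StandardScaler()
sc.fit(X_train)  # learns per-feature mean and standard deviation from the training data only
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

# Hand-written version of the transform, using the training statistics
manual = (X_test - sc.mean_) / sc.scale_
print(np.allclose(manual, X_test_std))

# After scaling, the training features have mean ~0 and standard deviation ~1
print(X_train_std.mean(axis=0))
print(X_train_std.std(axis=0))
```

The test set is scaled with the training statistics, so its own mean and standard deviation will generally be close to, but not exactly, 0 and 1.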
Train a Logistic Regression Model
The next step is to train a logistic regression model. Note the following when using the LogisticRegression class from sklearn.linear_model:
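- The C parameter is the inverse of the regularization strength. Smaller values of C specify stronger regularization.
- The multi_class parameter is set to 'ovr', which selects the one-vs-rest scheme. The other option is multinomial. Note that scikit-learn 1.5 and later deprecate this parameter; for one-vs-rest, the deprecation message suggests wrapping the estimator in OneVsRestClassifier instead.
- The solver parameter is set to 'lbfgs'. Other solvers that can be used are newton-cg, sag, saga and liblinear.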
# Create an instance of LogisticRegression classifier
lr = LogisticRegression(C=100.0, random_state=1, solver='lbfgs', multi_class='ovr')

# Fit the model
lr.fit(X_train_std, Y_train)
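To illustrate the note about C, here is a minimal sketch that fits the model with three values of C and compares the size of the learned coefficients. It is an illustration under assumptions, not the configuration used in this post: the multi_class argument is left at its default, max_iter=1000 is chosen only to give the solver room to converge, and the dictionary coef_norms is a hypothetical helper.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X = iris.data[:, [0, 2]]
Y = iris.target
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=1, stratify=Y
)
sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)

# Fit one model per C value and record the overall magnitude of its coefficients
coef_norms = {}
for C in (0.01, 1.0, 100.0):
    clf = LogisticRegression(C=C, solver='lbfgs', max_iter=1000)
    clf.fit(X_train_std, Y_train)
    coef_norms[C] = np.linalg.norm(clf.coef_)
    print("C=%g  coefficient norm=%.3f" % (C, coef_norms[C]))
```

Smaller values of C penalize large weights more heavily, so the coefficient norm should shrink as C decreases.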
Measure Model Performance
The next step is to measure the performance of the model trained above, using its accuracy on the test set.
# Create the predictions
Y_predict = lr.predict(X_test_std)

# Use metrics.accuracy_score to measure the score
print("LogisticRegression Accuracy %.3f" % metrics.accuracy_score(Y_test, Y_predict))
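Accuracy is the fraction of predictions that match the true labels. The sketch below shows this on a small set of made-up labels (y_true and y_pred are hypothetical values, not output from the model above) and also prints a confusion matrix, which breaks the errors down by class.

```python
from sklearn import metrics

# Hypothetical labels for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

# 5 of the 6 predictions match the true labels
acc = metrics.accuracy_score(y_true, y_pred)
print("Accuracy %.3f" % acc)

# Rows are true classes, columns are predicted classes
cm = metrics.confusion_matrix(y_true, y_pred)
print(cm)
```

The same two calls can be applied to Y_test and Y_predict to see which Iris classes, if any, the model confuses.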