
In this post, you will learn about the concepts of Support Vector Machine (SVM) with the help of Python code example for building a machine learning classification model. We will work with Python Sklearn package for building the model. As data scientists, it is important to get a good grasp on SVM algorithm and related aspects.
What is Support Vector Machine (SVM)?
Support vector machine (SVM) is a supervised machine learning algorithm that can be used for both classification and regression tasks. At times, SVM for classification is termed as support vector classification (SVC) and SVM for regression is termed as support vector regression (SVR). In this post, we will learn about SVM classifier. The main idea behind SVM classifier is to find a hyperplane that maximally separates the data points of different classes. In other words, we are looking for the largest margin between the two classes. The rationale behind having decision boundaries with large margins is that they tend to have a lower generalization error, whereas models with small margins are more prone to overfitting. Given labeled training data (supervised learning), the SVM classification algorithm outputs an optimal hyperplane which categorizes new examples into different classes. This hyperplane is then used to make predictions on new data points. Support Vector Machine classifier is also termed as maximum margin classifier, meaning that it finds the line or hyperplane that has the largest distance to the nearest training data points (support vectors) of any class. Let’s take and example to understand Support Vector Machine better. Say you have been asked to predict whether a customer will churn or not and you have all their past transaction records as well as demographic information. After exploring the data, you’ve found that there’s not much difference between the average transaction amount of customers who churned and those who didn’t. You also found that most of the customers who churned live relatively far from the city center. Based on these findings, you decided to use Support Vector Machine classification algorithm to build your prediction model. Model trained using SVM classification algorithm will be able to classify the customers as high risk (churned) or otherwise.
There are some key concepts that are important to understand when working with SVMs. First, the data points that are closest to the hyperplane are called support vectors. These points have a direct impact on the position and orientation of the hyperplane. Second, when working with Python SKlearn SVC algorithm, there are three hyperparameters that results in different SVM models (hypothesis): C, gamma and kernel function.
- C is typically used as a regularization parameter to control overfitting, allowing the algorithm to make more accurate predictions on new data points. The strength of the regularization is inversely proportional to C. Simply speaking, C controls the trade-off between maximizing the margin and minimizing training error. If C is set too low, the model will fail to accurately classify data points due to underfitting and can produce inaccurate results; if C is set too high, then the model will have difficulty correctly separating data points due to overfitting and can show poor accuracy scores.
In order to get the best performance from an SVM algorithm, the value of C should be tuned accordingly so that it yields an optimal balance between underfitting and overfitting. This will enable the model to perform better on both training and test datasets, ensuring that it produces accurate results for all types of data points. - When training a support vector machine (SVM) model using Sklearn SVC algorithm, the gamma hyperparameter can take on two special values: ‘scale’, ‘auto’ or ‘float’. When gamma is set to ‘scale’, it means that the value of gamma is calculated as 1 / (number of features * variance in data). This ensures that all features are given equal importance in the model and produces consistent results no matter how many features are used.
- SVM kernel is a mathematical function that is used to map the data points from one space into another, usually higher dimensional space. When training a support vector machine (SVM) model using Sklearn SVC algorithm, the kernel hyperparameter can take on several values: ‘linear’, ‘poly’, ‘rbf’ and ‘sigmoid’.
- When kernel is set to ‘linear’, it means that the model will use a linear boundary for classification and regression. This is the simplest type of SVM and works best when data are linearly separable.
- When kernel is set to ‘poly’, it means that the model will use polynomial functions of degree higher than 1 for classification. This type of SVM is more suitable for complex non-linear datasets.
- When kernel is set to ‘rbf’, it means that the model will use radial basis funcitons for classification or regression. RBF kernels are capable of dealing with complex multi-class datasets and have good generalization performance with noisy data points.
- When kernel is set to ‘sigmoid’, it means that the model will apply sigmoid functions instead of RBFs for classification or regression tasks. Sigmoid kernels tend to be less sensitive than RBFs with respect to outliers in data but may not generalize as well unless their parameters are tuned properly.
As an example, let’s say we have a dataset with two features (x1 and x2) and two classes (0 and 1). We can visualize this data by plotting it in a two-dimensional space, with each point colored according to its class label. Look at the diagram below.
In the above case, we can see that there are different straight lines that can perfectly separate the two classes. However, we can still find a decision boundary that does a pretty good job. This boundary is generated by Support Vector Machine algorithm. Using SVM algorithm, as mentioned above, training the model represents finding the hyperplane (dashed line in the picture below) which separates the data belonging to two different classes by maximum or largest margin. And, the points closest to this hyperplane are called support vectors. Note this in the diagram given below.
The blue square points represent one class and the red dots represent another class. The black line is the decision boundary learned by an SVM. As you can see, the SVM has placed the boundary in such a way as to maximize the margin between the two classes. Here is a visual representation of how SVM classification works. Note how original data set is projected into higher-dimensional space, then a decision boundary is found and finally the decision boundary is projected into original feature space.
Support vector machines are a powerful tool for classification, but like any machine learning algorithm, they require careful tuning of their hyperparameters in order to achieve optimal performance. The most important hyperparameters are the kernel function and the regularization parameters such as C and gamma. The kernel function determines how data points are transformed into higher dimensional space, and the regularization parameter controls the trade-off between model complexity and overfitting. In addition, the Support Vector Machine also has a number of other important hyperparameters that can be adjusted to improve performance, including the maximum number of iterations, the tolerance for error, and the learning rate. By carefully tuning these hyperparameters, it is possible to achieve significantly better performance from a Support Vector Machine. Here are related post on tuning hyperparameters for building an optimal SVM model for classification:
Support vector machine (SVM) Python example
The following steps will be covered for training the model using SVM while using Python code:
- Load the data
- Create training and test split
- Perform feature scaling
- Instantiate an SVC classifier
- Fit the model
- Measure the model performance
First and foremost we will load appropriate Sklearn modules and classes.
# Basic packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Sklearn modules & classes
from sklearn.linear_model import Perceptron, LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn import datasets
from sklearn import metrics
Lets get started with loading the data set and creating the training and test split from the data set. Pay attention to the stratification aspect used when creating the training and test split. The train_test_split class of sklearn.model_selection is used for creating training and test split.
# Load the data set; In this example, the breast cancer dataset is loaded.
bc = datasets.load_breast_cancer()
X = bc.data
y = bc.target
# Create training and test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1, stratify=y)
Next step is to perform feature scaling. The reason for doing feature scaling is to make sure that data for different features are in the same range. The StandardScaler class of sklearn.preprocessing is used.
sc = StandardScaler()
sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)
Next step is to instantiate a SVC (Support Vector Classifier) and fit the model. The SVC class of sklearn.svm module is used.
# Instantiate the Support Vector Classifier (SVC)
svc = SVC(C=1.0, random_state=1, kernel='linear')
# Fit the model
svc.fit(X_train_std, y_train)
Finally, it is time to measure the model performance. Here is the code for doing the same:
# Make the predictions
y_predict = svc.predict(X_test_std)
# Measure the performance
print("Accuracy score %.3f" %metrics.accuracy_score(y_test, y_predict))
The performance of the model will turn out to be 0.953.
Conclusion
In conclusion, Support Vector Machines (SVMs) are powerful learning models that can be used for both classification and regression tasks. SVMs work by mapping data points into higher dimensional feature space, which allows them to capture nonlinear relationships between features and perform complex classifications and regressions. Python provides several libraries for using SVMs efficiently such as scikit-learn and pySVM, making it easy to get started with SVM implementation in Python. By following the example in this guide, you now know more about what SVMs are capable of, how they work, and how to use them in practice.
- K-Fold Cross Validation – Python Example - March 21, 2023
- Positively Skewed Probability Distributions: Examples - March 21, 2023
- Maximum Likelihood Estimation: Concepts, Examples - March 20, 2023
Leave a Reply