Tag Archives: python

How to Convert Sklearn Dataset to Dataframe

In this post, you will learn how to convert sklearn.datasets data to a Pandas DataFrame. This technique is useful if you are comfortable working with Pandas DataFrames, since many inspection and preparation steps are quicker to write against a DataFrame. The sklearn.datasets module includes several datasets, including the following: Iris, Breast cancer, Diabetes, Boston, Linnerud, and Images. The code sample in the post uses the Iris data set. Before looking at the code sample, recall that the Iris dataset, when loaded, holds the features in “data” and the labels in “target”. Executing the code prints the resulting dataframe. In case you don’t want to explicitly assign …
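
The conversion described above can be sketched as follows. This is a minimal illustration, assuming pandas and scikit-learn are installed; the column name "target" is our own choice and is not taken from the original post.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# The feature matrix becomes the columns; feature_names supplies the labels.
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)

# Append the class labels as an extra column (the name "target" is our choice).
df["target"] = iris.target

print(df.head())
```

On recent scikit-learn versions, load_iris(as_frame=True).frame returns a comparable DataFrame without assigning the columns by hand.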

Posted in Data Science, Machine Learning, Python.

Sklearn SVM Classifier using LibSVM – Code Example

In this post, you will learn about the Sklearn LIBSVM implementation used for training an SVM classifier, with a code example. Here is a great guide for learning SVM classification, especially for beginners in the field of data science/machine learning. LIBSVM is a library for Support Vector Machines (SVM) which provides an implementation of the following: C-SVC (Support Vector Classification), nu-SVC, epsilon-SVR (Support Vector Regression), nu-SVR, and distribution estimation (one-class SVM). In this post, you will see code examples for the C-SVC and nu-SVC LIBSVM implementations. I will follow up with code examples for SVR and distribution estimation in future posts. Here are the links to their SKLearn pages for C-SVC and nu-SVC …
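
As a rough sketch of the two classifiers named above, the snippet below fits scikit-learn's SVC (C-SVC) and NuSVC (nu-SVC) on the Iris data. The dataset, kernel and parameter values are assumptions chosen for illustration, not the settings used in the original post.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC, NuSVC

X, y = load_iris(return_X_y=True)

# C-SVC: C sets the penalty for margin violations.
c_svc = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

# nu-SVC: nu in (0, 1] is an upper bound on the fraction of margin errors
# and a lower bound on the fraction of support vectors.
nu_svc = NuSVC(kernel="rbf", nu=0.5, gamma="scale").fit(X, y)

print("C-SVC training accuracy: ", c_svc.score(X, y))
print("nu-SVC training accuracy:", nu_svc.score(X, y))
```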

Posted in AI, Data Science, Machine Learning, Python.

SVM Classifier using Scikit Learn – Code Examples

In this post, you will learn how to train an SVM classifier using the Scikit Learn (SKLearn) implementation, with the help of code examples. Scikit Learn offers different implementations, such as the following, for training an SVM classifier. LIBSVM: LIBSVM is a C/C++ library specialised for SVM. The SVC class is the LIBSVM implementation and can be used to train the SVM classifier (hard/soft margin classifier). Native Python implementation: Scikit Learn provides a Python implementation of the SVM classifier in the form of SGDClassifier, which is based on a stochastic gradient algorithm. LIBSVM SVC Code Example: In this section, the code makes use of the SVC class (from sklearn.svm import SVC) for fitting a model. SVM Python Implementation …
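
The two routes described above can be compared side by side. The sketch below assumes the Iris data and a linear kernel; SGDClassifier with loss="hinge" trains a linear SVM by stochastic gradient descent. The parameter values are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
# Gradient-based training is sensitive to feature scale, so standardize first.
X_std = StandardScaler().fit_transform(X)

# LIBSVM-backed classifier with a linear kernel.
svc = SVC(kernel="linear", C=1.0).fit(X_std, y)

# Hinge loss turns SGDClassifier into a linear SVM trained with SGD.
sgd = SGDClassifier(loss="hinge", random_state=1).fit(X_std, y)

print("SVC training accuracy:          ", svc.score(X_std, y))
print("SGDClassifier training accuracy:", sgd.score(X_std, y))
```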

Posted in Data Science, Machine Learning, Python.

Classification Model with SVM Classifier – Python Example

In this post, you will get access to a Python code example for building a machine learning classification model using the SVM (Support Vector Machine) classifier algorithm. We will work with the Python Sklearn package for building the model. The following steps will be covered for training the model using SVM: load the data; create the training and test split; perform feature scaling; instantiate an SVC classifier; fit the model; measure the model performance. First and foremost, we will load the appropriate Sklearn modules and classes. Let's get started with loading the data set and creating the training and test split from the data set. Pay attention to the stratification aspect used when creating the training and test split. The train_test_split function of sklearn.model_selection …
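
The six steps listed above can be sketched as below. The excerpt does not name the data set, so the Iris data, the 70/30 split, the linear kernel and the random seed are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# 1. Load the data (Iris is an assumption here).
X, y = load_iris(return_X_y=True)

# 2. Create the training and test split; stratify=y keeps class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# 3. Perform feature scaling; fit the scaler on the training data only.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# 4. Instantiate an SVC classifier.
clf = SVC(kernel="linear", C=1.0)

# 5. Fit the model.
clf.fit(X_train_std, y_train)

# 6. Measure the model performance on the held-out data.
accuracy = accuracy_score(y_test, clf.predict(X_test_std))
print(f"Test accuracy: {accuracy:.3f}")
```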

Posted in AI, Data Science, Machine Learning, Python.

Python – Training a Model using Logistic Regression

In this post, you will learn how to train a model using a machine learning algorithm such as Logistic Regression. Here is the code we can use for fitting a model using Logistic Regression. We will use the Iris data set for training the model. Loading SkLearn Modules / Classes: First and foremost, we will load the appropriate packages, sklearn modules and classes. Data Loading: As a next step, we will load the dataset and do the data preparation. Create Training / Test Data: The next step is to create a train and test split. Note the stratification parameter. This is used to ensure that the class distribution in the training / test split remains consistent …
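
A minimal sketch of the flow described above follows. The split size, random seed and max_iter value are assumptions; the class counts are printed to show what the stratification parameter does.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Data loading
X, y = load_iris(return_X_y=True)

# Train/test split; stratify=y keeps the class distribution consistent.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)
print("Class counts (train):", np.bincount(y_train))
print("Class counts (test): ", np.bincount(y_test))

# Fit the model; max_iter is raised so the solver has room to converge.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```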

Posted in Data Science, Machine Learning, Python.

Python – How to Plot Learning Curves of Classifier

Perceptron Classifier Learning Curve using Python Mlxtend Package

In this post, you will learn a technique for plotting the learning curve of a machine learning classification model. As a data scientist, you will find the Python code example very handy. In this post, the plot_learning_curves function of the mlxtend.plotting module from the mlxtend package is used. This package was created by Dr. Sebastian Raschka. Let's train a Perceptron model using the iris data from sklearn.datasets. The accuracy of the model comes out to be 0.956 or 95.6%. Next, we will want to see how the learning went. In order to do that, we will use the plot_learning_curves function of the mlxtend.plotting module. Here is a post on how to install mlxtend with Anaconda. The following …
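
The numbers behind a learning curve can also be computed without mlxtend. The sketch below is not the mlxtend code from the post: it swaps in scikit-learn's learning_curve function, which returns training and validation scores for increasing training-set sizes. The training sizes, the scaling step and the random seeds are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
model = make_pipeline(StandardScaler(), Perceptron(random_state=1))

# With cv=5, each training fold holds 120 of the 150 samples.
# shuffle=True draws the subsets at random; Iris is ordered by class,
# so an unshuffled prefix would contain a single class.
train_sizes, train_scores, val_scores = learning_curve(
    model, X, y,
    train_sizes=[24, 48, 72, 96, 120],
    cv=5,
    shuffle=True,
    random_state=1,
)

# Average the per-fold scores for each training-set size.
for n, tr, va in zip(train_sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"{int(n):>3d} samples: train={tr:.3f}, validation={va:.3f}")
```

Plotting the two averaged score columns against train_sizes gives the learning curve.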

Posted in Data Science, Machine Learning, Python.

Feature Scaling & Stratification for Model Performance (Python)

In this post, you will learn how to improve the performance of machine learning models using techniques such as feature scaling and stratification. The following topics are covered in this post, and the concepts are explained using Python code samples: What is feature scaling and why does one need it? What is stratification? Training a Perceptron model without feature scaling and stratification; training a Perceptron model with feature scaling; training a Perceptron model with feature scaling and stratification. What is Feature Scaling and Why is it needed? Feature scaling is a technique for bringing the features in the data onto a common scale or into a fixed range. This is done when data consists of features of varying …
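
The three training setups listed above can be compared with one small helper. This is a sketch under assumptions: the Iris data, a 70/30 split and fixed random seeds are our choices, and perceptron_accuracy is a hypothetical helper name. No particular accuracy values are implied.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)


def perceptron_accuracy(use_scaling, use_stratify):
    """Train a Perceptron and return its test accuracy (hypothetical helper)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=1,
        stratify=y if use_stratify else None,
    )
    if use_scaling:
        # Fit the scaler on the training data only, then apply it to both sets.
        scaler = StandardScaler().fit(X_train)
        X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)
    clf = Perceptron(random_state=1).fit(X_train, y_train)
    return clf.score(X_test, y_test)


for use_scaling, use_stratify in [(False, False), (True, False), (True, True)]:
    acc = perceptron_accuracy(use_scaling, use_stratify)
    print(f"scaling={use_scaling!s:5} stratify={use_stratify!s:5} accuracy={acc:.3f}")
```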

Posted in AI, Data Science, Machine Learning.

Python – Improve Model Performance using Feature Scaling

In this post, you will learn about a simple technique, namely feature scaling, with which you can improve machine learning models. The models will be trained using a Perceptron (single-layer neural network) classifier. First and foremost, let's quickly understand what feature scaling is and why one needs it. What is Feature Scaling and Why does one need it? Feature scaling is a technique for bringing the independent features in the data onto a common scale. This is performed when the dataset contains features that vary widely in magnitude, units and range. In this post, we will learn to use the standardization technique for feature scaling. We will use the StandardScaler from …
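
To make standardization concrete, the sketch below applies StandardScaler to a small made-up array (the values are hypothetical) and checks it against the formula z = (x - mean) / std computed per feature.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: two features on very different scales.
X = np.array([
    [1.0, 1000.0],
    [2.0, 2000.0],
    [3.0, 3000.0],
    [4.0, 4000.0],
])

scaler = StandardScaler()
X_std = scaler.fit_transform(X)

# The same result computed by hand: subtract the column mean and divide
# by the column (population) standard deviation.
manual = (X - X.mean(axis=0)) / X.std(axis=0)

print("Column means after scaling:", X_std.mean(axis=0))
print("Column stds after scaling: ", X_std.std(axis=0))
```

After scaling, each column has a mean of approximately 0 and a standard deviation of approximately 1, so neither feature dominates because of its units.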

Posted in AI, Data Science, Machine Learning.

How to use Sklearn Datasets For Machine Learning

In this post, you will learn how to use Sklearn datasets for training machine learning models. Here is a list of the datasets which are available as part of sklearn.datasets: Iris (Iris plants dataset – Classification); Boston (Boston house prices – Regression); Wine (Wine recognition dataset – Classification); Breast Cancer (Breast cancer Wisconsin diagnostic – Classification); Digits (Optical recognition of handwritten digits dataset – Classification); Linnerud (Linnerud dataset – Multi-output regression); Diabetes (Diabetes – Regression). The following command could help you load any of the datasets: All of the datasets come with the following and are intended for use with supervised learning: Data (to be used for …
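
As an illustration of loading these datasets, the sketch below calls the corresponding load_* functions and prints the shape of each dataset's data and target. The loaders dictionary is our own construct. The Boston loader is left out because load_boston was removed in scikit-learn 1.2.

```python
from sklearn import datasets

# Map a short name to each loader function (the dictionary is illustrative).
loaders = {
    "iris": datasets.load_iris,
    "wine": datasets.load_wine,
    "breast_cancer": datasets.load_breast_cancer,
    "digits": datasets.load_digits,
    "linnerud": datasets.load_linnerud,
    "diabetes": datasets.load_diabetes,
}

for name, load in loaders.items():
    bunch = load()
    # Every loader returns an object exposing .data and .target arrays.
    print(f"{name:14s} data={bunch.data.shape} target={bunch.target.shape}")
```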

Posted in Data Science, Machine Learning.

Python – How to install mlxtend in Anaconda

Add Channel and Install Mlxtend using Conda Install

In this post, you will quickly learn how to install the mlxtend Python package while working with Anaconda Jupyter Notebook. Mlxtend (machine learning extensions) is a Python library of useful tools for day-to-day data science tasks. This library was created by Dr. Sebastian Raschka, an Assistant Professor of Statistics at the University of Wisconsin-Madison focusing on deep learning and machine learning research. Here are the instructions for installing it within your Anaconda environment. Add a channel named conda-forge by clicking on the Channels button and then the Add button. Open a command prompt and execute the following command: conda install mlxtend --channel conda-forge Once installed, launch a Jupyter Notebook and try importing the following. This should work …

Posted in Data Science, Machine Learning, Python.

Python – Scatter Plot Different Classes

Scatter Plot representing Two Classes

In this post, you will learn how to create scatter plots in Python that represent two or more classes when you are trying to solve a machine learning classification problem. As you work on a classification problem, you want to understand whether the classes are linearly separable or not, in other words, whether the classification problem is linear or non-linear. This, in turn, helps you decide what kind of machine learning classification algorithm you might want to use. In this post, you will learn how to use a scatter plot to identify whether two or more classes are linearly separable. You may want to check the what, when and how of the scatter plot matrix, which can also be used to determine whether the data …
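
One way to draw such a plot is sketched below with matplotlib. The excerpt does not say which data it uses, so the Iris data, the two chosen features and the marker styles are assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

fig, ax = plt.subplots()
# Draw one scatter series per class so each class gets its own marker.
for label, marker in zip(range(3), ("o", "s", "^")):
    mask = y == label
    ax.scatter(
        X[mask, 0], X[mask, 2],  # sepal length vs petal length
        marker=marker,
        label=iris.target_names[label],
    )

ax.set_xlabel(iris.feature_names[0])
ax.set_ylabel(iris.feature_names[2])
ax.legend()
```

In a script, call plt.show() afterwards to display the figure. A class that a straight line can separate from the others is linearly separable on these two features.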

Posted in AI, Data Science, Machine Learning.

Python DataFrame – Assign New Labels to Columns

Python Dataframe Columns - Labels assigned new value

In this post, you will get a code sample showing how to assign new labels to columns in Python while training machine learning models. This is very helpful when working on a classification machine learning problem. Often the labels for the response or dependent variable are in text format, and all one wants is to assign a number such as 0, 1, 2, etc. instead of the text labels. Beginner-level data scientists will find this code very handy. We will look at the code for the dataset as represented in the diagram below: In the above code, you will see that class labels are named as very_low, Low, High, Middle …
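
A minimal sketch of the label-to-number mapping follows. The DataFrame contents, the column names and the numeric codes chosen for each label are hypothetical; only the label names come from the post.

```python
import pandas as pd

# Hypothetical data using the class labels named in the post.
df = pd.DataFrame({
    "score": [0.1, 0.4, 0.9, 0.6, 0.2],
    "label": ["very_low", "Low", "High", "Middle", "very_low"],
})

# The numeric code for each text label is our own choice.
label_map = {"very_low": 0, "Low": 1, "Middle": 2, "High": 3}

# Series.map replaces each text label with its numeric code.
df["label_id"] = df["label"].map(label_map)
print(df)
```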

Posted in AI, Data Science, Machine Learning, News.

How to Print Unique Values in Pandas Dataframe Columns

print unique column values in Pandas dataframe

A quick post with a code sample showing how to print the unique values in DataFrame columns in Pandas. Here is a data frame of oil prices on different dates, in which a column such as year contains repeated/duplicate values. In the above data frame, the requirement is to print the unique values of the year column. Here is the code for the same. Note the method unique().
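
The original data frame is not reproduced here, so the sketch below uses a small made-up table of years and prices to show the unique() call.

```python
import pandas as pd

# Hypothetical oil-price data with repeated years.
df = pd.DataFrame({
    "year": [2018, 2018, 2019, 2019, 2020],
    "price": [65.2, 70.1, 57.0, 61.4, 41.9],
})

# unique() returns the distinct values in order of first appearance.
unique_years = df["year"].unique()
print(unique_years)
```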

Posted in AI, Data Science, Machine Learning, News, Python.

Confusion Matrix Explained with Python Code Examples

confusion matrix for classification model

In this post, you will learn about the confusion matrix with examples and how it can be used as a performance metric for classification models in machine learning. Let’s take an example of a classification model which is used to predict whether a person would default on a bank loan. To build this classification model, let’s say a historical data set of 10,000 records was chosen. As part of building the model, all of the 10,000 records were labeled: each record represented a person and was labeled “Yes” or “No” based on whether they defaulted (Yes) or did not default (No). Out of 10,000 labeled records, …
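
For the loan-default setting above, a confusion matrix can be computed as sketched below. The eight labels and predictions are made up for illustration and do not come from the 10,000-record example.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical ground truth and predictions ("Yes" = defaulted).
y_true = ["Yes", "No", "No", "Yes", "No", "Yes", "No", "No"]
y_pred = ["Yes", "No", "Yes", "No", "No", "Yes", "No", "No"]

# Rows are actual classes, columns are predicted classes, in the order given.
cm = confusion_matrix(y_true, y_pred, labels=["No", "Yes"])

# With "Yes" as the positive class, ravel() yields TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()
print(cm)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```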

Posted in AI, Data Science, Machine Learning.

Python Quick Coding Tutorials for Experienced Developers

python tutorials for experienced developers

Learning Python has taken center stage for many developers, as Python is one of the key languages for working in the field of data science/machine learning. If you are an experienced developer, this post will help you quickly get started with Python programming. In this post, you will quickly learn some of the following in relation to Python programming: data types, input/output operations, defining functions, conditional expressions, looping constructs, string functions, defining modules, defining classes, and exception handling. Python Programming Concepts: Data types: Python creates a variable and infers its type when a value is assigned to it. The following shows how variables are cast to specific data types. float(variable): casts variable to float; int(variable): …
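
The casting behaviour described above can be tried with a few lines. The variable names are illustrative, and the str() call is added here for completeness; it is not taken from the excerpt.

```python
# Assignment creates the variable; its type follows from the value.
value = "42"
print(type(value))      # the value is text at this point

as_int = int(value)     # cast the text to an integer
as_float = float(value) # cast the text to a floating-point number
as_str = str(as_int)    # cast the integer back to text

print(as_int + 1, as_float / 2, as_str + "!")
```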

Posted in Python, Tutorials.