Tag Archives: machine learning

Keras Hello World Example

In this post, you will learn how to set up Keras and get started with it. Keras is one of the most popular deep learning frameworks; it is built on top of TensorFlow 2.0 and can scale to large clusters of GPUs. You will also write a hello world program in Keras. The following topics are covered in this post: Set up Keras with Anaconda; Keras Hello World program. Set up Keras with Anaconda: In this section, you will learn how to set up Keras with Anaconda. Here are the steps: Go to the Environments page in the Anaconda app. …

Posted in Data Science, Deep Learning, Machine Learning.

Handling Class Imbalance using Sklearn Resample

In this post, you will learn how to tackle the class imbalance issue when training machine learning classification models on an imbalanced dataset. This is illustrated using a Python Sklearn example. In the same context, you may check out my earlier post on handling class imbalance using class_weight. As a data scientist, it is important to learn these techniques, as you will often come across the class imbalance problem while working on classification problems. Here is how the class imbalance in the dataset can be visualized: Before looking at the Python code example that uses the Sklearn.utils resample method, let's create an imbalanced data …
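As a minimal sketch (not the code from the full post), upsampling the minority class with `sklearn.utils.resample` could look like the following. The synthetic arrays and the names `X`, `y`, `X_minority_up` and `y_balanced` are assumptions made for illustration.

```python
import numpy as np
from sklearn.utils import resample

# Hypothetical imbalanced dataset: 90 samples of class 0, 10 of class 1.
rng = np.random.RandomState(42)
X = rng.normal(size=(100, 2))
y = np.array([0] * 90 + [1] * 10)

X_majority, y_majority = X[y == 0], y[y == 0]
X_minority, y_minority = X[y == 1], y[y == 1]

# Draw minority samples with replacement until both classes have equal size.
X_minority_up, y_minority_up = resample(
    X_minority,
    y_minority,
    replace=True,
    n_samples=len(y_majority),
    random_state=42,
)

X_balanced = np.vstack([X_majority, X_minority_up])
y_balanced = np.concatenate([y_majority, y_minority_up])

# Class counts before and after resampling.
print(np.bincount(y), np.bincount(y_balanced))
```

In practice you would usually resample only the training split, so that duplicated minority samples do not leak into the data used for evaluation.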

Posted in Data Science, Machine Learning, Python.

Handle Class Imbalance using Class Weight – Python

In this post, you will learn how to handle class imbalance by adjusting class weights while solving a machine learning classification problem. This will be illustrated using a Sklearn Python code example. What is Class Imbalance? Class imbalance is one of the most common problems when solving classification problems in domains such as healthcare and banking (fraud). For example, if you want to build a model that classifies a transaction as fraudulent or not, the dataset will be highly imbalanced, as there won’t be many instances of fraud-related transactions. The challenge in building high-performance models is to address highly skewed data …
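To make the idea of class weights concrete, here is a small sketch under stated assumptions: the data is synthetic, and `LogisticRegression` is just one of the Sklearn estimators that accept a `class_weight` argument. With `class_weight="balanced"`, Sklearn weights each class by `n_samples / (n_classes * count_of_that_class)`.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced labels: 90 negatives, 10 positives.
rng = np.random.RandomState(0)
y = np.array([0] * 90 + [1] * 10)
# One feature whose mean is shifted for the positive class.
X = rng.normal(size=(100, 1)) + 2.0 * y.reshape(-1, 1)

# "balanced" weights follow n_samples / (n_classes * count_per_class).
classes = np.unique(y)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)
print(dict(zip(classes.tolist(), weights.tolist())))

# Passing class_weight="balanced" applies the same weighting during training.
model = LogisticRegression(class_weight="balanced")
model.fit(X, y)
```

The rarer class receives the larger weight, so misclassifying a minority sample costs more during training.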

Posted in Data Science, Machine Learning, Python.

Micro-average & Macro-average Scoring Metrics – Python

In this post, you will learn how to use micro-averaging and macro-averaging to compute scoring metrics (precision, recall, F1-score) for multi-class classification problems. You will also learn about weighted precision, recall and F1-score in relation to the micro-average and macro-average metrics. The concepts are explained with Python code examples. What & Why of Micro and Macro-averaging scoring metrics? With binary classification, it is intuitive to score the model with metrics such as precision, recall and F1-score. However, with multi-class classification it becomes tricky. Some of the questions to ask are: Which metrics to use to score …
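The difference between the averaging schemes is easiest to see when computed by hand. The sketch below is plain Python with made-up labels; the helper names `per_class_counts` and `precision_averages` are hypothetical. In Sklearn, the corresponding values come from `precision_score` with `average="micro"`, `"macro"` or `"weighted"`.

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """Return {label: (tp, fp, fn)} for every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        counts[label] = (tp, fp, fn)
    return counts


def precision_averages(y_true, y_pred):
    """Return (micro, macro, weighted) precision."""
    counts = per_class_counts(y_true, y_pred)
    support = Counter(y_true)
    per_class = {
        label: (tp / (tp + fp) if tp + fp else 0.0)
        for label, (tp, fp, _) in counts.items()
    }
    total_tp = sum(tp for tp, _, _ in counts.values())
    total_fp = sum(fp for _, fp, _ in counts.values())
    # Micro: pool the counts of all classes, then compute one precision.
    micro = total_tp / (total_tp + total_fp)
    # Macro: compute precision per class, then take the unweighted mean.
    macro = sum(per_class.values()) / len(per_class)
    # Weighted: per-class precision weighted by the class support.
    weighted = sum(per_class[lab] * support[lab] for lab in per_class) / len(y_true)
    return micro, macro, weighted


# Made-up three-class example with an over-represented class 0.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 2, 2, 2]
micro, macro, weighted = precision_averages(y_true, y_pred)
print(f"micro={micro:.3f} macro={macro:.3f} weighted={weighted:.3f}")
```

Macro-averaging treats every class equally, so the weaker minority classes pull it down, while micro and weighted averaging are dominated by the large class.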

Posted in Data Science, Machine Learning, Python.

PyTorch – How to Load & Predict using Resnet Model

In this post, you will learn how to load a pre-trained ResNet model and make predictions with it using the PyTorch library. Here is the arXiv paper on ResNet. Before getting into loading and predicting with ResNet (residual neural network) in PyTorch, you will want to learn how to load different pre-trained models such as AlexNet, ResNet, DenseNet, GoogLeNet, VGG etc. The PyTorch Torchvision project allows you to load these models. Note that the torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision. Here is the command:  The output of the above lists all the pre-trained models available for loading and prediction. You may …

Posted in Data Science, Deep Learning, Machine Learning, Python.

How to install PyTorch on Anaconda

This is a quick post on how to install PyTorch on Anaconda and get started with deep learning projects. For a machine learning enthusiast, this is the first step in getting started with PyTorch. I followed these steps on a Mac Air and got started with PyTorch in no time. Here are the steps: Go to the Anaconda tool. Click on “Environments” in the left navigation. Click on the arrow mark next to “base (root)” as shown in the diagram below. It will open up a small modal window as shown below. Click “Open Terminal”. This will open up a terminal window. Execute the following command to set up PyTorch. Once done, go to the Jupyter Notebook window and …

Posted in Data Science, Deep Learning, Machine Learning.

ROC Curve & AUC Explained with Python Examples

In this post, you will learn about the ROC curve and AUC, along with related concepts such as the true positive rate and false positive rate, with the help of Python examples. It is important to learn these concepts, as they help in selecting the most appropriate machine learning models based on model performance. What is ROC & AUC / AUROC? Receiver operating characteristic (ROC) graphs are used for selecting the most appropriate classification models based on their performance with respect to the false positive rate (FPR) and true positive rate (TPR). These metrics are computed by shifting the decision threshold of the classifier. The ROC curve is used for probabilistic models …
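To illustrate how shifting the decision threshold produces the curve, here is a plain-Python sketch with four made-up scores. The helper names `roc_points` and `auc_trapezoid` are hypothetical; in Sklearn the same job is done by `roc_curve` and `roc_auc_score` from `sklearn.metrics`.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by lowering the threshold over scores."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    fpr, tpr = [0.0], [0.0]
    # Each distinct score, from highest to lowest, acts as a threshold:
    # samples with score >= threshold are predicted positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        tpr.append(tp / positives)
        fpr.append(fp / negatives)
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for i in range(1, len(fpr)):
        area += (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
    return area


# Made-up labels and predicted probabilities for the positive class.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

fpr, tpr = roc_points(y_true, scores)
roc_auc = auc_trapezoid(fpr, tpr)
print(list(zip(fpr, tpr)), roc_auc)
```

An AUC of 1.0 would mean every positive is scored above every negative, while 0.5 corresponds to random ranking.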

Posted in Data Science, Machine Learning, Python.

Python – How to Draw Confusion Matrix using Matplotlib

In this post, you will learn how to draw a confusion matrix using the Matplotlib Python package. It is important to learn this technique, as it comes in handy when assessing the performance of classification models trained using different algorithms. Confusion Matrix using Matplotlib: In order to demonstrate the confusion matrix using Matplotlib, let’s fit a pipeline estimator to the Sklearn breast cancer dataset using StandardScaler (for standardising the dataset) and Random Forest Classifier as the machine learning algorithm.  Once an estimator is fit to the training data set, the next step is to print the confusion matrix. In order to do that, the following steps will need to be …
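One possible version of this workflow is sketched below. The split ratio, `n_estimators`, random seeds and plot styling are assumptions chosen for illustration, not necessarily the values used in the full post.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1
)

# Standardise the features, then fit a random forest.
pipeline = make_pipeline(
    StandardScaler(), RandomForestClassifier(n_estimators=50, random_state=1)
)
pipeline.fit(X_train, y_train)

# Rows are true labels, columns are predicted labels.
conf_matrix = confusion_matrix(y_test, pipeline.predict(X_test))

# Draw the matrix as a coloured grid and write each count into its cell.
fig, ax = plt.subplots(figsize=(4, 4))
ax.matshow(conf_matrix, cmap=plt.cm.Blues, alpha=0.5)
for i in range(conf_matrix.shape[0]):
    for j in range(conf_matrix.shape[1]):
        ax.text(j, i, str(conf_matrix[i, j]), ha="center", va="center")
ax.set_xlabel("Predicted label")
ax.set_ylabel("True label")
ax.set_title("Confusion matrix")
```

In an interactive session, call `plt.show()` afterwards to display the figure.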

Posted in Data Science, Machine Learning, Python.

Accuracy, Precision, Recall & F1-Score – Python Examples

In this post, you will learn how to calculate the following performance metrics when assessing a classification model: accuracy score, precision score, recall score and F1-score. The concepts are illustrated using a Python Sklearn example. As a data scientist, you must have a good understanding of these metrics for measuring classification model performance. Let's work with the Sklearn breast cancer dataset. You can load the dataset using the following code: The target labels in the breast cancer dataset are Benign (1) and Malignant (0). There are 212 records with label as malignant and 357 records with …
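As a quick reference, the four scores can be computed directly from the confusion-matrix counts. The sketch below uses plain Python and made-up labels; the helper name `classification_scores` is hypothetical. Sklearn provides the same metrics as `accuracy_score`, `precision_score`, `recall_score` and `f1_score` in `sklearn.metrics`.

```python
def classification_scores(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 is the positive class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
scores = classification_scores(y_true, y_pred)
print(scores)
```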

Posted in Data Science, Machine Learning, Python.

Python – Nested Cross Validation for Algorithm Selection

In this post, you will learn about the nested cross-validation technique and how you can use it to select the best algorithm out of two or more algorithms used to train a machine learning model. The technique is illustrated using a Python Sklearn example. When selecting a model trained with a particular algorithm and the best combination of hyperparameters, you can adopt model tuning techniques such as the following: grid search, randomized search, validation curve. The following topics are covered in this post: Why nested cross-validation? Nested cross-validation with Python Sklearn example. Why Nested Cross-Validation? Nested cross-validation technique is used for estimating …
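A compact sketch of nested cross-validation is shown below. The two candidate algorithms (an SVM and a decision tree), the parameter grids and the fold counts are assumptions picked for illustration. The inner `GridSearchCV` tunes hyperparameters, and the outer `cross_val_score` estimates how well each tuned algorithm generalises.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Inner loop: grid search tunes hyperparameters on each outer training fold.
svc_search = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    param_grid={"svc__C": [0.1, 1.0, 10.0]},
    cv=2,
)
tree_search = GridSearchCV(
    DecisionTreeClassifier(random_state=1),
    param_grid={"max_depth": [2, 4, 6]},
    cv=2,
)

# Outer loop: each outer test fold is never seen by the inner search.
svc_scores = cross_val_score(svc_search, X, y, cv=5)
tree_scores = cross_val_score(tree_search, X, y, cv=5)

print(f"SVC:  {svc_scores.mean():.3f} +/- {svc_scores.std():.3f}")
print(f"Tree: {tree_scores.mean():.3f} +/- {tree_scores.std():.3f}")
```

The algorithm with the better outer-loop score is the one to carry forward; its final hyperparameters are then tuned once more on the full training data.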

Posted in Data Science, Machine Learning, Python.

Randomized Search Explained – Python Sklearn Example

In this post, you will learn about a machine learning model tuning technique called Randomized Search, which is used to find the best combination of hyperparameters for a model. The concept will be illustrated using a Python Sklearn code example. As a data scientist, you must learn these model tuning techniques to come up with optimal models. You may want to check some of the other posts on tuning model parameters, such as the following: Sklearn validation_curve for tuning model hyper parameters; Sklearn GridSearchCV for tuning model hyper parameters. In this post, the following topics will be covered: What and why …
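A minimal randomized-search sketch follows. The estimator, the sampled distribution for `C`, `n_iter` and the fold count are assumptions chosen to keep the example small. Unlike grid search, `RandomizedSearchCV` evaluates only `n_iter` parameter settings drawn from the given distributions.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

search = RandomizedSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    param_distributions={
        # C is sampled on a log scale; kernel is sampled from a list.
        "svc__C": loguniform(1e-2, 1e2),
        "svc__kernel": ["linear", "rbf"],
    },
    n_iter=8,  # number of sampled parameter settings
    cv=3,
    random_state=1,
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```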

Posted in Data Science, Machine Learning, Python.

Grid Search Explained – Python Sklearn Examples

In this post, you will learn about another hyperparameter optimization technique, called Grid Search, with the help of Python Sklearn code examples. In one of the earlier posts, you learned about another hyperparameter optimization technique, namely the validation curve. As a data scientist, it is useful to learn these model tuning techniques, as they help you select the most appropriate models with the most appropriate parameters. The following are some of the topics covered in this post: What & Why of grid search? Grid search with Python Sklearn examples. What & Why of Grid Search? Grid Search technique helps in performing exhaustive search over specified parameter (hyper parameters) values for …
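The exhaustive nature of grid search can be seen in a small sketch like the one below. The estimator and the grid values are assumptions for illustration. `GridSearchCV` evaluates every combination in `param_grid`, here 3 values of `C` times 2 kernels.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

param_grid = {
    "svc__C": [0.1, 1.0, 10.0],
    "svc__kernel": ["linear", "rbf"],
}

# Every one of the 3 x 2 combinations is cross-validated.
grid = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    param_grid=param_grid,
    scoring="accuracy",
    cv=3,
)
grid.fit(X, y)

print(grid.best_params_, grid.best_score_)
```

After fitting, `grid.best_estimator_` holds the pipeline refit on all of the data with the best parameter combination.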

Posted in Data Science, Machine Learning, Python.

Validation Curves Explained – Python Sklearn Example

In this post, you will learn about validation curves with a Python Sklearn example. You will learn how validation curves can help you diagnose whether your machine learning model is underfitting or overfitting. On a similar topic, I recommend reading one of the previous posts on assessing overfitting and underfitting, titled Learning curves explained with Python Sklearn example. The following topics are covered in this post: Why validation curves? Python Sklearn example for validation curves. Why Validation Curves? Like the learning curve, the validation curve also helps in diagnosing model bias vs variance. The validation curve plot helps in selecting the most appropriate model parameters (hyper-parameters). Unlike learning …
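A short sketch of computing the data behind a validation curve is given below. The estimator (logistic regression), the tuned parameter `C` and its range are assumptions for illustration. `validation_curve` returns training and validation scores for each parameter value across the folds.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import validation_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Inverse regularisation strength values to compare.
param_range = [0.001, 0.01, 0.1, 1.0, 10.0]

train_scores, test_scores = validation_curve(
    pipeline,
    X,
    y,
    param_name="logisticregression__C",
    param_range=param_range,
    cv=5,
)

# One row per parameter value, one column per fold.
train_mean = train_scores.mean(axis=1)
test_mean = test_scores.mean(axis=1)
for c, tr, te in zip(param_range, train_mean, test_mean):
    print(f"C={c}: train={tr:.3f}, validation={te:.3f}")
```

Low scores on both curves suggest underfitting at that parameter value, while a large gap between training and validation scores suggests overfitting.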

Posted in Data Science, Machine Learning, Python.

Learning Curves Explained with Python Sklearn Example

In this post, you will learn how to use learning curves, with a Python (Sklearn) code example, to assess model bias and variance. Knowing how to use learning curves will help you diagnose whether the model is suffering from high bias (underfitting) or high variance (overfitting), and whether adding training samples could help solve the bias or variance problem. Some of the following topics are covered in this post: Why learning curves? Python Sklearn example for the learning curve. You may want to check some of the following posts in order to get a better understanding of bias-variance and underfitting-overfitting: Bias-variance concepts and interview questions; Overfitting/Underfitting concepts and interview …
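The numbers behind a learning curve can be produced with a sketch like this one. The estimator, the training-set fractions and the fold count are assumptions for illustration. `learning_curve` refits the model on growing subsets of the training folds and scores each fit.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

train_sizes, train_scores, test_scores = learning_curve(
    pipeline,
    X,
    y,
    train_sizes=np.linspace(0.1, 1.0, 5),  # fractions of each training fold
    cv=5,
    shuffle=True,  # shuffle so small subsets contain both classes
    random_state=1,
)

# One row per training-set size, one column per fold.
for n, tr, te in zip(train_sizes, train_scores.mean(axis=1), test_scores.mean(axis=1)):
    print(f"n={n}: train={tr:.3f}, validation={te:.3f}")
```

If the validation score keeps rising as the training size grows, more data is likely to help; if both curves plateau at a low score, the model is probably too simple.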

Posted in Data Science, Machine Learning, Python.

Logistic Regression Quiz Questions & Answers

In this post, you will learn about Logistic Regression terminology with quiz / practice questions. For machine learning engineers or data scientists who want to test their understanding of logistic regression or are preparing for interviews, these concepts and the related quiz questions and answers will come in handy. Here is a related post I published earlier: 30 Logistic regression interview practice questions. Here are some of the questions and answers discussed in this post: What are the different names / terms used in place of logistic regression? Define logistic regression in simple words. Define logistic regression in terms of logit. Define the logistic function. What does training a logistic regression model mean? What are different types …
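For the questions on the logistic function and the logit, a tiny plain-Python sketch may help. The function names `logistic` and `logit` are just illustrative labels. The logistic (sigmoid) function maps any real number to a probability in (0, 1), and the logit (log-odds) is its inverse.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log-odds of a probability p in (0, 1); the inverse of logistic."""
    return math.log(p / (1.0 - p))


# Logistic regression models the log-odds as a linear function of the
# inputs, so applying logistic() to that linear score gives a probability.
z = 2.0
p = logistic(z)
print(p, logit(p))
```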

Posted in Data Science, Machine Learning, Quiz.

K-Fold Cross Validation – Python Example

In this post, you will learn about K-fold cross validation concepts with a Python code example. It is important to learn cross validation concepts in order to perform model tuning, with the end goal of choosing a model that has high generalization performance. As a data scientist / machine learning engineer, you must have a good understanding of cross validation concepts in general. The following topics are covered in this post: What and why of K-fold cross validation; When to select what values of K? K-fold cross validation with Python (using cross-validation generators); K-fold cross validation with Python (using cross_val_score). What and Why of K-fold Cross Validation: K-fold cross validation …
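The two approaches named in the topic list, a cross-validation generator and `cross_val_score`, can be sketched side by side. The dataset, estimator and `n_splits=5` are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
kfold = KFold(n_splits=5, shuffle=True, random_state=1)

# Option 1: iterate over the cross-validation generator yourself.
fold_scores = []
for train_idx, test_idx in kfold.split(X):
    pipeline.fit(X[train_idx], y[train_idx])
    fold_scores.append(pipeline.score(X[test_idx], y[test_idx]))

# Option 2: let cross_val_score run the fit-and-score loop for you.
cv_scores = cross_val_score(pipeline, X, y, cv=kfold)

print(np.mean(fold_scores), cv_scores.mean())
```

For classification with imbalanced classes, `StratifiedKFold` is a common alternative, since it preserves the class proportions in every fold.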

Posted in Data Science, Machine Learning, Python.