# Category Archives: Data Science

## Python Sklearn – How to Generate Random Datasets

In this post, you will learn about some useful random dataset generators provided by Python Sklearn. Many such methods are provided as part of the sklearn.datasets package. This post covers the most common ones, which can be used to create datasets for proof-of-concept solutions with regression, classification and clustering machine learning algorithms. As a data scientist, you should get familiar with these methods in order to quickly create datasets for training models using different machine learning algorithms. The topics covered are methods for generating datasets for classification and methods for generating datasets for regression. Methods for Generating Datasets for Classification: The following is the list of …
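To make the idea concrete, here is a minimal standard-library sketch of what such generators produce. It is a hand-rolled stand-in, not Sklearn's own code (the real generators in sklearn.datasets include `make_classification`, `make_regression` and `make_blobs`). The function names, parameters and default values below are hypothetical and chosen only for illustration.

```python
import random

def make_blob_dataset(centers, samples_per_center=50, spread=1.0, seed=0):
    """Return (X, y): Gaussian points around each center, labelled by center index."""
    rng = random.Random(seed)  # fixed seed so the dataset is reproducible
    X, y = [], []
    for label, center in enumerate(centers):
        for _ in range(samples_per_center):
            X.append([rng.gauss(c, spread) for c in center])
            y.append(label)
    return X, y

def make_linear_dataset(n_samples=100, slope=3.0, intercept=2.0, noise=0.5, seed=0):
    """Return (X, y) where y is a linear function of one feature plus Gaussian noise."""
    rng = random.Random(seed)
    X = [[rng.uniform(-5.0, 5.0)] for _ in range(n_samples)]
    y = [slope * x[0] + intercept + rng.gauss(0.0, noise) for x in X]
    return X, y

# Two labelled clusters for classification / clustering experiments.
X_cls, y_cls = make_blob_dataset(centers=[(0.0, 0.0), (5.0, 5.0)])
# One noisy linear relationship for regression experiments.
X_reg, y_reg = make_linear_dataset()
```

Fixing the seed makes repeated calls return the same data, which is convenient when comparing algorithms on the same proof-of-concept dataset.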

## Neural Networks and Mathematical Models Examples

In this post, you will learn about the concepts of neural networks with the help of mathematical model examples. In simple words, you will learn how to represent neural networks using mathematical equations. As a data scientist / machine learning researcher, it is good to get a sense of how a neural network can be converted into a set of mathematical equations for calculating different values. A good understanding of how to represent the activation function output of the different computation units / nodes / neurons in different layers makes the back propagation algorithm easier to understand. This will be dealt with in one of the …
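As a minimal sketch of what such equations look like in code, the snippet below computes the activation of each neuron in a layer as a_j = g(Σ_i w_ji · x_i + b_j). It assumes a sigmoid activation g, and the weights, biases and inputs are made-up values rather than anything taken from the post.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Return the activations of one fully connected layer.

    weights[j] holds the weights of neuron j; each neuron computes
    a_j = sigmoid(sum_i weights[j][i] * inputs[i] + biases[j]).
    """
    activations = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        activations.append(sigmoid(z))
    return activations

# Hypothetical network: 2 inputs, a hidden layer of 2 neurons, 1 output neuron.
x = [1.0, 0.0]
hidden = layer_forward(x, weights=[[0.5, -0.5], [0.0, 0.0]], biases=[-0.5, 0.0])
output = layer_forward(hidden, weights=[[1.0, -1.0]], biases=[0.0])
```

Each layer's output becomes the next layer's input, so the whole network is a chain of these equations.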

## Adaline Explained with Python Example

In this post, you will learn the concepts of Adaline (ADAptive LInear NEuron), a machine learning algorithm, along with a Python example. As with Perceptron, it is important to understand the concepts of Adaline, as it forms the foundation of learning neural networks. The concepts of Perceptron and Adaline are useful for understanding how gradient descent can be used to learn the weights which, when combined with input signals, are used to make predictions based on the unit step function output. Here are the topics covered in this post in relation to the Adaline algorithm and its Python implementation: what Adaline is, an Adaline Python implementation, and a model trained using the Adaline implementation. What’s Adaline? …
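The following is a minimal sketch of the idea, not the post's own implementation. It assumes class labels of 1 and -1, batch gradient descent on the sum of squared errors, and a tiny made-up one-feature dataset; the function names are illustrative.

```python
def net_input(x, weights, bias):
    """Weighted sum of the inputs plus bias (Adaline's linear activation)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def train_adaline(X, y, learning_rate=0.1, epochs=50):
    """Learn weights by batch gradient descent on the sum of squared errors."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        # Errors use the continuous net input, not the thresholded prediction.
        errors = [target - net_input(x, weights, bias) for x, target in zip(X, y)]
        for j in range(len(weights)):
            weights[j] += learning_rate * sum(e * x[j] for e, x in zip(errors, X))
        bias += learning_rate * sum(errors)
    return weights, bias

def predict(x, weights, bias):
    """Apply the unit step function to the net input to get a class label."""
    return 1 if net_input(x, weights, bias) >= 0.0 else -1

# Made-up, linearly separable data with one feature.
X = [[-2.0], [-1.0], [1.0], [2.0]]
y = [-1, -1, 1, 1]
weights, bias = train_adaline(X, y)
predictions = [predict(x, weights, bias) for x in X]
```

The key difference from the Perceptron is visible in `train_adaline`: the error is measured before thresholding, so the cost function is smooth and gradient descent applies directly.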

## Perceptron Explained using Python Example

In this post, you will learn about the concepts of Perceptron with the help of a Python example. It is very important for data scientists to understand the concepts related to Perceptron, as a good understanding lays the foundation for learning advanced concepts of neural networks, including deep neural networks (deep learning). In this post, the following topics are covered: what Perceptron is, and a Perceptron Python code example. What is Perceptron? Perceptron is a machine learning algorithm which mimics how a neuron in the brain works. It is also called a single-layer neural network, as the output is decided based on the outcome of just one activation function, which represents a neuron. Let’s first understand …
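A minimal sketch of that single neuron is shown below: a weighted sum of the inputs passed through a unit step activation. The weights and bias are hand-picked (not learned) so that the neuron behaves like logical AND; they are illustrative assumptions only.

```python
def unit_step(z):
    """Activation function: fire (1) when the net input is non-negative."""
    return 1 if z >= 0 else 0

def perceptron_output(inputs, weights, bias):
    """Output of a single neuron: unit step applied to the weighted sum."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return unit_step(z)

# Hand-picked (not learned) parameters that implement logical AND.
weights, bias = [1.0, 1.0], -1.5
outputs = [perceptron_output([a, b], weights, bias) for a in (0, 1) for b in (0, 1)]
# outputs follows the input order (0,0), (0,1), (1,0), (1,1)
```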

## Stochastic Gradient Descent Python Example

In this post, you will learn the concepts of Stochastic Gradient Descent (SGD) using a Python example. To demonstrate SGD concepts, the Perceptron machine learning algorithm is used. Recall that Perceptron is also called a single-layer neural network. Before getting into details, let's quickly understand the concepts of Perceptron and how an underlying learning algorithm such as SGD is used. You may want to check out the concepts of gradient descent on this page – Gradient Descent explained with examples. The following topics are covered in this post: Stochastic Gradient Descent (SGD) for Learning Perceptron Model. The Perceptron algorithm can be used to train a binary classifier that classifies the data as either 1 or 0. …
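A minimal sketch of this training scheme follows, assuming labels of 1 and 0 and a tiny made-up dataset (logical OR). The weights are updated after every single sample, in a freshly shuffled order each epoch, which is what makes the procedure stochastic. The function names and defaults are illustrative, not the post's code.

```python
import random

def predict(x, weights, bias):
    """Classify as 1 when the weighted sum is non-negative, else 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z >= 0 else 0

def train_perceptron_sgd(X, y, learning_rate=1.0, epochs=50, seed=0):
    """Train a perceptron, updating the weights after each individual sample."""
    rng = random.Random(seed)
    weights = [0.0] * len(X[0])
    bias = 0.0
    indices = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new random order each epoch
        for i in indices:
            error = y[i] - predict(X[i], weights, bias)  # -1, 0 or 1
            for j in range(len(weights)):
                weights[j] += learning_rate * error * X[i][j]
            bias += learning_rate * error
    return weights, bias

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 1]  # logical OR, which is linearly separable
weights, bias = train_perceptron_sgd(X, y)
predictions = [predict(x, weights, bias) for x in X]
```

Because the data is linearly separable, the updates stop once every sample is classified correctly; the remaining epochs leave the weights unchanged.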

## Python Implementations of Machine Learning Models

This post highlights some great pages where Python implementations of different machine learning models can be found. If you are a data scientist who wants to get a fair idea of what is working underneath different machine learning algorithms, you may want to check out the Ml-from-scratch page. The top highlights of this repository are Python implementations of the following: supervised learning algorithms (linear regression, logistic regression, decision tree, random forest, XGBoost, Naive Bayes, neural network etc.); unsupervised learning algorithms (K-means, GAN, Gaussian mixture models etc.); reinforcement learning algorithms (Deep Q Network); dimensionality reduction techniques such as PCA; deep learning; and examples that make use of the above-mentioned algorithms. Here is an insight into …

## Lasso Regression Explained with Python Example

In this post, you will learn the concepts of Lasso regression along with Python Sklearn examples. The Lasso regression algorithm introduces a penalty against model complexity (a large number of parameters) using a regularization parameter. Two other similar forms of regularized linear regression are Ridge regression and Elasticnet regression, which will be discussed in future posts. In this post, the following topics are discussed: what Lasso regression is, a Lasso regression Python example, and a Lasso regression cross-validation Python example. What’s Lasso Regression? LASSO stands for least absolute shrinkage and selection operator. Pay attention to the words “least absolute shrinkage” and “selection”. We will refer to them shortly. Lasso regression is also called L1-norm regularization. Lasso regression is an extension …
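To show where the penalty enters, here is a minimal sketch of a Lasso-style cost function: the usual squared-error term plus the regularization parameter `alpha` times the L1 norm of the weights. The scaling of the error term by 1/(2n) and the toy data are assumptions made for illustration.

```python
def lasso_cost(X, y, weights, bias, alpha):
    """Squared-error loss plus an L1 penalty on the weights (bias not penalized)."""
    n = len(y)
    squared_errors = 0.0
    for x, target in zip(X, y):
        prediction = sum(w * xi for w, xi in zip(weights, x)) + bias
        squared_errors += (target - prediction) ** 2
    l1_penalty = alpha * sum(abs(w) for w in weights)
    return squared_errors / (2 * n) + l1_penalty

# Made-up data where only the first feature matters.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 0.0]
sparse_cost = lasso_cost(X, y, weights=[1.0, 0.0], bias=0.0, alpha=0.1)
dense_cost = lasso_cost(X, y, weights=[1.0, 0.5], bias=0.0, alpha=0.1)
```

The weight vector with an unnecessary non-zero entry pays both a larger error and a larger penalty, which is why minimizing this cost tends to push unneeded weights to exactly zero (the "selection" in the name).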

## Python – Extract Text from HTML using BeautifulSoup

In this post, you will learn how to use Python BeautifulSoup and NLTK to extract words from HTML pages and perform text analysis such as frequency distribution. The example in this post is based on reading HTML pages directly from the website and performing text analysis. However, you could also download the web pages and then perform text analysis by loading the pages from local storage. Python Code for Extracting Text from HTML Pages: Here is the Python code for extracting text from HTML pages and performing text analysis. Pay attention to some of the following in the code given below: URLLib request is used to read the HTML page …
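The overall flow (strip the markup, split into words, count frequencies) can be sketched with the standard library alone. Note that this is a stand-in that uses `html.parser` and `collections.Counter` instead of BeautifulSoup and NLTK, and it parses a made-up inline HTML string rather than fetching a real page.

```python
import re
from collections import Counter
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)

def word_frequencies(html):
    """Return a Counter of lower-cased words found in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    return Counter(re.findall(r"[a-z']+", text.lower()))

# Made-up page used in place of a downloaded one.
sample_html = (
    "<html><head><style>p {color: red;}</style></head>"
    "<body><h1>Data science</h1><p>Data is everywhere.</p></body></html>"
)
frequencies = word_frequencies(sample_html)
```

Calling `frequencies.most_common(3)` then gives the most frequent words, which is the same kind of summary a frequency distribution provides.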

## Top 10 Data Science Skills for Product Managers

In this post, you will learn about some of the top data science skills / concepts which product managers / business analysts may need in order to create useful machine learning based solutions. Here are some of the topics / concepts which product managers / business analysts need to understand well in order to tackle day-to-day challenges while working with data science / machine learning teams. Knowing these concepts will help product managers / business analysts acquire enough skills to solve machine learning based problems. Understanding the difference between AI, machine learning, data science and deep learning; Which problems are machine learning problems? …

## 8 Key AI Challenges for Telemedicine / Telehealth

In this post, you will learn about some of the key challenges of implementing Telemedicine / Telehealth. If you work in the field of data science / machine learning, you may want to go through some of the challenges, primarily AI-related, which have arisen in the Telemedicine domain due to the upsurge in the need for reliable Telemedicine services. Here are the slides I recently presented at the Digital Data Science Conclave hosted by KIIT University. The primary focus is to make sure appropriate controls are in place to make responsible use of AI (Responsible AI). Here are the top 8 challenges which need to be addressed to take full advantage of AI, RPA …

## RANSAC Regression Explained with Python Examples

In this post, you will learn about the concepts of the RANSAC regression algorithm along with a Python Sklearn example of RANSAC regression using RANSACRegressor. The RANSAC regression algorithm is useful for handling datasets that contain outliers. Instead of dealing with outliers using statistical and other techniques, one can use the RANSAC regression algorithm, which takes care of the outlier data. In this post, the following topics are covered: an introduction to RANSAC regression and a RANSAC regression Python code example. Introduction to RANSAC Regression: The RANSAC (RANdom SAmple Consensus) algorithm takes the linear regression algorithm to the next level by excluding the outliers in the training dataset. The presence of outliers in the training dataset does impact …
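The core idea can be sketched in plain Python. This is a simplified illustration for a single feature, not Sklearn's RANSACRegressor: it repeatedly fits a line through two random points, keeps the largest set of points that agree with that line, and refits on those inliers only. The threshold, trial count and data are made-up assumptions.

```python
import random

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ransac_line(points, trials=100, threshold=0.5, seed=0):
    """Return ((slope, intercept), inliers) for the largest consensus set found."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(trials):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # a vertical line cannot be written as y = slope * x + intercept
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # The final model is fitted on the consensus set only, so outliers are excluded.
    return fit_line(best_inliers), best_inliers

# Made-up data: ten points on y = 2x + 1 plus three gross outliers.
inlier_points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outlier_points = [(2.5, 30.0), (5.5, -20.0), (8.5, 50.0)]
(slope, intercept), inliers = ransac_line(inlier_points + outlier_points)
```

An ordinary least-squares fit on all thirteen points would be pulled towards the outliers, whereas the consensus step here recovers the underlying line.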

## Mean Squared Error or R-Squared – Which one to use?

In this post, you will learn about the concepts of mean squared error (MSE) and R-squared, the difference between them, and which one to use when working with regression models such as a linear regression model. You will also see Python examples that help in understanding the concepts better. In this post, the following topics are covered: an introduction to mean squared error (MSE) and R-squared; the difference between MSE and R-squared; MSE or R-squared, and which one to use; and an MSE and R-squared Python code example. Introduction to Mean Squared Error (MSE) and R-Squared: In this section, you will learn about the concepts of mean squared error and R-squared. These are used for evaluating the …
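Both metrics are short enough to write out by hand. The sketch below uses the standard definitions (MSE as the average squared residual, R-squared as one minus the ratio of residual to total sum of squares) on made-up values, and also rescales the data to show one practical difference between the two.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up actual and predicted values.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)

# Same data expressed in units ten times smaller (e.g. cm instead of dm).
y_true_scaled = [10.0 * v for v in y_true]
y_pred_scaled = [10.0 * v for v in y_pred]
mse_scaled = mean_squared_error(y_true_scaled, y_pred_scaled)
r2_scaled = r_squared(y_true_scaled, y_pred_scaled)
```

Rescaling the target multiplies the MSE by the square of the scale factor but leaves R-squared unchanged, so MSE depends on the units of the response variable while R-squared does not.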

## Linear Regression Explained with Python Examples

In this post, you will learn about the concepts of linear regression along with Python Sklearn examples for training linear regression models. Linear regression belongs to the class of parametric models and is used to train supervised models. The following topics are covered in this post: an introduction to linear regression, linear regression concepts / terminologies, and a linear regression Python code example. Introduction to Linear Regression: Linear regression is a machine learning algorithm used to predict the value of a continuous response variable. The predictive analytics problems that are solved using linear regression models are called supervised learning problems, as they require that the values of the response / target variable be present and used for training the models. …
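For a single input feature, training reduces to two closed-form parameters. The sketch below fits them with ordinary least squares and predicts a new value; the experience and salary figures are hypothetical training data with known response values, and the function name is illustrative.

```python
def fit_simple_linear_regression(x_values, y_values):
    """Return (slope, intercept) minimizing the squared error of y = slope*x + intercept."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    variance = sum((x - mean_x) ** 2 for x in x_values)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: years of experience vs. salary (in thousands).
experience = [1.0, 2.0, 3.0, 4.0]
salary = [35.0, 40.0, 45.0, 50.0]  # response / target variable, known at training time
slope, intercept = fit_simple_linear_regression(experience, salary)

# Predict the continuous response for an unseen input.
predicted_salary = intercept + slope * 5.0
```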

## Correlation Concepts, Matrix & Heatmap using Seaborn

In this post, you will learn about the concepts of correlation and how to draw a correlation heatmap using the Python Seaborn library for different columns in a Pandas dataframe. The following are some of the topics covered in this post: an introduction to correlation, what a correlation heatmap is, and a correlation heatmap Pandas / Seaborn Python example. Introduction to Correlation: Correlation is a term used to represent the statistical measure of the linear relationship between two variables. It can also be defined as the measure of dependence between two different variables. If there are multiple variables and the goal is to find correlation between all of these variables and store them using appropriate data structure, the …
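As a minimal sketch of what sits behind such a heatmap, the snippet below computes the Pearson correlation coefficient for every pair of columns and stores the results in a nested dictionary. It uses plain Python rather than Pandas or Seaborn, and the column names and values are made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return {name_a: {name_b: correlation}} for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up columns: weight rises with height, speed falls with it.
columns = {
    "height": [1.0, 2.0, 3.0, 4.0],
    "weight": [2.0, 4.0, 6.0, 8.0],
    "speed": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(columns)
```

A heatmap simply colours each cell of this matrix by its value, from -1 (perfect negative linear relationship) to 1 (perfect positive linear relationship).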

## Beta Distribution Explained with Python Examples

In this post, you will learn about the Beta probability distribution with the help of Python examples. As a data scientist, it is very important to understand the beta distribution, as it is very commonly used as a prior in Bayesian modeling. In this post, the following topics are covered: beta distribution intuition and examples, an introduction to the beta distribution, and beta distribution Python examples. Beta Distribution Intuition & Examples: The beta distribution is widely used to model prior beliefs or probability distributions in real-world applications. Here is a great article on understanding the beta distribution with a baseball example. You may want to pay attention to the fact that even if the baseball …
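A minimal sketch of using a beta distribution as a prior is shown below. It relies on the standard result that a Beta(α, β) prior over a success probability, updated with observed successes and failures, gives a Beta(α + successes, β + failures) posterior with mean α / (α + β). The prior parameters and the hit counts are hypothetical numbers chosen for a batting-average style illustration.

```python
def update_beta(alpha, beta, successes, failures):
    """Posterior parameters after observing new successes and failures."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Expected value of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Hypothetical prior belief: batting average around 0.27.
prior = (81.0, 219.0)
prior_mean = beta_mean(*prior)

# After a single hit the estimate barely moves ...
after_one_hit = update_beta(*prior, successes=1, failures=0)
mean_after_one_hit = beta_mean(*after_one_hit)

# ... while 100 hits in 300 at-bats shifts it noticeably.
after_season = update_beta(*prior, successes=100, failures=200)
mean_after_season = beta_mean(*after_season)
```

The prior acts like previously seen data, so a small amount of new evidence changes the estimate only slightly.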

## Bernoulli Distribution Explained with Python Examples

In this post, you will learn about the concepts of the Bernoulli distribution along with real-world examples and Python code samples. As a data scientist, it is very important to understand the statistical concepts around different probability distributions in order to understand the distribution of data better. In this post, the following topics are covered: an introduction to the Bernoulli distribution, Bernoulli distribution real-world examples, and Bernoulli distribution Python code examples. Introduction to Bernoulli Distribution: The Bernoulli distribution is a discrete probability distribution representing the probabilities of a random variable which can take only one of two possible values, such as 1 or 0, yes or no, true or false etc. The probability of …
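As a minimal sketch, the snippet below writes out the probability mass function of a Bernoulli variable with success probability `p` and simulates repeated trials with the standard library. The value of `p`, the sample size and the seed are arbitrary assumptions.

```python
import random

def bernoulli_pmf(k, p):
    """P(X = k) for a Bernoulli variable: p when k is 1, 1 - p when k is 0."""
    if k not in (0, 1):
        return 0.0
    return p if k == 1 else 1.0 - p

def sample_bernoulli(p, n, seed=0):
    """Draw n independent 0/1 outcomes, each equal to 1 with probability p."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

p = 0.3
samples = sample_bernoulli(p, 10_000)
empirical_p = sum(samples) / len(samples)  # should be close to p
```

With enough trials the fraction of ones approaches `p`, which is also the mean of the distribution; its variance is `p * (1 - p)`.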