# Category Archives: Data Science

## Support Vector Machine (SVM) Python Example

In this post, you will learn about the concepts of the Support Vector Machine (SVM) with the help of a Python code example for building a machine learning classification model. We will work with Python's scikit-learn (sklearn) package to build the model. As data scientists, it is important to get a good grasp of the SVM algorithm and related aspects. What is Support Vector Machine (SVM)? A support vector machine (SVM) is a supervised machine learning algorithm that can be used for both classification and regression tasks. SVM for classification is often termed support vector classification (SVC), and SVM for regression is termed support vector regression (SVR). In this post, we will learn about SVM …
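As a rough illustration of the kind of model the excerpt describes, here is a minimal sketch of a support vector classifier built with scikit-learn. The choice of the Iris dataset, the feature scaling step, the RBF kernel, and the `C` and `gamma` values are all assumptions made for this example; they are not taken from the post itself.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load a small, well-known classification dataset (an assumption for this sketch).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows for evaluation, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# SVMs are sensitive to feature scale, so standardize before fitting the SVC.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

For a regression task, `sklearn.svm.SVR` can be substituted for `SVC` in the same pipeline.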

## Overfitting & Underfitting in Machine Learning

The performance of machine learning models depends on two key concepts: underfitting and overfitting. In this post, you will learn about some of the key concepts of overfitting and underfitting in relation to machine learning models. In addition, you will get a chance to test your understanding by attempting the quiz. The quiz will help you prepare for interview questions on underfitting and overfitting. As data scientists, you must have a good understanding of both concepts. Introduction to Overfitting & Underfitting Assuming an independent and identically distributed (i.i.d.) dataset, when the prediction error on both the training and validation dataset is …

## Spend Analytics Use Cases: AI & Data Science

In this post, you will learn about the high-level concepts of spend analytics in relation to procurement and how data science, machine learning, and AI can be used to extract actionable insights as part of spend analytics. This will be useful for procurement professionals such as category managers, sourcing managers, and procurement analytics stakeholders looking to understand the concepts of spend analytics and how they can drive decisions based on it. What is Spend Analytics? Simply speaking, spend analytics is about performing systematic computational analysis to extract actionable insights from spend and savings data across different spend categories in order to achieve desired business outcomes such as cost savings, …

## Logistic Regression Explained with Python Example

In this blog post, we will discuss the logistic regression machine learning algorithm with a Python example. Logistic regression is a type of regression algorithm that is used to predict the probability of occurrence of an event. It is often used in machine learning applications. In this tutorial, we will use Python to implement logistic regression for binary classification problems. What is Logistic Regression? Logistic regression is a machine learning algorithm used for classification problems. That is, it can be used to predict whether an instance belongs to one class or the other. For example, it could be used to predict whether a person is male or female, based on …
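To make the idea of "predicting the probability of an event" concrete, the sketch below shows how a logistic regression model turns a weighted sum of features into a probability and then into a class label. The weights, bias, sample values, and the 0.5 threshold are hypothetical numbers chosen for illustration; in practice the weights would be learned from training data.

```python
import math


def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one instance."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict_class(features, weights, bias, threshold=0.5):
    """Assign class 1 when the probability reaches the threshold, else class 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical, hand-picked parameters (not learned from data).
weights = [0.8, -0.4]
bias = -0.2
sample = [2.0, 1.0]

probability = predict_proba(sample, weights, bias)
label = predict_class(sample, weights, bias)
print(f"P(class = 1) = {probability:.3f}, predicted class = {label}")
```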

## Perceptron Explained using Python Example

In this post, you will learn about the concepts of the Perceptron with the help of a Python example. It is very important for data scientists to understand the concepts related to the Perceptron, as a good understanding lays the foundation for learning advanced neural network concepts, including deep neural networks (deep learning). What is Perceptron? The Perceptron is a machine learning algorithm that mimics how a neuron in the brain works. It is also called a single-layer neural network consisting of a single neuron. The output of this neural network is decided based on the outcome of just one activation function associated with the single neuron. In a perceptron, the forward propagation of information happens. Deep …
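The single-neuron idea described above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a unit step activation, the classic perceptron update rule, and the logical AND function as toy training data; the function names and hyperparameters are chosen for this example only.

```python
def step(z):
    """Unit step activation: fire (1) only when the weighted sum is positive."""
    return 1 if z > 0 else 0


def predict(x, weights, bias):
    """Forward propagation through the single neuron."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return step(z)


def train_perceptron(samples, targets, learning_rate=1, epochs=20):
    """Perceptron learning rule: nudge the weights by the prediction error."""
    weights = [0] * len(samples[0])
    bias = 0
    for _ in range(epochs):
        for x, target in zip(samples, targets):
            error = target - predict(x, weights, bias)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Toy, linearly separable data: the logical AND of two binary inputs.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, targets)
predictions = [predict(x, weights, bias) for x in samples]
print(predictions)  # [0, 0, 0, 1]
```

Because AND is linearly separable, the update rule settles on a separating line within a few epochs; a single perceptron cannot do the same for XOR.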

## Classification Problems Real-life Examples

In this post, you will learn about some of the most common real-life examples of machine learning classification problems. For beginner data scientists, these examples will help in gaining perspective on real-world problems that can be framed as machine learning classification problems. This post will be updated from time to time to include interesting real-life examples which can be solved by training machine learning classification models. Before going ahead and looking into examples, let’s understand a little about what a machine learning (ML) classification problem is. You may as well skip this section if you are familiar with the definition of machine learning classification problems & solutions. You may want …

## Linear vs Non-linear Data: How to Know

In this post, you will learn techniques for determining whether a given data set is linear or non-linear. Depending on the type of machine learning problem (such as classification or regression) you are trying to solve, you can apply different techniques to make this determination. For a data scientist, it is very important to know whether the data is linear or not, as it helps in choosing appropriate algorithms to train a high-performance model. You will learn techniques such as the following for determining whether the data is linear or non-linear: Use scatter plot when dealing with classification problems Use …

## Python – Creating Scatter Plot with IRIS Dataset

In this blog post, we will learn how to create a scatter plot with the IRIS dataset using Python. The IRIS dataset is a collection of data that is used to demonstrate the properties of various statistical models. It contains 150 observations (50 for each of three iris species) on four different variables: Petal Length, Petal Width, Sepal Length, and Sepal Width. As data scientists, it is important for us to be able to visualize the data that we are working with. Scatter plots are a great way to do this because they show the relationship between two variables. In this post, we have plotted and explored how Petal Length and Sepal Length …

## Neural Network Explained with Perceptron Example

Neural networks are an important part of machine learning, so it is essential to understand how they work. A neural network is a computer system modeled on a biological neural network comprising neurons connected with each other. It can be built to solve machine learning tasks, like classification and regression problems. The perceptron algorithm is a representation of how neural networks work. The perceptron was first proposed by Frank Rosenblatt in 1957 as a model for the human brain’s perception mechanism. This post will explain the basics of neural networks with a perceptron example. You will understand how a neural network is built using perceptrons. This …

## Chi-square test – Types, Concepts, Examples

The Chi-square (χ²) test is a statistical test used to determine whether the distribution of observed data is consistent with the distribution of data expected under a particular hypothesis. The Chi-square test can be used to compare two distributions, or to assess the goodness of fit of a given distribution to observed data. In this blog post, we will discuss the types of Chi-square tests, the concepts behind them, and how to perform them using Python / R. As data scientists, it is important to have a strong understanding of the Chi-square test so that we can use it to make informed decisions about our data. We will also provide …
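As a small worked example of the goodness-of-fit idea mentioned above, the sketch below computes the Chi-square statistic, the sum of (observed − expected)² / expected over all categories, in plain Python. The observed and expected counts are made-up numbers for illustration, and the critical value 7.815 is the standard table value for 3 degrees of freedom at a 0.05 significance level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected across all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 80 observations spread over 4 categories,
# compared against a uniform expectation of 20 per category.
observed = [18, 22, 20, 20]
expected = [20, 20, 20, 20]

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1

# Standard table value for df = 3 at alpha = 0.05.
CRITICAL_VALUE_DF3_ALPHA_005 = 7.815
reject_null = statistic > CRITICAL_VALUE_DF3_ALPHA_005

print(f"chi-square = {statistic:.2f}, df = {degrees_of_freedom}, reject H0: {reject_null}")
```

Here the statistic falls far below the critical value, so the observed counts are consistent with the uniform expectation. If SciPy is available, `scipy.stats.chisquare` returns the same statistic together with a p-value.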

## Hypothesis Testing Steps & Real Life Examples

Hypothesis testing is a technique that helps scientists, researchers, or for that matter, anyone test the validity of their claims or hypotheses about real-world events. Hypothesis testing techniques are often used in statistics and data science to analyze whether claims about the occurrence of events are true, and whether the results returned by the performance metrics of machine learning models are representative of the models or occurred by chance. This blog post will cover some of the key statistical concepts, including steps and examples, in relation to what hypothesis testing is and how to formulate hypotheses. The knowledge of hypothesis formulation and hypothesis testing holds the key …

## Insurance Machine Learning Use Cases

As insurance companies face increasing competition and ever-changing customer demands, they are turning to machine learning for help. Machine learning / AI can be used in a variety of ways to improve insurance operations, from developing new products and services to improving customer experience. It would be helpful for product managers and data science architects to get a good understanding of some of the use cases which can be addressed or automated using machine learning / AI based solutions. In this blog post, we will explore some of the most common insurance machine learning / AI use cases. Stay tuned for future posts that will dive into each of these …

## Invoice Processing Machine Learning Use Cases

Invoice processing is a critical part of any business. It’s the process of creating, managing, and paying invoices. Without invoice processing, businesses would have a difficult time keeping track of their finances. There are many different invoice processing use cases. For example, businesses can use invoice processing to keep track of customer payments, manage vendor contracts, and streamline their accounting processes. Invoice processing can also be used to detect fraud and prevent errors. Machine learning / AI can be used to improve invoice processing in a number of ways. As a product manager, it will be helpful to understand these use cases and how machine learning can be used to …

## Tail Spend Management & Spend Analytics

Do you know where your business is spending its money? And more importantly, do you know where your business SHOULD be spending its money? Many businesses don’t have a good handle on their tail spend – the money that’s spent on things that are not essential to the core operations of the company. Tail spend can be difficult to track and manage, but with the help of spend analytics tools and machine learning, it’s becoming easier than ever before. In this blog post, we’ll discuss what tail spend is, how to track it, and how to use analytics and machine learning to make better decisions about where to allocate your …

## Procurement Advanced Analytics Use Cases

Procurement analytics applications are poised to grow exponentially in the next few years. With so much data available and the need for digital transformation across procurement organizations, it’s important to know how procurement analytics can help you make better business decisions. This blog will cover procurement analytics and key use cases of advanced analytics that will be useful for business stakeholders such as category managers, sourcing managers, supplier relationship managers, business analysts / product managers, and data scientists who implement different use cases using machine learning. Procurement analytics will allow you to use data very effectively in achieving data-driven decision making. One can get started with procurement analytics with focus …

## When to Use Z-test vs T-test: Differences, Examples

When it comes to statistical tests, z-test and t-test are two of the most commonly used. But what is the difference between z-test and t-test? And when should you use Z-test vs T-test? In this blog post, we will answer all these questions and more! We will start by explaining the difference between z-test and t-test in terms of their formulas. Then we will go over some examples so that you can see how each test is used in practice. As data scientists, it is important to understand the difference between z-test and t-test so that you can choose the right test for your data. Let’s get started! Difference between …
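Since the excerpt frames the difference between the two tests in terms of their formulas, here is a minimal sketch of the standard one-sample forms: z = (x̄ − μ₀) / (σ / √n), which uses a known population standard deviation σ, and t = (x̄ − μ₀) / (s / √n), which uses the sample standard deviation s. The sample values, the hypothesized mean, and the assumed population standard deviation are made up for illustration, and the post itself may present the formulas differently.

```python
import math
import statistics


def z_statistic(sample, population_mean, population_std):
    """One-sample z statistic: requires a known population standard deviation."""
    n = len(sample)
    standard_error = population_std / math.sqrt(n)
    return (statistics.mean(sample) - population_mean) / standard_error


def t_statistic(sample, population_mean):
    """One-sample t statistic: estimates the standard deviation from the sample."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - population_mean) / standard_error


# Hypothetical measurements and hypothesized mean.
sample = [52, 48, 51, 49, 50, 53, 47, 50]
hypothesized_mean = 49

# Assumed to be known in advance for the z-test (an assumption for this sketch).
known_population_std = 3.0

z = z_statistic(sample, hypothesized_mean, known_population_std)
t = t_statistic(sample, hypothesized_mean)
print(f"z = {z:.3f}, t = {t:.3f} (df = {len(sample) - 1})")
```

The numerators are identical; only the source of the standard deviation changes, which is why the t statistic is compared against a t distribution with n − 1 degrees of freedom rather than the standard normal.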