Tag Archives: Data Science

Spend Analytics Use Cases: AI & Data Science

What is spend analytics

In this post, you will learn the high-level concepts of spend analytics in procurement and how data science, machine learning, and AI can be used to extract actionable insights from spend data. This will be useful for procurement professionals such as category managers, sourcing managers, and procurement analytics stakeholders who want to understand spend analytics and how it can drive decisions. What is Spend Analytics? Simply put, spend analytics is the systematic computational analysis of spend and savings data to extract actionable insights and achieve desired business outcomes such as cost savings, cost avoidance, spend forecasting, spend …

Continue reading

Posted in Data Science, Machine Learning, Procurement.

Softmax Regression Explained with Python Example

In this post, you will learn what the Softmax function and Softmax regression are, with Python code examples, and why we need them. As data scientists and machine learning enthusiasts, it is very important to understand Softmax regression, as it helps in better understanding algorithms such as neural networks and multinomial logistic regression. Note that the Softmax function is used in various multiclass classification algorithms, such as multinomial logistic regression (hence also called softmax regression) and neural networks. Before getting into softmax regression, let’s understand what the softmax function is. What’s Softmax function? Simply speaking, the Softmax function converts raw …

Continue reading

Posted in Data Science, Machine Learning.
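
As a rough illustration of the Softmax function mentioned in the excerpt above, here is a minimal sketch in plain Python. It uses the standard textbook definition (exponentiate each raw score, then divide by the sum of the exponentials); it is not the code from the full post, and the example scores are made up for illustration.

```python
import math

def softmax(scores):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtracting the largest score avoids overflow in exp() and
    # leaves the resulting probabilities unchanged.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three classes (illustrative values only).
raw_scores = [2.0, 1.0, 0.1]
probs = softmax(raw_scores)
print(probs)
```

The largest raw score receives the largest probability, and the outputs always sum to 1, which is what makes them usable as class probabilities in multiclass classification.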

Cross Entropy Loss Explained with Python Examples

In this post, you will learn the concepts behind the cross-entropy loss function, with Python code examples, and which machine learning algorithms use cross-entropy loss as the objective function for training models. Cross-entropy loss is used as a loss function for models that output a probability distribution. Logistic regression is one such algorithm. You may want to check out how cross-entropy loss relates to information theory and entropy in this post – Information theory & machine learning: Concepts. What’s Cross-Entropy Loss? The cross-entropy loss function is an optimization function that is …

Continue reading

Posted in Data Science, Machine Learning.
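
To make the idea of cross-entropy loss above a little more concrete, here is a small sketch using the standard definition H(p, q) = -Σ p_i log(q_i) with the natural logarithm. The function name, the `eps` guard against log(0), and the example distributions are assumptions chosen for illustration, not taken from the full post.

```python
import math

def cross_entropy(true_dist, pred_dist, eps=1e-12):
    """Cross-entropy between a true distribution and a predicted one."""
    # eps keeps log() finite if a predicted probability is exactly 0.
    return -sum(p * math.log(max(q, eps)) for p, q in zip(true_dist, pred_dist))

# One-hot true label: the second class is the correct one.
y_true = [0.0, 1.0, 0.0]

# Two hypothetical predictions: one confident and correct, one unsure.
confident = [0.05, 0.90, 0.05]
unsure = [0.30, 0.40, 0.30]

loss_confident = cross_entropy(y_true, confident)
loss_unsure = cross_entropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The confident, correct prediction yields the smaller loss, which is the behaviour that makes cross-entropy useful as a training objective for models that output probabilities.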

Classification Problems Real-life Examples

In this post, you will learn about some of the most common real-life examples of machine learning classification problems. For beginner data scientists, these examples will help build perspective on real-world problems that can be framed as machine learning classification problems. This post will be updated from time to time to include interesting real-life examples that can be solved by training machine learning classification models. Before looking at the examples, let’s briefly understand what a machine learning (ML) classification problem is. You may skip this section if you are already familiar with the definition of machine learning classification problems & solutions. You may want …

Continue reading

Posted in Data Science, Machine Learning.

Linear Regression Explained with Python Examples

SSR, SSE and SST Representation in relation to Linear Regression

In this post, you will learn the concepts of linear regression along with Python Sklearn examples for training linear regression models. Linear regression belongs to the class of parametric models and is used to train supervised models. The following topics are covered in this post: introduction to linear regression; linear regression concepts and terminology; linear regression Python code example. Introduction to Linear Regression. Linear regression is a machine learning algorithm used to predict the value of continuous response variables. The predictive analytics problems solved using linear regression models are called supervised learning problems because they require the values of the response/target variable to be present and used for training the models. Also, recall that …

Continue reading

Posted in Data Science, Machine Learning, Python.
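
The full post uses Sklearn, but the core idea of fitting a line to predict a continuous response can be sketched in plain Python. The example below swaps in the closed-form least-squares solution for a single feature; the function name and the toy data are assumptions made for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares slope and intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data generated from y = 2x + 1 with no noise (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_simple_linear_regression(xs, ys)
print(slope, intercept)
```

Because the toy data lies exactly on a line, the fit recovers the slope and intercept used to generate it. With real, noisy data the same formulas give the line that minimizes the sum of squared residuals.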

Mean Squared Error or R-Squared – Which one to use?

Mean Squared Error Representation

In this post, you will learn about the concepts of mean squared error (MSE) and R-squared, the difference between them, and which one to use when evaluating linear regression models. You will also see Python examples to better understand the concepts. What is Mean Squared Error (MSE)? The mean squared error (MSE) represents the error of the estimator or predictive model created based on the given set of observations in the sample. Intuitively, the MSE is used to measure the quality of the model based on the predictions made on the entire training dataset vis-a-vis the true label/output value. In other words, it can be used to …

Continue reading

Posted in Data Science, Machine Learning, Python.
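
One way to see the difference between the two metrics discussed above is to compute both on the same predictions. The sketch below uses the standard definitions (MSE as the average squared error, R-squared as 1 minus the ratio of the residual sum of squares to the total sum of squares); the helper names and the numbers are assumptions for illustration only.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between truth and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical true values and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]

mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)

# Same data expressed in units that are 10 times smaller.
y_true_scaled = [10 * t for t in y_true]
y_pred_scaled = [10 * p for p in y_pred]
mse_scaled = mean_squared_error(y_true_scaled, y_pred_scaled)
r2_scaled = r_squared(y_true_scaled, y_pred_scaled)

print(mse, r2)
print(mse_scaled, r2_scaled)
```

Rescaling the target multiplies the MSE by the square of the scale factor but leaves R-squared unchanged, which is one practical reason the two metrics answer different questions.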

Linear Regression Explained with Real Life Example

Multiple linear regression example

In this post, the linear regression concept in machine learning is explained with multiple real-life examples. Both types of regression models (simple/univariate and multiple/multivariate linear regression) are covered, with examples cited for each. If you are a machine learning or data science beginner, you may find this post helpful. You may also want to check a detailed post on what machine learning is – What is Machine Learning? Concepts & Examples. What is Linear Regression? Linear regression is a machine learning concept that is used to build or train models (mathematical models or equations) for solving supervised learning problems related to predicting continuous numerical values. Supervised learning problems …

Continue reading

Posted in AI, Data Science, Machine Learning.

Regularization in Machine Learning: Concepts & Examples

In machine learning, regularization is a technique used to avoid overfitting. Overfitting occurs when a model learns the training data too well and therefore performs poorly on new data. Regularization helps to reduce overfitting by adding constraints to the model-building process. As data scientists, it is of utmost importance that we learn the regularization concepts thoroughly in order to build better machine learning models. In this blog post, we will discuss the concept of regularization and provide examples of how it can be used in practice. What is regularization and how does it work? Regularization in machine learning represents strategies that are used to reduce the generalization or test error of …

Continue reading

Posted in Data Science, Deep Learning, Machine Learning.

Difference: Binary, Multiclass & Multi-label Classification

Multilayer classifier to tag image with cat, dog, rooster and a donkey

There are three main types of classification algorithms when dealing with machine learning classification problems: Binary, Multiclass, and Multilabel. In this blog post, we will discuss the differences between them and how they can be used to solve different problems. Binary classifiers can only classify data into two categories, while multiclass classifiers can classify data into more than two categories. Multilabel classifiers assign or tag the data to zero or more categories. Let’s take a closer look at each type! Binary classification & examples Binary classification is a type of supervised machine learning problem that requires classifying data into two mutually exclusive groups or categories. The two groups can be …

Continue reading

Posted in Data Science, Deep Learning, Machine Learning.

What is Machine Learning? Concepts & Examples

Machine learning is a machine’s ability to learn from data. It has been around for decades, but it is now being applied in nearly every industry and job function. In this blog post, we’ll give a detailed introduction to what machine learning is, including different definitions. We will also learn about different types of machine learning tasks, algorithms, etc., along with real-world examples. What is machine learning & how does it work? Simply speaking, machine learning can be used to model our beliefs about real-world events. For example, let’s say a person came to a doctor with a certain blood report. A doctor based on his belief system learned …

Continue reading

Posted in Data Science, Deep Learning, Machine Learning.

Statistics – Random Variables, Types & Python Examples

Probability distribution plot of a discrete random variable

Random variables are one of the most important concepts in statistics. In this blog post, we will discuss what they are, their different types, and how they are related to the probability distribution. We will also provide examples so that you can better understand this concept. As a data scientist, it is of utmost importance that you have a strong understanding of random variables and how to work with them. What is a random variable and what are some examples? A random variable is a variable that can take on random values. The key difference between a variable and a random variable is that the value of the random variable …

Continue reading

Posted in Data Science, Python, statistics.

Frequentist vs Bayesian Probability: Difference, Examples

difference between bayesian and frequentist probability

In this post, you will learn about the difference between Frequentist and Bayesian probability. It is of utmost importance to understand these concepts if you are getting started with Data Science. What is Frequentist Probability? Probability is used to represent and reason about uncertainty. It was originally developed to analyze the frequency of events. In other words, probability was first developed as frequentist probability. The probability of occurrence of an event, when calculated as a function of how frequently events of that type occur, is called Frequentist Probability. Frequentist probability is a way of assigning probabilities to events that takes into account how often those events actually occur. Frequentist …

Continue reading

Posted in Data Science.

Why & When to use Eigenvalues & Eigenvectors?

Eigenvector and Eigenvalues explained with example

In this post, you will learn why and when you need to use Eigenvalues and Eigenvectors. As a data scientist or machine learning engineer, you need a good understanding of Eigenvalues and Eigenvectors, as these concepts are used in one of the most popular dimensionality reduction techniques – Principal Component Analysis (PCA). In PCA, these concepts help in reducing the dimensionality of the data (addressing the curse of dimensionality), resulting in a simpler model that is computationally efficient and provides greater generalization accuracy. In this post, the following topics will be covered: Background – Why need Eigenvalues & Eigenvectors? What are Eigenvalues & Eigenvectors? When to use Eigenvalues …

Continue reading

Posted in Data Science, Machine Learning.

What are Features in Machine Learning?

Features - Key to Machine Learning

Machine learning is a field of machine intelligence concerned with the design and development of algorithms and models that allow computers to learn without being explicitly programmed. Machine learning has many applications, including regression, classification, clustering, natural language processing, audio and video processing, computer vision, etc. Machine learning requires training one or more models using different algorithms. Check out this detailed post on machine learning concepts – What is Machine Learning? Concepts & Examples. One of the most important aspects of building a machine learning model is identifying the features that will help create a great model, that is, a model that performs well on unseen data. …

Continue reading

Posted in Data Science, Machine Learning.

SVM Classifier using Sklearn: Code Examples

support vector machine classifier

In this post, you will learn how to train an SVM classifier using the scikit-learn (sklearn) implementation, with the help of code examples. An SVM classifier, or support vector machine classifier, is a type of machine learning algorithm that can be used to analyze and classify data. A support vector machine is a supervised machine learning algorithm that can be used for both classification and regression tasks. The support vector machine classifier works by finding the hyperplane that maximizes the margin between the two classes, which is why it is also known as a max-margin classifier. Support vector machine is a powerful tool for machine learning and has been widely used …

Continue reading

Posted in Data Science, Machine Learning, Python.

Two sample Z-test for Proportions: Formula & Examples

two proportion z-test formula and examples

In statistics, a two-sample z-test for proportions is a method used to determine whether the proportions in two populations differ, based on a sample drawn from each. This test is used when the population proportions are unknown and the samples are large enough for the normal approximation to apply. The test statistic is compared against the standard normal distribution. As data scientists, it is important to know how to conduct this test in order to determine whether two proportions are equal. In this blog post, we will discuss the formula and examples of the two-proportion Z-test. What is two proportion Z-test? A two-proportion Z-test is a statistical hypothesis test used to determine whether two …

Continue reading

Posted in Data Science, statistics.
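
The excerpt above is cut off before the formula, so the following is only a sketch of the commonly used pooled form of the two-proportion Z-test, written with the Python standard library. The function name and the sample counts are assumptions for illustration; the full post may present the test differently.

```python
import math

def two_proportion_z_test(successes_1, n_1, successes_2, n_2):
    """Pooled two-proportion z statistic and two-sided p-value."""
    p1 = successes_1 / n_1
    p2 = successes_2 / n_2
    # Pooled proportion under the null hypothesis that p1 == p2.
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / std_error
    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 60 of 100 successes versus 45 of 100.
z, p_value = two_proportion_z_test(60, 100, 45, 100)
print(z, p_value)

# Identical sample proportions give z = 0 and a p-value of 1.
z_same, p_same = two_proportion_z_test(50, 100, 50, 100)
print(z_same, p_same)
```

A small p-value suggests the two proportions differ; whether it is small enough depends on the significance level chosen before running the test.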