Category Archives: Data Science

PCA Explained Variance Concepts with Python Example

In this post, you will learn about explained variance, one of the key concepts in principal component analysis (PCA). The concept is illustrated with Python code examples. For background on eigenvalues and eigenvectors, see this post – Why & when to use Eigenvalue and Eigenvectors. What is Explained Variance? Explained variance is a statistical measure of how much of the variation in a dataset can be attributed to each of the principal components (eigenvectors) generated by PCA. In very basic terms, it refers to the amount of variability in a data set that can be attributed to …
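As a rough illustration of the idea in this excerpt, the sketch below computes the explained variance ratio from the eigenvalues of a covariance matrix using NumPy only. The toy data, the random seed, and all variable names are assumptions made for this example and are not taken from the post.

```python
import numpy as np

# Toy data (an assumption for illustration): 200 samples, 3 features
# whose spreads differ so that the components carry unequal variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.2])

# Center the data and compute the covariance matrix of the features.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# Eigenvalues of the covariance matrix are the variances along the
# principal components; eigvalsh returns them in ascending order.
eigenvalues = np.linalg.eigvalsh(cov)[::-1]

# Explained variance ratio: each eigenvalue as a share of the total.
explained_variance_ratio = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained_variance_ratio)

print("Explained variance ratio:", explained_variance_ratio)
print("Cumulative explained variance:", cumulative)
```

The cumulative values are what one would typically inspect to decide how many components to keep for a chosen variance threshold.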

Continue reading

Posted in Data Science, Machine Learning, Python.

One-hot Encoding Concepts & Python Examples

In this post, you will learn about one-hot encoding concepts with code examples in Python. One-hot encoding is often loosely referred to as dummy encoding. The code examples use the OneHotEncoder class of sklearn.preprocessing. As a data scientist or machine learning engineer, you should learn one-hot encoding because it comes in handy when training machine learning models. What is One-Hot Encoding? One-hot encoding is a process whereby categorical variables are converted into a form that can be provided as input to machine learning models. It is an essential preprocessing step for many machine learning tasks. The goal of one-hot encoding is to …
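A minimal sketch of the conversion described above, using the OneHotEncoder class that the excerpt names. The `colors` array is a made-up example, and `handle_unknown="ignore"` is just one reasonable setting, not necessarily the one used in the post.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical feature with three distinct values.
colors = np.array([["red"], ["green"], ["blue"], ["green"]])

# handle_unknown="ignore" encodes unseen categories as all zeros
# at transform time instead of raising an error.
encoder = OneHotEncoder(handle_unknown="ignore")

# fit_transform returns a sparse matrix by default; convert to dense
# so the one-hot rows are easy to read.
encoded = encoder.fit_transform(colors).toarray()

print(encoder.categories_[0])  # learned categories, sorted
print(encoded)                 # one column per category
```

Each row has a single 1 in the column of its category, which is the form most estimators expect for nominal features.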

Continue reading

Posted in Data Science, Machine Learning, Python.

Interns – Machine Learning Interview Questions & Answers: Set 1

This page lists the first set of machine learning / data science interview questions and answers for interns, freshers, and beginners. If you are new to the machine learning field and are looking for practice tests before an upcoming machine learning interview, these practice tests should prove useful. Machine Learning topics covered in Test: In this set, some of the following topics have been covered: machine learning fundamentals (supervised and unsupervised learning algorithms); different types of machine learning problems and related algorithms with examples; concepts related to regression, classification and clustering. Practice Test (Questions …

Continue reading

Posted in Career Planning, Data Science, Freshers, Interview questions, Machine Learning.

Data-centric vs Model-centric AI: Concepts, Examples

There is a lot of discussion around AI and which approach is better: model-centric or data-centric. In this blog post, we will explore both approaches and give examples of each. We will also discuss the benefits and drawbacks of each approach. By the end of this post, you will have a better understanding of both AI approaches and be able to decide which one is right for your business. Product managers and data science architects should be knowledgeable about both approaches so that they can make informed decisions about the products and services they build. Model-centric approach to AI: The model-centric approach to AI is about …

Continue reading

Posted in AI, Data, Data analytics, Data Science, Machine Learning.

Data Science Architect Interview Questions

In this post, you will learn about interview questions that may be asked if you are applying for a data science architect job. A data science architect needs knowledge of both data science / machine learning and cloud architecture. It also helps if the person is hands-on with programming languages such as Python and R. Without further ado, let’s get into some of the common questions right away. I will add further questions over time. Q1. How do you go about architecting a data science or machine learning solution for any business problem? Solving a business problem using data science or machine learning based solution can …

Continue reading

Posted in Career Planning, Data Science, Enterprise Architecture, Interview questions, Machine Learning.

Gartner Data Analytics Trends for 2022

Every year, Gartner releases a report on the latest data analytics trends that will be influential for businesses in the coming year. These reports are always insightful and provide valuable information for companies that want to stay ahead of the curve. This year is no exception: Gartner released its predictions for data analytics trends earlier in 2022. In this blog post, we will take a look at some of the most important trends that Gartner has identified. Although it is a bit late to publish this post, it discusses the concepts in detail and will be updated from time to time. Stay tuned for more insights into the …

Continue reading

Posted in Data, Data analytics, Data lake, Data Science.

Decision Science & Data Science – Differences, Examples

Decision science and data science are two data-driven fields that have grown in prominence over the past few years. Data scientists use data to arrive at conclusions or predictions about things like customer behavior and assess the suitability of those conclusions / predictions, while decision scientists combine data with other information sources to make decisions and assess the suitability of those decisions for enterprise-wide adoption. It is important for business owners to clearly understand the difference between data science and decision science in order to leverage the best of both worlds and achieve desired business outcomes. In this post, you will learn about the concepts …

Continue reading

Posted in AI, Analytics, Data Science, Decision Science.

Sklearn SimpleImputer Example – Impute Missing Data

In this post, you will learn how to use Python’s Sklearn SimpleImputer for imputing / replacing numerical and categorical missing data using different strategies. A related article posted some time back discusses the fillna method of the Pandas DataFrame. Handling missing values is a key part of data preprocessing, so it is important for data scientists and machine learning engineers to learn different techniques for imputing / replacing numerical or categorical missing values with appropriate values based on appropriate strategies. SimpleImputer Python Code Example: SimpleImputer is a class in the sklearn.impute module that can be used to replace missing values in a dataset, using a …
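The sketch below shows one way SimpleImputer might be applied to a numerical and a categorical column. The `marks` and `grades` arrays are hypothetical data, and the choice of the "mean" and "most_frequent" strategies is an assumption for illustration; the post may use different data and strategies.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical numerical column with one missing value.
marks = np.array([[80.0], [np.nan], [70.0], [90.0]])

# Hypothetical categorical column (object dtype) with one missing value.
grades = np.array([["A"], ["B"], [np.nan], ["A"]], dtype=object)

# Numerical data: replace missing values with the column mean.
num_imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
marks_filled = num_imputer.fit_transform(marks)

# Categorical data: replace missing values with the most frequent value.
cat_imputer = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
grades_filled = cat_imputer.fit_transform(grades)

print(marks_filled.ravel())
print(grades_filled.ravel())
```

Other built-in strategies include "median" and "constant" (the latter together with `fill_value`).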

Continue reading

Posted in Data Science, Machine Learning, Python.

Pandas dropna: Drop Rows & Columns with Missing Values

In this blog post, we will discuss the Pandas dropna method, which is used for dropping rows and columns that have missing values. Pandas is a powerful data analysis library for Python, and dropna is one of its most useful features. Data scientists need to be able to handle missing data, and dropna makes this easy. Pandas dropna Method: The dropna method allows us to drop rows or columns with missing values in our dataframe. Find the documentation of the Pandas dropna method on this page: pandas.DataFrame.dropna. The dropna method looks like the following: DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False) Given the above method and parameters, the following …
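To make the parameters in the signature above concrete, here is a small sketch on a made-up DataFrame. The column names and values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 misses "age", row 2 misses everything.
df = pd.DataFrame({
    "name": ["a", "b", None, "d"],
    "age": [25, np.nan, np.nan, 40],
    "city": ["X", "Y", None, "W"],
})

# Default (axis=0, how='any'): drop rows with at least one missing value.
rows_any = df.dropna()

# how='all': drop only rows in which every value is missing.
rows_all = df.dropna(how="all")

# thresh=2: keep rows that have at least two non-missing values.
rows_thresh = df.dropna(thresh=2)

# subset: look for missing values only in the listed columns.
rows_subset = df.dropna(subset=["age"])

# axis=1: drop columns (here, after removing the all-missing row).
cols_any = df.dropna(how="all").dropna(axis=1)

print(rows_any, rows_all, rows_thresh, rows_subset, cols_any, sep="\n\n")
```

All calls above return a new DataFrame; passing `inplace=True` would modify `df` itself and return None.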

Continue reading

Posted in Data Science, Machine Learning, Python.

Spend Analytics Use Cases: AI & Data Science

In this post, you will learn about the high-level concepts of spend analytics in relation to procurement and how data science / machine learning & AI can be used to extract actionable insights as part of spend analytics. This will be useful for procurement professionals such as category managers, sourcing managers, and procurement analytics stakeholders looking to understand the concepts of spend analytics and how they can drive decisions based on it. What is Spend Analytics? Simply put, spend analytics is about performing systematic computational analysis to extract actionable insights from spend and savings data across different spend categories in order to achieve desired business outcomes such as cost savings, …

Continue reading

Posted in Data Science, Machine Learning, Procurement.

Perceptron Explained using Python Example

In this post, you will learn about the concepts of the perceptron with the help of a Python example. It is important for data scientists to understand the perceptron, as a good understanding lays the foundation for learning advanced neural network concepts, including deep neural networks (deep learning). What is Perceptron? The perceptron is a machine learning algorithm that mimics how a neuron in the brain works. It is also called a single-layer neural network consisting of a single neuron. The output of this network is decided by the outcome of just one activation function associated with the single neuron. In a perceptron, information propagates forward from the inputs to the output. Deep …
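The single-neuron idea described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the post's implementation: the unit-step activation, the learning rate, the number of epochs, and the AND-gate data are all assumptions chosen to keep the example small.

```python
import numpy as np

def step(z):
    """Unit step activation: output 1 when the weighted sum is positive."""
    return np.where(z > 0, 1, 0)

def train_perceptron(X, y, lr=1.0, epochs=20):
    """Train a single neuron with the classic perceptron update rule."""
    w = np.zeros(X.shape[1])  # one weight per input feature
    b = 0.0                   # bias term
    for _ in range(epochs):
        for xi, target in zip(X, y):
            # Forward pass: weighted sum followed by the activation.
            prediction = step(np.dot(w, xi) + b)
            # Update weights only when the prediction is wrong.
            error = target - prediction
            w += lr * error * xi
            b += lr * error
    return w, b

# Hypothetical training data: the logical AND gate (linearly separable).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w, b = train_perceptron(X, y)
predictions = step(X @ w + b)
print("Predictions:", predictions)
```

Because the AND gate is linearly separable, the update rule settles on a separating line; on data that is not linearly separable (such as XOR), a single perceptron cannot fit every point.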

Continue reading

Posted in Data Science, Deep Learning, Machine Learning, Python.

Classification Problems Real-life Examples

In this post, you will learn about some of the most common real-life examples of machine learning classification problems. For beginner data scientists, these examples will help in gaining perspective on real-world problems that can be framed as machine learning classification problems. This post will be updated from time to time to include interesting real-life examples that can be solved by training machine learning classification models. Before looking into the examples, let’s understand a little about what a machine learning (ML) classification problem is. You may skip this section if you are familiar with the definition of machine learning classification problems & solutions. You may want …

Continue reading

Posted in Data Science, Machine Learning.

Linear vs Non-linear Data: How to Know

In this post, you will learn techniques for determining whether a given data set is linear or non-linear. Depending on the type of machine learning problem you are trying to solve (such as classification or regression), you can apply different techniques to make this determination. For a data scientist, it is very important to know whether the data is linear or not, as it helps in choosing appropriate algorithms to train a high-performance model. You will learn techniques such as the following for determining whether the data is linear or non-linear: Use scatter plot when dealing with classification problems Use …

Continue reading

Posted in AI, Data Science, Machine Learning.

Neural Network Explained with Perceptron Example

Neural networks are an important part of machine learning, so it is essential to understand how they work. A neural network is a computing system modeled on a biological neural network, which comprises neurons connected with each other. It can be built to solve machine learning tasks such as classification and regression problems. The perceptron algorithm is a simple representation of how neural networks work. The perceptron was first proposed by Frank Rosenblatt in 1957 as a model of the human brain’s perception mechanism. This post will explain the basics of neural networks with a perceptron example. You will understand how a neural network is built using perceptrons. This …

Continue reading

Posted in Data Science, Deep Learning, Machine Learning.

Hypothesis Testing Steps & Real Life Examples

Hypothesis testing is a technique that helps scientists, researchers, or for that matter anyone test the validity of their claims or hypotheses about real-world events. Hypothesis testing techniques are often used in statistics and data science to analyze whether claims about the occurrence of events are true, and whether the results returned by the performance metrics of machine learning models are representative of the models or happened by chance. This blog post will cover some of the key statistical concepts, including steps and examples, in relation to what hypothesis testing is and how to formulate hypotheses. The knowledge of hypothesis formulation and hypothesis testing holds the key …

Continue reading

Posted in AI, Data Science, Machine Learning.

Insurance Machine Learning Use Cases

As insurance companies face increasing competition and ever-changing customer demands, they are turning to machine learning for help. Machine learning / AI can be used in a variety of ways to improve insurance operations, from developing new products and services to improving customer experience. It would be helpful for product managers and data science architects to get a good understanding of some of the use cases that can be addressed / automated using machine learning / AI based solutions. In this blog post, we will explore some of the most common insurance machine learning / AI use cases. Stay tuned for future posts that will dive into each of these …

Continue reading

Posted in AI, Data Science, Insurance, Machine Learning, Product Management.