# Category Archives: Data Science

## QA of Machine Learning Models with PDCA Cycle

The primary goal of establishing and implementing Quality Assurance (QA) practices for machine learning/data science projects, or for projects that use machine learning models, is to achieve consistent and sustained improvements in the business processes that rely on the underlying ML predictions. This is where the PDCA (Plan-Do-Check-Act) cycle is applied: it establishes a repeatable process for ensuring that high-quality machine learning (ML) based solutions are served to clients in a consistent and sustained manner. The following diagram represents the details, which are described below. Plan: Explore/describe the business problems: In this stage, product managers/business analysts sit with data scientists and discuss the business problem at hand. The outcome of this …

## QA & Data Science – How to Test Features Relevance

In this post, I present a perspective on why QA/testing teams need to test feature relevance when testing machine learning models as part of data science QA initiatives, along with different techniques that could be used to test or perform QA on feature relevance. Feature relevance is also termed feature importance. Simply put, a feature is relevant or important if it adds real predictive value to the underlying model. Relevant features must display a stable statistical relationship, or association, with the outcome variable. An association does not imply causation. However, a relevant feature or a feature …
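
As a rough illustration of checking the "statistical relationship or association" mentioned above, the sketch below computes the Pearson correlation between a feature and the outcome. This is only one simple check and is not necessarily among the techniques the post goes on to describe; the function name `pearson_r` and the sample data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between a feature column and the outcome column."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear association
    return cov / sqrt(var_x * var_y)


# Hypothetical data: feature_a tracks the outcome, feature_b does not.
outcome = [2.1, 3.9, 6.2, 8.1, 9.8]
feature_a = [1.0, 2.0, 3.0, 4.0, 5.0]
feature_b = [1.0, -1.0, 0.0, -1.0, 1.0]

r_a = pearson_r(feature_a, outcome)
r_b = pearson_r(feature_b, outcome)
print(f"feature_a vs outcome: r = {r_a:.3f}")
print(f"feature_b vs outcome: r = {r_b:.3f}")
```

A correlation near zero only rules out a linear association, so a QA check like this is a starting point rather than a verdict on relevance.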

## Quality Assurance / Testing the Machine Learning Model

This is the first in a series of posts on Quality Assurance & Testing Practices and Data Science / Machine Learning Models, which I will release over the next few months. The goal of this and upcoming posts is to create a tool and framework that could help you design your testing/QA practices around data science/machine learning models. Why QA Practices for testing Machine Learning Models? Are you a test engineer who wants to know how you could make a difference in the AI initiative being undertaken by your current company? Are you a QA manager looking for or researching tools and frameworks which could help your team perform QA with …

## Data Science – P-Value Explained with Examples

Are you a data science/machine learning beginner who wants to learn about the P-value through examples? Have you been hunting through different web pages for a simpler and easier explanation of the P-value? This post presents P-value concepts with multiple examples. In the following use cases, the hypothesis made about the population will either be rejected or not rejected based on the P-value: whether a coin is fair, and whether a die is fair. What is P-VALUE? The P-value can be defined as the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed in the sample data used for hypothesis testing. It is …
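
To make the fair-coin use case concrete, here is a minimal sketch of a two-sided exact binomial test. It sums the probabilities of all outcomes that are no more likely than the observed one under the hypothesis that the coin is fair. The function name `binomial_p_value` and the example counts are hypothetical and not taken from the post.

```python
from math import comb


def binomial_p_value(heads, tosses, p_heads=0.5):
    """Two-sided exact p-value for observing `heads` in `tosses` coin flips."""

    def pmf(k):
        # Probability of exactly k heads under the null hypothesis.
        return comb(tosses, k) * p_heads**k * (1 - p_heads) ** (tosses - k)

    observed = pmf(heads)
    # Sum every outcome that is as likely as, or less likely than, the observed one.
    # The small factor guards against floating-point ties.
    return sum(pmf(k) for k in range(tosses + 1) if pmf(k) <= observed * (1 + 1e-9))


p_mild = binomial_p_value(8, 10)      # 8 heads out of 10
p_extreme = binomial_p_value(10, 10)  # 10 heads out of 10
print(p_mild, p_extreme)
```

With a conventional 0.05 threshold (an assumption here), 8 heads in 10 tosses would not lead to rejecting the fair-coin hypothesis, while 10 heads in 10 tosses would.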

## Difference between Frequentist vs Bayesian Probability

In this post, you will learn about the difference between Frequentist and Bayesian probability. It is of utmost importance to understand these concepts if you are getting started with data science. What is Frequentist Probability? The probability of an event, when calculated as a function of the frequency with which events of that type occur, is called Frequentist probability. For example, the probability of rolling a die (numbered 1 to 6) and getting a 3 can be said to be a Frequentist probability. Consider another example: a head occurring as a result of tossing a coin. Note that the Frequentist frequencies can be calculated by conducting the experiment in …
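
The die example can be sketched as a simulation: roll a simulated fair die many times and take the relative frequency of a 3 as the probability estimate. The function name `relative_frequency`, the trial count, and the fixed seed are assumptions chosen for illustration.

```python
import random


def relative_frequency(event, trials, seed=0):
    """Estimate P(die shows `event`) as hits / trials over simulated rolls."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == event)
    return hits / trials


estimate = relative_frequency(3, 60_000)
print(f"estimated P(3) = {estimate:.4f}, theoretical = {1 / 6:.4f}")
```

As the number of trials grows, the relative frequency tends toward 1/6, which is the Frequentist reading of the probability.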

## AI – Three Different types of Machine Learning Algorithms

This post aims to help you learn about the different types of machine learning algorithms that form the key to artificial intelligence (AI): machine learning algorithms, representation or feature learning algorithms, and deep learning algorithms. The following represents the different types of learning algorithms in the form of a Venn diagram. What are Machine Learning (ML) Algorithms? Machine learning algorithms are the simplest class of algorithms when talking about AI. ML algorithms are based on the idea that external entities such as business analysts and data scientists need to work together to identify the feature set for building the model. The ML algorithms are then trained to come up with coefficients for each of the features and how they are …

## 8 Machine Learning Javascript Frameworks to Explore

JavaScript developers tend to look for JavaScript frameworks that can be used to train machine learning models based on different machine learning algorithms. The following are some of the machine learning algorithms for which models can be trained using the JavaScript frameworks listed in this article: simple linear regression, multivariate linear regression, logistic regression, Naive Bayes, K-nearest neighbour (KNN), K-means, support vector machine (SVM), random forest, decision tree, feedforward neural network, and deep learning network. In this post, you will learn about different JavaScript frameworks for machine learning, including the following: Deeplearn.js, Propel, ConvNetJS, ML-JS, KerasJS, STDLib, Limdu.js, and Brain.js. Deeplearn.js: Deeplearn.js is an open-source machine learning JavaScript library …

## Machine Learning – Validation Techniques (Interview Questions)

Validation techniques in machine learning are used to estimate the error rate of an ML model so that the estimate can be considered close to the true error rate on the population. If the data volume is large enough to be representative of the population, you may not need validation techniques. However, in real-world scenarios, we work with a sample of data which may not be truly representative of the population. This is where validation techniques come into the picture. In this post, you will briefly learn about different validation techniques such as the following, and you will also be presented with a practice test with questions and answers which could be used …
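
The excerpt's list of techniques is cut off, so the sketch below shows k-fold cross-validation purely as one widely used validation technique, without claiming it is the one the post covers. It only generates the train/test index splits; the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each sample lands in exactly one test fold, so averaging the per-fold error gives an estimate that uses every record for both training and testing. In practice the indices are usually shuffled first, which this sketch omits.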

## Dummies Notes – Supervised vs Unsupervised Learning

Broadly speaking, machine learning problems can be classified into three types: supervised learning, unsupervised learning, and reinforcement learning. In this post, you will visually learn about supervised and unsupervised learning. Supervised vs Unsupervised Learning: The following is a self-explanatory picture representing what supervised and unsupervised learning techniques are and how they differ. Pay attention to some of the following: Supervised learning: In supervised learning problems, predictive models are created based on an input set of records with output data (numbers or labels). Based on the outcome/response or dependent variable, supervised learning problems can be further divided into two kinds: Regression: When the outcome or response variable is a continuous …

## Data Science – What are Machine Learning (ML) Models?

Machine learning (ML) models are commonly used in data science projects. In this post, you will learn about different definitions of a machine learning model to get a better understanding of what machine learning models are. A model is the relationship between features and the label. (Tensorflow – Getting Started for ML Beginners) An ML model is a mathematical model that generates predictions by finding patterns in your data. (AWS ML Models) ML models generate predictions using the patterns extracted from the input data. (Amazon Machine learning – Key concepts) Learning in the supervised model entails creating a function that can be trained by using a training …
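
The definitions above can be illustrated with a small sketch in which the "model" is a straight line fitted to training data by ordinary least squares. The line is then used to predict the label for an unseen feature value. The names `fit_line` and `predict` and the toy data are hypothetical, chosen only to show a model as a learned relationship between a feature and a label.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature values must not all be identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned relationship to a new feature value."""
    slope, intercept = model
    return slope * x + intercept


# Toy training data following label = 2 * feature + 1.
features = [1.0, 2.0, 3.0, 4.0]
labels = [3.0, 5.0, 7.0, 9.0]

model = fit_line(features, labels)
print("learned (slope, intercept):", model)
print("prediction for feature 5.0:", predict(model, 5.0))
```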

## 10+ Key Stages of Data Science Project Life cycle

Data science projects need to go through different project lifecycle stages in order to succeed. In each stage, different stakeholders get involved, as in a traditional software development lifecycle. In this post, you will learn some of the key stages/milestones of the data science project lifecycle. This article aims to help project stakeholders who play key roles in data science project implementation, such as product managers, project managers, and ML architects. The following are 6 high-level stages of the data science project lifecycle: planning, model development & testing, product-level changes, model deployment, monitoring the model, and model enhancement. Data Science Project Lifecycle – Planning: ML Problem identification: …

## Decision Tree Algorithm – Concepts, Interview Questions

The decision tree is one of the most commonly used machine learning algorithms and can be used for solving both classification and regression problems. It is very simple to understand and use. Here is a lighter take on how decision trees and related algorithms (random forest, etc.) are agile enough for use. In this post, you will learn about some of the following in relation to the decision tree algorithm, with reference to C5.0, one of the popular algorithms used to build a decision tree for classification. In another post, we shall also look at the CART methodology for building a decision tree model for classification. Key terminologies/definitions, key concepts, sample …
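
Tree learners in the C4.5/C5.0 family pick splits using entropy-based measures. As a hedged sketch of that idea (not the C5.0 implementation itself, which adds refinements such as gain ratio and pruning), the code below computes the entropy of a label set and the information gain of a candidate split. The function names and toy labels are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy reduction from splitting `labels` into the given `groups`."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted


parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]
useless_split = [["yes", "no"], ["yes", "no"]]

print("parent entropy:", entropy(parent))
print("gain (perfect split):", information_gain(parent, perfect_split))
print("gain (useless split):", information_gain(parent, useless_split))
```

A split that separates the classes completely recovers all of the parent's entropy, while a split that leaves each group as mixed as the parent gains nothing.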

## Tutorials – Building Machine Learning Models for Predicting Cancer

In this article, I introduce different aspects of building machine learning models to predict whether a person is suffering from malignant or benign cancer, emphasizing how machine learning (predictive analysis) can be used to predict a cancer such as Mesothelioma. The approach below can also be applied to other diseases, including different types of cancer. Predicting Mesothelioma Cancer – Supervised Learning Problem: Machine learning problems are classified into different kinds of learning problems. The most important of them are supervised learning and unsupervised learning. Supervised Learning: In supervised learning, you have historical data with each record labeled. Thus, in the case of predictive analysis of Mesothelioma cancer, there is …

## Top 8 Neural Networks and Deep Learning Tutorials

Here is a list of the top 8 neural network tutorials (web pages) for getting started with neural networks and deep learning. Introduction to Deep Neural Networks. Neural Networks and Deep Learning: A free online book for learning concepts related to neural networks and deep learning. Very good for beginners. Concepts are explained using handwritten digits. The book is authored by Michael Nielsen. Neural Networks: The page explains and demonstrates various types of neural networks, along with applications of neural networks such as ANNs in medicine. Coursera Course on Neural Networks for Machine Learning: This can be used to learn the fundamentals of artificial neural networks and how they’re being used for machine learning, …

## Neural Networks Interview Questions – Set 1

This page presents a practice test consisting of objective questions on neural networks. The test can also be useful for interviews, and the questions can be useful for machine learning interns / freshers / beginners. The questions relate to topics such as: introduction to neural networks, perceptron / sigmoid neuron, types of neural networks, and cost function for neural networks.

## Support Vector Machine (SVM) Interview Questions – Set 1

This quiz consists of questions and answers on Support Vector Machine (SVM). It is a practice test (objective questions and answers) which can be useful when preparing for interviews. The questions in this and upcoming practice tests could prove useful primarily for data scientists or machine learning interns / freshers / beginners. The questions focus on areas such as: introduction to SVM; types of SVM such as maximum-margin classifier, soft-margin classifier, and support vector machine. Some of the key SVM concepts to understand while preparing for an interview are the following: SVM concepts and objective functions; SVM kernel functions and tricks; concepts of C and Gamma values; Scikit learn libraries …