Author Archives: Ajitesh Kumar

Ajitesh Kumar

I have recently been working in the area of data analytics, including data science and machine learning / deep learning. I am also passionate about different technologies, including programming languages such as Java/JEE, Javascript, Python, R and Julia, and technologies such as Blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms and big data. For latest updates and blogs, follow us on Twitter. I would love to connect with you on Linkedin. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking. Check out my other blog, Revive-n-Thrive.com

Testing Machine Learning Models on Dual Coding Principles

Automation of Dual Coding Testing of ML Models

This post proposes a technique, termed Dual Coding, for testing or performing quality control checks on machine learning models from a quality assurance (QA) perspective. This could be useful in black box testing of ML models. The proposed technique is based on the principles of Dual Coding Theory (DCT), hypothesized by Allan Paivio of the University of Western Ontario in 1971. According to Dual Coding Theory, our brain uses two different systems, verbal and non-verbal/visual, to gather, process, store and retrieve (recall) the information related to a particular subject. One of the key assumptions of dual coding theory is the connections (also termed as referential …

Posted in Data Science, Machine Learning, QA, Testing.

QA – Blackbox Testing for Machine Learning Models


A data science/machine learning career has primarily been associated with building models which make numerical or class-related predictions. This is unlike conventional software development, which is associated with both developing and “testing” the software, and where the related career profiles are software developers/engineers and test engineers/QA professionals. In the case of machine learning, however, the career profile is the data scientist. The word “testing” in relation to machine learning models is primarily used for testing the model performance in terms of the accuracy/precision of the model. Note that the word “testing” means different things in conventional software development and in machine learning model development. Machine learning models would …

Posted in Data Science, Machine Learning, QA, Testing.

Assessing Quality of AI Models from QA Standpoint

Quality of Machine Learning Models

In this post, you will learn about the definition of quality of AI / machine learning (ML) models. A good understanding of what constitutes high and low quality in AI models would help you design quality control checks for testing machine learning models, along with the related quality assurance (QA) practices. This post would be a good read for QA professionals in general. However, it would also help set perspectives for data scientists and machine learning experts. The following are some of the key quality traits which are described in detail for assessing the quality of AI models: functional suitability, maintainability, usability, efficiency, security and portability. When designing QA practice and related quality control checks, all of the above would need to be considered for testing …

Posted in Data Science, Machine Learning, QA, Testing.

QA – Metamorphic Testing for Machine Learning Models

Metamorphic Relations for Machine Learning Models QA

In this post, you will learn how metamorphic testing could be used for performing quality control checks/testing on machine learning models. The post is primarily meant for data science QA specialists planning test cases to test a machine learning (ML) model implementation from a QA perspective. Testing machine learning models from a quality assurance perspective is different from testing machine learning models for accuracy/performance. The word “testing” is one of the conflicting technical terms, given its different usage by machine learning experts and by the software engineering community in general. In this post, the following topics are discussed: introduction to metamorphic testing; why metamorphic testing for machine learning models?; automated metamorphic testing of ML models. Introduction …

Posted in Data Science, Machine Learning, QA, Testing.
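To make the metamorphic testing idea in the excerpt above a little more concrete, here is a minimal sketch in Python. It is not taken from the full post. The tiny 1-nearest-neighbour model, the data and the two metamorphic relations (shuffling the training rows, and scaling every feature by the same positive constant) are all assumptions chosen for illustration. The point is that the test never needs to know the "correct" label; it only checks that predictions stay the same under transformations that should not matter.

```python
import random


def predict_1nn(train, query):
    """Return the label of the training point closest to `query`."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    _, label = min(train, key=lambda row: sq_dist(row[0], query))
    return label


def check_metamorphic_relations(train, queries, seed=0):
    """Return True if both illustrative relations hold for every query."""
    baseline = [predict_1nn(train, q) for q in queries]

    # MR1: permuting the order of the training rows must not change predictions.
    shuffled = train[:]
    random.Random(seed).shuffle(shuffled)
    after_shuffle = [predict_1nn(shuffled, q) for q in queries]

    # MR2: scaling every feature (training and query) by the same positive
    # constant preserves the distance ordering, so predictions must not change.
    k = 2.0
    scaled_train = [([k * x for x in feats], label) for feats, label in train]
    scaled_queries = [[k * x for x in q] for q in queries]
    after_scale = [predict_1nn(scaled_train, q) for q in scaled_queries]

    return baseline == after_shuffle and baseline == after_scale


# Hypothetical toy data, chosen so that no query is equidistant to two points.
TRAIN = [([1.0, 1.0], "low"), ([1.5, 2.0], "low"),
         ([8.0, 8.0], "high"), ([9.0, 7.5], "high")]
QUERIES = [[1.2, 1.4], [8.5, 8.1], [2.0, 1.0]]

print(check_metamorphic_relations(TRAIN, QUERIES))
```

For a real model, `predict_1nn` would be replaced by a call to the model under test, and the relations would need to be chosen to match what that particular algorithm is expected to be invariant to.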

QA – Why Machine Learning Systems are Non-testable


This post presents views on why machine learning systems or models are termed non-testable from a quality control/quality assurance perspective. Before I proceed, let me humbly acknowledge that the data science/machine learning community has been saying that ML models are testable, as they are first trained and then tested using techniques such as cross-validation, along with different techniques to increase and optimize model performance. However, “testing” the model in that sense refers to the development (model building) phase, when data scientists test the model performance by comparing the model outputs (predicted values) with the actual values. This is not the same as testing the model for any given input for which the …

Posted in Data Science, Machine Learning, QA, Testing.

QA – Testing Features of Machine Learning Models

Testing Features of Machine Learning Models

In this post, you will learn about different types of test cases which you could come up with for testing the features of data science/machine learning models. Testing features is one of the key sets of QA tasks that need to be performed to ensure the high performance of machine learning models in a consistent and sustained manner. Features are the most important part of a machine learning model. Features are nothing but the predictor variables which are used to predict the outcome or response variable. Simply speaking, the following function represents y as the outcome variable and x1, x2 and x1x2 as predictor variables: y = a1x1 + a2x2 + a3x1x2 + e. In the above function, …

Posted in Data Science, Machine Learning, QA, Testing.
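The function in the excerpt above, y = a1x1 + a2x2 + a3x1x2 + e, can be sketched in a few lines of Python to show what a feature-level test might look like. The function names and coefficient values below are hypothetical, and the error term e is left out because it is not something the model computes. The sketch only illustrates that x1x2 is a derived (interaction) feature, and that a QA check can target how it is built.

```python
def build_features(x1, x2):
    """Return the predictor variables [x1, x2, x1*x2] used by the function."""
    return [x1, x2, x1 * x2]


def predict(coeffs, x1, x2):
    """Compute a1*x1 + a2*x2 + a3*x1*x2 for coeffs = [a1, a2, a3]."""
    return sum(a * f for a, f in zip(coeffs, build_features(x1, x2)))


# Hypothetical coefficients a1, a2, a3, chosen only for illustration.
COEFFS = [2.0, 3.0, 0.5]

# Feature-level check: the interaction term must vanish when either input is 0.
assert build_features(0.0, 7.0)[2] == 0.0
assert build_features(7.0, 0.0)[2] == 0.0

print(predict(COEFFS, 4.0, 6.0))  # 2*4 + 3*6 + 0.5*24 = 38.0
```

Checks like the two assertions above test the feature construction itself, independently of how accurate the fitted model turns out to be.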

QA of Machine Learning Models with PDCA Cycle

QA and Machine learning Projects with PDCA Cycle

The primary goal of establishing and implementing Quality Assurance (QA) practices for machine learning/data science projects, or projects using machine learning models, is to achieve consistent and sustained improvements in the business processes that make use of the underlying ML predictions. This is where the PDCA (Plan-Do-Check-Act) cycle is applied to establish a repeatable process ensuring that high-quality machine learning (ML) based solutions are served to clients in a consistent and sustained manner. The following diagram represents the details. The following represents the details listed in the above diagram. Plan: Explore/describe the business problems: In this stage, product managers/business analysts sit with data scientists and discuss the business problem at hand. The outcome of this …

Posted in Data Science, Machine Learning, QA, Testing.

QA & Data Science – How to Test Features Relevance


In this post, I present a perspective on the need for the QA/testing team to test feature relevance when testing machine learning models as part of data science QA initiatives, and on the different techniques which could be used to test or perform QA on feature relevance. Feature relevance can also be termed feature importance. Simply speaking, a feature is said to be relevant or important if it adds real predictive value to the underlying model. Relevant features must display a stable statistical relationship or association with the outcome variable. An association does not imply causation. However, a relevant feature or a feature …

Posted in Data Science, Machine Learning, QA, Testing.
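One simple way to read "a stable statistical relationship" from the excerpt above is that the association between a feature and the outcome should look similar on different slices of the data. The sketch below is an assumption about how such a check could be written, not the technique from the full post. It uses the Pearson correlation, a hypothetical threshold of 0.5, and made-up data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)  # undefined (division by zero) for a constant input


def correlation_per_chunk(xs, ys, chunks):
    """Split the data into consecutive chunks and correlate each one."""
    size = len(xs) // chunks
    return [pearson(xs[i * size:(i + 1) * size], ys[i * size:(i + 1) * size])
            for i in range(chunks)]


def is_stably_associated(xs, ys, chunks=3, min_abs_corr=0.5):
    """True if every chunk shows the same sign and a strong enough correlation."""
    corrs = correlation_per_chunk(xs, ys, chunks)
    same_sign = all(c > 0 for c in corrs) or all(c < 0 for c in corrs)
    return same_sign and all(abs(c) >= min_abs_corr for c in corrs)


# Made-up data: `relevant` drives the outcome; `unstable` flips direction
# in the middle chunk, so its association with the outcome is not stable.
relevant = list(range(1, 13))
noise = [0.3, -0.2, 0.1, -0.1] * 3
outcome = [2 * x + n for x, n in zip(relevant, noise)]
unstable = [1, 2, 3, 4, 4, 3, 2, 1, 1, 2, 3, 4]

print(is_stably_associated(relevant, outcome))
print(is_stably_associated(unstable, outcome))
```

Correlation only captures linear association, so a check like this is a first filter at best. As the excerpt notes, an association does not imply causation.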

Quality Assurance / Testing the Machine Learning Model

QA Framework for testing Machine Learning Models

This is the first post in a series on Quality Assurance & Testing Practices for Data Science / Machine Learning Models, which I will release over the next few months. The goal of this and upcoming posts is to create a tool and framework which could help you design your testing/QA practices around data science/machine learning models. Why QA Practices for testing Machine Learning Models? Are you a test engineer who wants to know how you could make a difference in the AI initiative being undertaken by your current company? Are you a QA manager looking for or researching tools and frameworks which could help your team perform QA with …

Posted in Data Science, Machine Learning, QA, Testing.

Is Blockchain a Database?


This video presents a comparison between Blockchain and a traditional database. The following aspects of Blockchain are highlighted in the comparison: Blockchain copy maintained by the different member organizations; decentralized ownership; data immutability; consensus algorithm; transaction anonymity; user anonymity.

Posted in BlockChain.

Blockchain – Opportunities & Risks for Financial Institutions


Here is a good white paper/research report published by the European Banking Authority (EBA) detailing the opportunities and risks associated with the adoption of Blockchain by financial institutions: Report on risks and opportunities arising for financial institutions due to FinTech. Use cases related to the following are described: usage of distributed ledger technology (DLT) and smart contracts for trade finance; use of DLT to streamline customer due diligence (CDD) processes. In this use case, the “digital identity” concept is explored and described.

Posted in BlockChain, News.

MongoDB Commands Cheat Sheet for Beginners


In this post, you will learn about MongoDB commands which could get you started and let you perform basic database activities such as creating, updating and dropping a collection (table). These commands are meant for MongoDB beginners and could be used as a cheat sheet. You may want to bookmark this page for quick reference. MongoDB Commands Cheatsheet: The following is the list of the commands: start and stop the MongoDB database; access the MongoDB database using the shell; show all databases; create a database, say, testdb; switch to the database (until a collection is created in a database, the database name is not listed in the output of the command “show dbs”); add a …

Posted in MongoDB, NoSQL.

Blockchain – How to Store Documents or Files

Store Documents or Files in Blockchain Network

This post presents best practices for storing documents or files in the Blockchain. The need to determine best practices arises from the fact that business between two or more parties ends up involving the exchange of documents containing data related to the agreement, business details etc. A common question is whether one should store the document or file itself in the Blockchain, or only the hash of the document/file. Store documents in Blockchain – Best Practice/Recommendation: As a best practice, it is not recommended to store the document (PDF format or otherwise) or file in the Blockchain. Different blockchain frameworks limit the size of the block which can be added to the blockchain. Although in blockchain network …

Posted in BlockChain.
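The alternative raised in the excerpt above, storing the hash of a document rather than the document itself, can be sketched with Python's standard `hashlib` module. This is only an illustration under assumptions: the function names are hypothetical, SHA-256 is one possible choice of hash function, and the step of actually writing the hash to a blockchain is left out because it depends on the framework used.

```python
import hashlib


def document_fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the document's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_document(data: bytes, recorded_hash: str) -> bool:
    """Check a document received off-chain against the hash recorded earlier."""
    return document_fingerprint(data) == recorded_hash


# Hypothetical agreement text; in practice this would be the file's bytes.
original = b"Agreement between Party A and Party B: 100 units at 5 USD each."
recorded = document_fingerprint(original)  # this value would go on-chain

tampered = b"Agreement between Party A and Party B: 100 units at 4 USD each."

print(len(recorded))                        # a SHA-256 hex digest has 64 characters
print(verify_document(original, recorded))  # True
print(verify_document(tampered, recorded))  # False
```

The digest has a fixed size regardless of how large the document is, which is why this approach sidesteps the block size limits mentioned in the excerpt.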

Bitcoin Blockchain – What is Proof of Work?

Proof of Work in Bitcoin Blockchain

In this post, you will learn what proof of work is in the Bitcoin Blockchain. Simply speaking, proof of work in computing is used to validate whether the user has put in enough effort (such as computing power), or done some work to solve a mathematical problem of a given complexity, before sending the request. For example, go to any online SHA-256 calculator tool and try using a random number with the text “Hello World” as shown in the below screenshot. If the difficulty target is set as the hash value starting with one zero, you could see that the random number 10 results in hash value starting with zero. The number 10 …

Posted in bitcoin, BlockChain.
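The "Hello World" experiment described in the excerpt above can be reproduced without an online calculator. The sketch below is a toy version of the idea, not Bitcoin's actual procedure (Bitcoin applies SHA-256 twice to a block header and compares the result against a numeric target). How the post combines the text and the number is not visible in the excerpt, so the code assumes the number is simply appended to the text as a decimal string. The number it finds may therefore differ from the one mentioned in the post.

```python
import hashlib


def find_nonce(text, leading_zeros=1, max_tries=1_000_000):
    """Return (nonce, hex_digest) for the first nonce whose hash meets the target.

    The target is "the hex digest starts with `leading_zeros` zeros".
    """
    prefix = "0" * leading_zeros
    for nonce in range(max_tries):
        # Assumption: the nonce is appended to the text as a decimal string.
        digest = hashlib.sha256(f"{text}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
    raise ValueError("no nonce found within max_tries")


nonce, digest = find_nonce("Hello World", leading_zeros=1)
print(nonce, digest)
```

Raising `leading_zeros` makes the search take roughly 16 times longer per extra zero on average, while checking a proposed nonce still takes a single hash. That asymmetry between finding and verifying is the core of proof of work.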

Is Blockchain a Linked List like Data Structure?

Blockchain represented as Linked List Data Structure

In this post, you will learn about the similarities and differences between a linked list and a Blockchain. The simplest way to understand what a Blockchain is, is to visualize it as a crude form of the Linked List data structure that we studied in one of our engineering classes. Simply speaking, a Blockchain can be defined as a linked list of groups of transactions (blocks) which are connected to each other using hash pointers, rather than plain pointers as in the case of the linked list. The following diagram represents the Linked List data structure: The following diagram represents the Blockchain. Note some of the following characteristics of a block: Each block …

Posted in BlockChain, Data Structure.
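The difference between a plain pointer and a hash pointer in the excerpt above can be shown with a short Python sketch. It is a simplified illustration, not a real Blockchain implementation. The block layout (a dict with `prev_hash` and `transactions`), the use of JSON for serialization and the function names are all assumptions.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's full contents, including its pointer to the previous block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions):
    """Append a block whose `prev_hash` is the hash of the current last block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev_hash, "transactions": transactions})


def is_chain_valid(chain):
    """Check that every block's hash pointer matches its predecessor's contents."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))


chain = []
append_block(chain, ["A pays B 5"])
append_block(chain, ["B pays C 2"])
append_block(chain, ["C pays A 1"])
print(is_chain_valid(chain))  # True

# Editing an earlier block changes its hash, so the next block's pointer breaks.
chain[0]["transactions"][0] = "A pays B 500"
print(is_chain_valid(chain))  # False
```

In an ordinary linked list the pointer only says where the previous node is, so a node can be edited silently. A hash pointer also commits to what the previous block contains. Note that this check cannot detect a change to the last block, since nothing points to it yet.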

Learning Blockchain – Free Online Courses & Training – 1


This is the first post in a series aimed at helping you learn Blockchain online. The posts will also present some interesting projects for learning the different aspects of Blockchain implementation. This post presents links to free online courses and training for learning Blockchain concepts. (Learning) Center for Blockchain Research (CBR) by Stanford: The following are a free textbook and a Coursera course on cryptography: free online Cryptography course on Coursera; free online textbook on applied cryptography. (Case Study) Distributed Contracting Blockchain Network by Microsoft & EY: The diagram below demonstrates the following different aspects of blockchain: Permissioned blockchain registering entertainment, gaming industry …

Posted in BlockChain, Career Planning, Tutorials.