Category Archives: AI

Precision & Recall Explained using Covid-19 Example

Model precision recall accuracy as function of Covid19

In this post, you will learn about the concepts of precision, recall, and accuracy for machine learning classification models. Given that this is the Covid-19 age, the idea is to explain these concepts in terms of a classification model that predicts whether a patient is Covid-19 positive based on symptoms and other details. The following model performance concepts will be described with the help of examples: What is model precision? What is model recall? What is model accuracy? What is the model confusion matrix? Which metric to use: precision or recall? Before getting into the concepts, let’s look at the (hypothetical) data derived out …
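As a rough illustration of how these three metrics follow from the confusion matrix, here is a minimal Python sketch. The counts are made-up numbers for a hypothetical Covid-19 classifier, not the data used in the post, and the function name is only illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion matrix counts."""
    total = tp + fp + fn + tn
    # Precision: of all patients predicted positive, how many really are positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all patients who really are positive, how many the model found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: share of all predictions (positive and negative) that are correct.
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Hypothetical counts (assumption, for illustration only):
# 80 true positives, 20 false positives, 10 false negatives, 890 true negatives.
precision, recall, accuracy = classification_metrics(tp=80, fp=20, fn=10, tn=890)
print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
```

With counts like these, accuracy stays high even though some positive patients are missed, which is why precision and recall are worth reading separately.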

Continue reading

Posted in AI, Data Science, Machine Learning.

Different Success / Evaluation Metrics for AI / ML Products

Success metrics for AI and ML products

In this post, you will learn about some of the common metrics that can be used to measure the success of AI / ML (machine learning) / DS (data science) initiatives and products. If you are an AI / ML stakeholder, you will want to know these metrics in order to apply the right metrics to the right business use cases. Business leaders want to know and maximise the return on investment (ROI) from AI / ML investments. Here is the list of success metrics for AI / DS / ML initiatives: Business value metrics / Key performance indicators (KPIs): Business value metrics such as operating …

Continue reading

Posted in AI, Data Science, Machine Learning.

Predictive vs Prescriptive Analytics Difference

In this post, you will quickly learn about the difference between predictive analytics and prescriptive analytics. Data analytics stakeholders need a good understanding of these concepts in order to decide when to apply predictive and when to apply prescriptive analytics in analytics solutions / applications. Without further ado, let’s get straight to the diagram. In the above diagram, you can observe the following: Predictive analytics: In predictive analytics, the model is trained on historical data using supervised, unsupervised, or reinforcement learning algorithms. Once the model is trained, new data / observations are fed to it. The output of the model is a prediction in the form …

Continue reading

Posted in AI, Analytics, Machine Learning.

NLTK – How to Read & Process Text File

In this post, you will learn how to read one or more text files using NLTK and process the words contained in them. For data scientists starting to work on NLP, the Python code sample for reading multiple text files from local storage will be very helpful. Python Code Sample for Reading Text File using NLTK Here is the Python code sample for reading one or more text files. Pay attention to the following aspects: The class nltk.corpus.PlaintextCorpusReader is used for reading the text files. The constructor takes input parameters such as the corpus root and a regular expression matching the files. The list of files that are read can be found using a method such as fileids List …
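The reader described above can be sketched roughly as follows. This is a minimal example, not the code from the post: the temporary folder, the file names, and the sample sentences are assumptions made so the snippet runs on its own.

```python
import os
import tempfile

from nltk.corpus import PlaintextCorpusReader

# Create two small sample files in a temporary folder.
# (Assumption: these stand in for your own text files on local storage.)
corpus_root = tempfile.mkdtemp()
samples = {
    "first.txt": "NLTK reads plain text files.",
    "second.txt": "Each file becomes part of the corpus.",
}
for name, text in samples.items():
    with open(os.path.join(corpus_root, name), "w", encoding="utf-8") as handle:
        handle.write(text)

# The constructor takes the corpus root and a regular expression matching the files.
reader = PlaintextCorpusReader(corpus_root, r".*\.txt")

# fileids() lists the files that matched the pattern.
file_ids = reader.fileids()

# words() returns the tokens of all files, or of one file when a file id is given.
all_words = list(reader.words())
first_words = list(reader.words("first.txt"))

print(file_ids)
print(first_words)
```

Only words() is used here because it relies on the reader's default word tokenizer; sentence-level methods such as sents() may need additional NLTK data to be downloaded first.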

Continue reading

Posted in AI, NLP.

10 Key Challenges for AI / ML Projects Implementation

Challenges related to Machine Learning Projects Implementations

In this post, you will learn about some of the key challenges in implementing AI / ML projects successfully in a consistent and sustained manner. As an AI / ML project stakeholder (senior management, data science architect, product manager, etc.), you need a good understanding of what it takes to successfully execute AI / ML projects and create value for customers and the business. Whether you are building AI / ML products or enabling unique models for your clients in a SaaS setup, you will come across most of these challenges. Here are some of the key challenges: Whether a machine learning solution is …

Continue reading

Posted in AI, Machine Learning.

Python – Extract Text from HTML using BeautifulSoup

Extracting Text from HTML Pages

In this post, you will learn how to use Python BeautifulSoup and NLTK to extract words from HTML pages and perform text analysis such as frequency distribution. The example in this post is based on reading HTML pages directly from the website and performing text analysis. However, you could also download the web pages and then perform text analysis by loading the pages from local storage. Python Code for Extracting Text from HTML Pages Here is the Python code for extracting text from HTML pages and performing text analysis. Pay attention to the following in the code given below: URLLib request is used to read the HTML page …
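The general flow can be sketched as below. This is only an approximation of the approach described above: to keep the snippet self-contained it parses an in-memory HTML string instead of fetching a page with urllib, and it uses collections.Counter as a stand-in for an NLTK frequency distribution. The HTML content is made up.

```python
import re
from collections import Counter

from bs4 import BeautifulSoup

# Assumption: a tiny in-memory page replaces the HTML you would normally
# download from a website or load from local storage.
html = (
    "<html><head><title>Sample</title></head>"
    "<body><h1>Text analysis</h1>"
    "<p>Text extraction makes text analysis possible.</p></body></html>"
)

# Parse the markup and keep only the visible text, separated by spaces.
soup = BeautifulSoup(html, "html.parser")
text = soup.get_text(separator=" ")

# Very simple tokenization: lowercase alphabetic runs only.
words = re.findall(r"[a-z]+", text.lower())

# Count how often each word occurs (stand-in for a frequency distribution).
frequencies = Counter(words)
print(frequencies.most_common(3))
```

For real pages you would likely also want to drop script and style content and remove stop words before counting.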

Continue reading

Posted in AI, Data Science, NLP, Python.

Top 10 Data Science Skills for Product Managers

Top 10 data science skills for product managers

In this post, you will learn about some of the top data science skills / concepts that product managers / business analysts may need in order to create useful machine learning based solutions. Here are some of the topics / concepts which product managers / business analysts need to understand well in order to tackle day-to-day challenges while working with data science / machine learning teams. Knowing these concepts will help product managers / business analysts acquire enough skills to solve machine learning based problems. Understanding the difference between AI, machine learning, data science, deep learning Which problems are machine learning problems? …

Continue reading

Posted in AI, Data Science, Machine Learning, Product Management.

Python – Extract Text from PDF file using PDFMiner

In this post, you will get a quick code sample on how to use PDFMiner, a Python library, to extract text from PDF files and perform text analysis. I will be publishing several other posts on how to use other Python libraries for extracting text from PDF files. In this post, the following topics will be covered: How to set up PDFMiner Python code for extracting text from PDF file using PDFMiner Setting up PDFMiner Here is how you would set up PDFMiner.six. You could execute the following command to get set up with PDFMiner while working in Jupyter notebook: Python Code for Extracting Text from PDF file …

Continue reading

Posted in AI, NLP, Python.

NLTK Hello World Python Example

In this post, you will learn about getting started with natural language processing (NLP) using NLTK (Natural Language Toolkit), a platform for working with human languages in Python. The post is titled hello world because it helps you get started with NLTK while also learning some important aspects of processing language. In this post, the following will be covered: Install / Set up NLTK Common NLTK commands for language processing operations Install / Set up NLTK This is what you need to do to set up NLTK. Make sure you have a recent version of Python set up, as NLTK requires Python version 3.5, 3.6, 3.7, or 3.8. In Jupyter notebook, you could execute …

Continue reading

Posted in AI, NLP.

8 Key AI Challenges for Telemedicine / Telehealth

In this post, you will learn about some of the key challenges of implementing Telemedicine / Telehealth. If you work in the field of data science / machine learning, you may want to go through some of these challenges, primarily AI related, which have arisen in the Telemedicine domain due to the upsurge in the need for reliable Telemedicine services. Here are the slides I recently presented at the Digital Data Science Conclave hosted by KIIT University. The primary focus is to make sure appropriate controls are in place for the responsible use of AI (Responsible AI). Here are the top 8 challenges which need to be addressed to take full advantage of AI, RPA …

Continue reading

Posted in AI, Data Science, Healthcare, Machine Learning, Telemedicine.

Random Forest Classifier Python Code Example

Random forest classifier using python sklearn library

In this post, you will learn how to train a Random Forest Classifier using the Python Sklearn library. This code will be helpful if you are a beginner data scientist or just want a quick code sample to get started with training a machine learning model using the Random Forest algorithm. The following topics will be covered: Brief introduction to Random Forest Python code example for training a random forest classifier Brief Introduction to Random Forest Classifier A random forest can be considered an ensemble of several decision trees. The idea is to aggregate the prediction outcomes of multiple decision trees and create a final outcome based on an averaging mechanism …
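A minimal training sketch with scikit-learn might look like the following. It is not the code from the post: the Iris data set, the 70/30 split, and the hyperparameter values are assumptions chosen only to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: the built-in Iris data set is used purely for illustration.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows to check how the model does on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# n_estimators is the number of decision trees in the ensemble.
forest = RandomForestClassifier(n_estimators=100, random_state=1)
forest.fit(X_train, y_train)

# score() returns the mean accuracy on the given test data.
test_accuracy = forest.score(X_test, y_test)
print(f"trees: {len(forest.estimators_)}, test accuracy: {test_accuracy:.3f}")
```

The fitted trees are available in forest.estimators_, which makes it easy to see that the final prediction comes from many individual trees rather than a single one.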

Continue reading

Posted in AI, Data Science, Machine Learning, Python.

Decision Tree Classifier Python Code Example

Decision tree decision boundaries

In this post, you will learn how to train a decision tree classifier machine learning model using Python. The following points will be covered in this post: What is a decision tree? Decision tree Python code sample What is Decision Tree? Simply speaking, the decision tree algorithm breaks the data points into decision nodes, resulting in a tree structure. The decision nodes represent the questions based on which the data is split further into two or more child nodes. The tree is grown until the data points at a specific child node are pure (all data belongs to one class). The criterion for creating the optimal decision questions is …
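To make the idea of a decision question and a pure child node concrete, here is a small plain-Python sketch. It is not a decision tree implementation and does not show how the best question is chosen; the data points, the threshold, and the helper names are all hypothetical.

```python
def is_pure(labels):
    """A node is pure when all of its data points belong to one class."""
    return len(set(labels)) <= 1


def split(points, feature_index, threshold):
    """Apply one decision question: is the feature value <= threshold?"""
    left = [(features, label) for features, label in points
            if features[feature_index] <= threshold]
    right = [(features, label) for features, label in points
             if features[feature_index] > threshold]
    return left, right


# Hypothetical data: one numeric feature and a class label per point.
points = [((1.0,), "A"), ((2.0,), "A"), ((3.0,), "B"), ((4.0,), "B")]

# The root node mixes both classes, so it is not pure.
root_pure = is_pure([label for _, label in points])

# Asking "is feature 0 <= 2.5?" separates the two classes completely.
left, right = split(points, feature_index=0, threshold=2.5)
left_pure = is_pure([label for _, label in left])
right_pure = is_pure([label for _, label in right])

print(root_pure, left_pure, right_pure)
```

Once both child nodes are pure, there is nothing left to split, which is the stopping condition described above.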

Continue reading

Posted in AI, Data Science, Machine Learning, Python.

SVM RBF Kernel Parameters with Code Examples

SVM RBF Kernel Parameters - Gamma and C values

In this post, you will learn about SVM RBF (Radial Basis Function) kernel hyperparameters with a Python code example. The following are the two hyperparameters which you need to know while training a machine learning model with SVM and the RBF kernel: Gamma and C (also called the regularization parameter). Knowing the concepts behind SVM parameters such as Gamma and C used with the RBF kernel will enable you to select appropriate values of Gamma and C and train an optimal model using the SVM algorithm. Let’s understand why we should use kernel functions such as RBF. Why use RBF Kernel? When the data set is linearly inseparable or in other words, the …
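To get a feel for what Gamma does, the sketch below evaluates the RBF kernel by hand in plain Python. It assumes the commonly used definition K(x, y) = exp(-gamma * ||x - y||^2); the sample points and gamma values are arbitrary, and C is not covered here because it affects training rather than the kernel itself.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value for two points, assuming exp(-gamma * squared distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two arbitrary points whose squared distance is 2.
x = (0.0, 0.0)
y = (1.0, 1.0)

# A small gamma treats the two points as still quite similar,
# while a large gamma makes the similarity drop off very quickly.
low_gamma_similarity = rbf_kernel(x, y, gamma=0.1)
high_gamma_similarity = rbf_kernel(x, y, gamma=10.0)

print(low_gamma_similarity, high_gamma_similarity)
```

In practice this means a large Gamma lets each training point influence only its close neighbourhood, whereas a small Gamma spreads that influence over a wider region.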

Continue reading

Posted in AI, Data Science, Machine Learning.

Machine Learning – SVM Kernel Trick Example

In this post, you will learn what kernel methods, the kernel trick, and kernel functions are in the context of the Support Vector Machine (SVM) algorithm. A good understanding of kernel functions in relation to the SVM machine learning (ML) algorithm will help you build/train an optimal ML model by using the appropriate kernel functions. There are out-of-the-box kernel functions such as the following which can be applied for training models using the SVM algorithm: Polynomial kernel Gaussian kernel Radial basis function (RBF) kernel Sigmoid kernel The following topics will be covered: Background – Why Kernel concept? What is a kernel method? What is the kernel trick? What are …
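One way to see the kernel trick in action is to compare a kernel function with the explicit feature mapping it replaces. The sketch below does this for a degree-2 polynomial kernel on 2-D points, K(x, y) = (x . y)^2; the choice of kernel, the mapping phi, and the sample points are assumptions for illustration and are not taken from the post.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit degree-2 feature map for a 2-D point (x1, x2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def polynomial_kernel(x, y):
    """Degree-2 polynomial kernel computed directly in the original 2-D space."""
    return dot(x, y) ** 2


x = (1.0, 2.0)
y = (3.0, 4.0)

# Route 1: map both points to 3-D first, then take the dot product there.
explicit_value = dot(phi(x), phi(y))

# Route 2: the kernel trick, which never builds the 3-D vectors.
kernel_value = polynomial_kernel(x, y)

print(explicit_value, kernel_value)
```

Both routes give the same number, but the kernel only ever works with the original 2-D points, which is what makes high-dimensional feature spaces affordable for SVMs.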

Continue reading

Posted in AI, Data Science, Machine Learning.

How to Know if Data is Linear or Non-linear

Non-linear data set

In this post, you will learn techniques for determining whether a given data set is linear or non-linear. Depending on the type of machine learning problem (such as classification or regression) you are trying to solve, you can apply different techniques to determine whether the data set is linear or non-linear. For a data scientist, it is very important to know whether the data is linear or not, as it helps in choosing appropriate algorithms to train a high-performance model. You will learn techniques such as the following for determining whether the data is linear or non-linear: Use scatter plot when dealing with classification problems Use …

Continue reading

Posted in AI, Data Science, Machine Learning.

Sklearn SVM Classifier using LibSVM – Code Example

In this post, you will learn about the Sklearn LibSVM implementation used for training an SVM classifier, with a code example. Here is a great guide for learning SVM classification, especially for beginners in the field of data science/machine learning. LIBSVM is a library for Support Vector Machines (SVM) which provides implementations of the following: C-SVC (Support Vector Classification) nu-SVC epsilon-SVR (Support Vector Regression) nu-SVR Distribution estimation (one-class SVM) In this post, you will see code examples for the C-SVC and nu-SVC LIBSVM implementations. I will follow up with code examples for SVR and distribution estimation in future posts. Here are the links to their SKLearn pages for C-SVC and nu-SVC …

Continue reading

Posted in AI, Data Science, Machine Learning, Python.