Author Archives: Ajitesh Kumar

Ajitesh Kumar

I have recently been working in the area of data analytics, including data science and machine learning / deep learning. I am also passionate about different technologies, including programming languages such as Java/JEE, JavaScript, Python, R, and Julia, and technologies such as blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, and big data. I would love to connect with you on LinkedIn. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking.

Data Science vs Data Engineering Team – Have Both?

December 11, 2020 by Ajitesh Kumar

In this post, you will learn about different aspects of data science and data engineering teams and understand the key differences between them. As a data science / engineering stakeholder, it is important to understand whether you need one or both teams to achieve high-quality datasets and data pipelines as well as high-performing machine learning models. Background: When an organization starts on the journey of building data analytics products, primarily based on predictive analytics, it typically sets up a (mostly centralized) data science team consisting of data scientists. The data science team works with one or more product teams to gather the …

Continue reading →

Posted in data engineering, Data Science. Tagged with data engineering, Data Science.

500+ Machine Learning Interview Questions

December 6, 2020 by Ajitesh Kumar

This post collects all the posts on this website that contain interview questions / quizzes on data science / machine learning topics. These questions can prove helpful for product managers and data scientists. Product Managers Interview Questions: Find the questions for product managers on this page: Machine learning interview questions for product managers. Data Scientists Interview Questions: Here are posts representing 500+ interview questions which will be helpful for data scientists / machine learning engineers. You will find them useful as practice questions and answers while preparing for a machine learning interview: Decision tree questions; Machine learning validation techniques questions; Neural networks questions – …

Continue reading →

Posted in Data Science, Interview questions, Machine Learning. Tagged with Data Science, Interview questions, machine learning.

Spacy Tokenization Python Example

December 4, 2020 by Ajitesh Kumar

In this post, you will quickly learn how to use spaCy for reading and tokenizing a document read from a text file or otherwise. As a data scientist starting on NLP, this is one of the first pieces of code you will write to read text using spaCy. First and foremost, make sure you are set up with spaCy and have loaded the English tokenizer. The following commands help you set up in a Jupyter notebook. Reading text using spaCy: Once you are set up with spaCy and have loaded the English tokenizer, the following code can be used to read the text from the text file and tokenize the text into words. Pay attention …

Continue reading →

Posted in Data Science, NLP. Tagged with Data Science, nlp.

Top 10 Types of Analytics Projects – Examples

December 3, 2020 by Ajitesh Kumar

In this post, you will learn about some of the most common types of data analytics projects which an organization can execute to realise the associated business value and gain competitive advantage in the related business functions. Note that analytics projects are different from AI / ML projects. AI / ML, or predictive analytics, is one part of analytics. Other types of analytics projects include those related to descriptive and prescriptive analytics. You may want to check out one of my related posts on the difference between predictive and prescriptive analytics. Here are the key areas of focus for data analytics projects: Cost reduction: …

Continue reading →

Posted in Analytics. Tagged with analytics, data analytics.

Predictive vs Prescriptive Analytics Difference

December 2, 2020 by Ajitesh Kumar · 1 Comment

In this post, you will quickly learn about the difference between predictive analytics and prescriptive analytics. As a data analytics stakeholder, you need a good understanding of these concepts in order to decide when to apply predictive and when to apply prescriptive analytics in analytics solutions / applications. Without further ado, let’s get straight to the diagram. In the above diagram, you can observe the following: Predictive analytics: In predictive analytics, the model is trained using historical / past data based on supervised, unsupervised, or reinforcement learning algorithms. Once trained, the new data / observation is input to the trained model. The output of the model is a prediction in the form …

Continue reading →

Posted in AI, Analytics, Machine Learning. Tagged with ai, data analytics, machine learning.

Negative Binomial Distribution Python Examples

November 24, 2020 by Ajitesh Kumar

In this post, you will learn about the concepts of the negative binomial distribution, explained using real-world examples and Python code. We will go over the following topics to understand the negative binomial distribution: What is the negative binomial distribution? What is the difference between the binomial and negative binomial distributions? Negative binomial distribution real-world examples. Negative binomial distribution Python example. What is Negative Binomial Distribution? The negative binomial distribution is a discrete probability distribution of a random variable, X, which is the number of Bernoulli trials required to obtain r successes. This random variable is called a negative binomial random variable. The experiment representing the X Bernoulli trials required to produce r successes is called …

Continue reading →

Posted in statistics. Tagged with statistics.
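
The definition in the excerpt above (X is the number of Bernoulli trials needed to reach r successes) can be sketched in plain Python. This is a minimal illustration rather than the code from the full post; the function name and the coin-toss numbers are assumptions made for the example. Note that some libraries parameterize the distribution by the number of failures rather than the number of trials.

```python
from math import comb


def negative_binomial_pmf(n: int, r: int, p: float) -> float:
    """P(X = n): probability that the r-th success occurs exactly on trial n.

    n: total number of Bernoulli trials, r: required successes,
    p: success probability of a single trial.
    """
    if n < r:
        return 0.0  # r successes cannot occur in fewer than r trials
    # Trial n must be a success; the other r - 1 successes can fall
    # anywhere among the first n - 1 trials.
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)


# Hypothetical example: chance that the 3rd head of a fair coin
# arrives exactly on the 5th toss.
prob = negative_binomial_pmf(5, 3, 0.5)
print(prob)
```

Summing the function over all n from r upward should approach 1, which is a quick sanity check that it is a valid probability mass function.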

NLTK – How to Read & Process Text File

November 22, 2020 by Ajitesh Kumar · 1 Comment

In this post, you will learn how to read one or more text files using NLTK and process the words contained in them. For data scientists starting to work on NLP, the Python code sample for reading multiple text files from local storage will be very helpful. Python Code Sample for Reading Text File using NLTK: Here is the Python code sample for reading one or more text files. Pay attention to the following aspects: The class nltk.corpus.PlaintextCorpusReader is used for reading the text file. The constructor takes input parameters such as the corpus root and the regular expression representing the files. The list of files that are read can be found using a method such as fileids. List …

Continue reading →

Posted in AI, NLP. Tagged with ai, nlp.

Geometric Distribution Explained with Python Examples

November 11, 2020 by Ajitesh Kumar

In this post, you will learn about the concepts of the geometric probability distribution with the help of real-world examples and Python code examples. It is of utmost importance for data scientists to understand and get an intuition for different kinds of probability distributions, including the geometric distribution. You may want to check out the following posts on other probability distributions: Normal distribution explained with Python examples; Binomial distribution explained with 10+ examples; Hypergeometric distribution explained with 10+ examples. In this post, the following topics have been covered: Geometric probability distribution concepts; Geometric distribution Python examples; Geometric distribution real-world examples. Geometric Probability Distribution Concepts: Geometric probability distribution is a discrete …

Continue reading →

Posted in Data Science, statistics. Tagged with Data Science, statistics.
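
The excerpt above is cut off before it defines the distribution, so the following is only a sketch under a stated assumption: the common convention in which the geometric random variable counts the number of Bernoulli trials up to and including the first success. The function name and the dice example are hypothetical and are not taken from the full post.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k): probability that the first success occurs on trial k.

    Assumes the "number of trials" convention (k starts at 1).
    """
    if k < 1:
        return 0.0  # the first success cannot occur before trial 1
    # k - 1 failures in a row, followed by one success
    return (1 - p) ** (k - 1) * p


# Hypothetical example: probability that the first six of a fair die
# shows up exactly on the 3rd roll.
p_six = 1 / 6
first_six_on_third_roll = geometric_pmf(3, p_six)
print(first_six_on_third_roll)
```

Under this convention, the geometric distribution is the special case of the negative binomial distribution with r = 1 success.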

Top 10 Analytics Strategies for Great Data Products

November 9, 2020 by Ajitesh Kumar

In this post, you will learn about the top 10 data analytics strategies which will help you create successful data products. These strategies will be helpful if you are setting up a data analytics practice or center of excellence (COE). As an AI / machine learning / data science stakeholder, it is important to understand these strategies in order to deliver analytics solutions that create business value with positive business impact. Here are the top 10 data analytics strategies: Identify top 2-3 business problems; Identify related business / engineering organizations; Create measurement plan by identifying right KPIs; Identify analytics deliverables such as analytics reports, predictions etc; Gather data …

Continue reading →

Posted in Analytics, Data Science, Machine Learning. Tagged with analytics, Data Science, machine learning.

Keras CNN Image Classification Example

November 6, 2020 by Ajitesh Kumar

In this post, you will learn how to train a Keras convolutional neural network (CNN) for image classification. Before looking at the Python / Keras code examples and related concepts, you may want to check my post Convolutional Neural Network (CNN) – Simply Explained to get a good understanding of CNN concepts. Keras CNN Image Classification Code Example: First and foremost, we need to get the image data for training the model. In this post, the Keras CNN used for image classification is trained on the Kaggle Fashion MNIST dataset. Fashion-MNIST is a dataset of Zalando’s article images, consisting of a training set of 60,000 examples and a …

Continue reading →

Posted in Data Science, Deep Learning, Machine Learning. Tagged with Data Science, Deep Learning, keras, machine learning, python.

Data Quality Challenges for Machine Learning Models

November 3, 2020 by Ajitesh Kumar

In this post, you will learn about some of the key data quality challenges which need to be dealt with in a consistent and sustained manner to ensure high-quality machine learning models. Note that high-quality models are models which generalize better (lower true error in predictions) on unseen data or data derived from a larger population. As a data science architect or quality assurance (QA) professional dealing with the quality of machine learning models, you must learn about these challenges and plan appropriate development processes to deal with them. Here are some of the key data quality challenges which need to be tackled appropriately in …

Continue reading →

Posted in Data Science, Machine Learning, QA. Tagged with Data Science, machine learning, quality assurance.

Convolutional Neural Network (CNN) – Simply Explained

November 2, 2020 by Ajitesh Kumar · 1 Comment

In this post, you will learn about the basic concepts of the convolutional neural network (CNN), explained with examples. As a data scientist or machine learning / deep learning enthusiast, you must get a good understanding of convolutional neural networks, as there are many applications of CNNs. Before getting into the details of CNNs, let’s understand the meaning of convolution in a convolutional neural network. What’s Convolution? What’s the intuition behind Convolution? Convolution is a mathematical operation on two functions. Just as mathematical operations such as addition or multiplication can be applied to two functions, the convolution operation can be applied to two functions. Mathematically, the convolution of two different …

Continue reading →

Posted in Deep Learning. Tagged with Deep Learning.
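
To make the idea of convolution as an operation on two functions concrete, here is a minimal sketch of the discrete one-dimensional case in plain Python. The function name and the sample sequences are assumptions chosen for illustration, not the example from the full post.

```python
def convolve(f, g):
    """Full discrete convolution of two finite sequences.

    out[n] = sum over i of f[i] * g[n - i]
    """
    out = [0.0] * (len(f) + len(g) - 1)
    for i, f_value in enumerate(f):
        for j, g_value in enumerate(g):
            # Every pair of elements contributes to output index i + j
            out[i + j] += f_value * g_value
    return out


# Hypothetical signal and kernel
signal = [1, 2, 3]
kernel = [0, 1, 0.5]
result = convolve(signal, kernel)
print(result)  # [0.0, 1.0, 2.5, 4.0, 1.5]
```

Deep learning libraries typically implement the closely related cross-correlation (the kernel is not flipped) in their convolution layers; because the kernel values are learned, the distinction rarely matters in practice.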

Data Quality Assessment Frameworks – Machine Learning

November 1, 2020 by Ajitesh Kumar

In this post, you will learn about data quality assessment frameworks / techniques in relation to machine learning and why one needs to assess data quality for building high-performance machine learning models. As a data science architect or development manager, you must get a sense of the importance of data quality in building high-performance machine learning models. The idea is to understand the value of a data set. The goal is to determine whether the value of the data can be quantified. This is because it is important to understand whether the data contains rich information which could be valuable for building models and inform stakeholders on data …

Continue reading →

Posted in Data Science, Machine Learning.

Keras Neural Network for Regression Problem

October 30, 2020 by Ajitesh Kumar

In this post, you will learn how to train a neural network for regression machine learning problems using Python Keras. Regression problems are those which involve predicting a continuous numerical value based on input parameters / features. You may want to check out the following posts on how to use Keras to train a neural network for classification problems: Keras – How to train neural network to solve multi-class classification; Keras – How to use learning curve to select most optimal neural network configuration for training classification model. In this post, the following topics are covered: Design Keras neural network architecture for regression; Keras neural network …

Continue reading →

Posted in Data Science, Deep Learning. Tagged with Deep Learning, keras, python.

Keras – Categorical Cross Entropy Loss Function

October 28, 2020 by Ajitesh Kumar

In this post, you will learn when to use the categorical cross entropy loss function when training a neural network using Python Keras. Generally speaking, the loss function is used to compute the quantity that the model should seek to minimize during training. For regression models, the commonly used loss function is mean squared error, while for classification models predicting probabilities, the most commonly used loss function is cross entropy. In this post, you will learn about the different types of cross entropy loss function which are used to train a Keras neural network model. Cross entropy loss function is an optimization function which is used in case …

Continue reading →

Posted in Data Science, Deep Learning. Tagged with Deep Learning, keras.
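
The two loss functions named in the excerpt above can be sketched in plain Python for a single sample. This is an illustration of the underlying formulas, not Keras's implementation; the function names, the eps clipping value, and the sample probabilities are assumptions made for the example.

```python
from math import log


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross entropy between a one-hot label and predicted class probabilities."""
    # Clip probabilities away from 0 so that log() is always defined
    return -sum(t * log(max(p, eps)) for t, p in zip(y_true, y_pred))


def mean_squared_error(y_true, y_pred):
    """Average squared difference, the usual loss for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical 3-class sample whose true class is the second one
y_true = [0, 1, 0]
confident = [0.05, 0.90, 0.05]  # most of the probability on the true class
unsure = [0.30, 0.40, 0.30]     # probability spread across the classes

loss_confident = categorical_cross_entropy(y_true, confident)
loss_unsure = categorical_cross_entropy(y_true, unsure)
print(loss_confident < loss_unsure)  # True
```

With a one-hot label, only the predicted probability of the true class contributes to the loss, so the loss shrinks as the model puts more probability on the correct class.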

Python Keras – Learning Curve for Classification Model

October 28, 2020 by Ajitesh Kumar

In this post, you will learn how to train an optimal neural network using learning curves and Python Keras. As a data scientist, it is good to understand the concept of the learning curve for a neural network classification model so that you can select the best neural network configuration for training a high-performance model. In this post, the following topics have been covered: Concepts related to training a classification model using a neural network; Python Keras code for creating the most optimal neural network using a learning curve. Training a Classification Neural Network Model using Keras: Here are some of the key aspects of training a neural network classification model using Keras: …

Continue reading →

Posted in Data Science, Deep Learning, Machine Learning. Tagged with Deep Learning, keras, machine learning.
