Author Archives: Ajitesh Kumar

Ajitesh Kumar

I have recently been working in the area of data analytics, including data science and machine learning / deep learning. I am also passionate about different technologies, including programming languages such as Java/JEE, JavaScript, Python, R, and Julia, and technologies such as blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, and big data. I would love to connect with you on LinkedIn. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking.

Data Warehouse vs. Data Lake – Differences, Examples


When it comes to data storage, there are two distinct types of solutions that you can use: a data warehouse and a data lake. Both have their own benefits, but it’s important to understand the key differences between them so that you can choose the best option for your needs. Let’s take a closer look at what makes each solution unique.

What is a Data Warehouse?

A data warehouse is an electronic storage system used for reporting and analysis. Data warehouses store data in a structured (row-column) format. A data warehouse typically contains aggregated collections of data from multiple sources, brought together in one database. A data warehouse …

Posted in Data, Data lake, Data Science, Data Warehouse.

Different types of Clustering in Machine Learning


Clustering is an unsupervised machine learning technique used to group data points into distinct clusters. It is one of the most widely used techniques in machine learning and can be applied to tasks such as grouping customers by their buying habits, creating groups of similar documents, or finding groups of related genes. In this blog post, we will explore different types (categories) of clustering methods and discuss why they are so important in the field of machine learning.

Prototype-based Clustering

Prototype-based clustering is one of the categories of clustering algorithms used to identify groups within a larger dataset. This …

Continue reading

Posted in Machine Learning. Tagged with , , .

Python Pickle Example: What, Why, How


Have you ever heard of the term “Python Pickle”? If not, don’t feel bad; it can be a confusing concept. However, it is a powerful tool that all data scientists, Python programmers, and web application developers should understand. In this article, we’ll break down what exactly pickling is, why it’s so important, and how to use it in your projects.

What is Python Pickle?

In its simplest form, pickling is the process of converting a Python object into a byte stream (a sequence of bytes). This byte stream can then be transmitted over a network or stored in a file for later use. It’s like putting the object into an envelope and …

Posted in Data Science, Machine Learning, Python.
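
As a rough illustration of the pickling idea in the excerpt above, here is a minimal sketch using Python's standard `pickle` module. The `record` dictionary is a made-up object used only for this example and is not taken from the post.

```python
import pickle

# Hypothetical object, used only for illustration
record = {"model": "demo", "weights": [0.1, 0.2, 0.3]}

# Pickling: convert the object into a byte stream
payload = pickle.dumps(record)

# Unpickling: rebuild an equal object from the byte stream
restored = pickle.loads(payload)

print(type(payload).__name__)
print(restored == record)
```

The same bytes could be written to a file with `pickle.dump` and read back with `pickle.load`. Unpickling can execute arbitrary code, so it is generally safest to load only data from sources you trust.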

Designing & Building Data Products – Best Practices


For those in the analytics industry, designing and building data products is a critical part of the job. It’s important to understand how to design and build data products that are useful, efficient, effective, and loved by end customers. In this blog post, we will discuss some best practices for designing and developing innovative data products. Keeping these best practices in mind when developing data products and solutions can help ensure your product is successful.

Call out Decision – Action – Outcome Hypothesis

It is important to call out decision-action-outcome hypotheses when building data products because they serve as a blueprint for designing, testing …

Posted in Data, Product Management.

Top 10 Basic Computer Science Topics to Learn


Computer science is an expansive field with a variety of areas that are worth exploring. Whether you’re just starting out or already have some experience in computer science, there are certain topics that every aspiring software engineer should understand. This blog post will cover the basic computer science topics that are essential for any software engineer or programmer to know.

Computer Architecture

Computer architecture is a course of study that explores the fundamental elements of computer building and design. It’s an important field of study for software engineers to understand, since it provides basic principles and concepts related to hardware and software interactions. Computer architecture courses typically cover a …


Posted in Data Science, Software Engg.

Free Datasets for Machine Learning & Deep Learning


Are you looking for free, popular datasets to use for your machine learning or deep learning project? Look no further! In this blog post, we will provide an overview of some of the best free datasets available for machine learning and deep learning. These datasets can be used to train and evaluate your models, and many of them contain a wealth of valuable information that can be used to address a wide range of real-world problems. So, let’s dive in and take a look at some of the top free datasets for machine learning and deep learning! Here is the list of free datasets for machine learning & …

Posted in Data Science, Deep Learning, Machine Learning.

Challenges for Machine Learning / AI Projects


In this post, you will learn about some of the key challenges in implementing AI / machine learning (ML) or data science projects successfully in a consistent and sustained manner. As AI / ML project stakeholders, including senior management, data science architects, and product managers, you must get a good understanding of what it would take to successfully execute AI / ML projects and create value for the customers and the business. Whether you are building AI / ML products or enabling unique models for your clients in a SaaS setup, you will come across most of these challenges.

Understanding the Business Problem

Many times, the nature …

Posted in AI, Machine Learning.

Difference between Online & Batch Learning


In this post, you will learn about the concepts of, and differences between, online learning and batch (offline) learning, that is, whether or not machine learning models in production learn incrementally from the stream of incoming data. It is one of the most important aspects of designing machine learning systems. Data science architects need a good understanding of when to go for online learning and when to go for batch or offline learning.

Why online learning vs batch or offline learning?

Before we get into the concepts of batch and online learning, let’s understand why we need different types of model training or learning …

Posted in Data Science, Machine Learning.

Most Common Machine Learning Tasks


This article presents some of the most common machine learning tasks that one may come across while trying to solve machine learning problems. Also listed is a set of machine learning methods that could be used to address these tasks. Please feel free to comment or suggest if I missed one or more important points. Also, sorry for the typos. You might want to check out the post “What is machine learning?”, where different aspects of machine learning concepts are explained with the help of examples. Here is an excerpt from the page: Machine learning is about approximating mathematical functions (equations) representing real-world scenarios. These mathematical functions are also referred …

Posted in AI, Big Data, Data Science, Machine Learning.

Moving Average Method for Time-series forecasting


In this post, you will learn about the moving average method in relation to time-series forecasting. You will also see Python examples of training a moving average model. The following topics are covered in this post: what the moving average method is, why to use it, and a Python code example.

What is Moving Average method?

The moving average is a statistical method used for forecasting long-term trends. The technique takes the average of a set of numbers within a fixed-size range (window) and then moves that range along the series. For example, let’s say …

Posted in Data Science, Machine Learning.
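
The idea of averaging over a range while moving the range, described in the excerpt above, can be sketched in a few lines of plain Python. The function name `moving_average`, the sample `series`, and the window size of 3 are illustrative assumptions and are not taken from the post.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` for a given window size."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    # Slide a fixed-size window over the series and average each slice
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical series, used only for illustration
series = [10, 20, 30, 40, 50]
averages = moving_average(series, window=3)
print(averages)  # [20.0, 30.0, 40.0]
```

A naive forecast for the next period under this method is simply the last computed average (here, the final element of `averages`).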

Data Warehouse Concepts & Examples


A data warehouse is a system used for reporting and data analysis, and is considered a core component of business intelligence. Data warehouses are centralized repositories of integrated data from one or more disparate sources. They store current and historical data in a single place that can be used to answer business questions. Data warehouses support business intelligence applications, which are used to make decisions about the operation of the business. A data warehouse is usually populated with data from an operational database, which contains transactions. The process of populating the data warehouse is called Extract, Transform, and Load (ETL). This process cleans, transforms, and …

Posted in Data, Data Warehouse.
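
To make the Extract, Transform, and Load flow mentioned in the excerpt above more concrete, here is a minimal sketch using only Python's standard library. Everything in it is an assumption made for illustration: the `source_rows` records stand in for an operational database, the `fact_orders` table name is made up, and an in-memory SQLite database plays the role of the warehouse.

```python
import sqlite3

# Hypothetical rows standing in for an operational (transactional) source
source_rows = [
    {"order_id": 1, "customer": " alice ", "amount": "120.50"},
    {"order_id": 2, "customer": "BOB", "amount": "80.00"},
    {"order_id": 3, "customer": "alice", "amount": None},  # incomplete row
]

def extract(rows):
    """Extract: read raw records from the source."""
    return list(rows)

def transform(rows):
    """Transform: drop incomplete rows, normalize names, convert types."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue
        cleaned.append(
            (row["order_id"], row["customer"].strip().lower(), float(row["amount"]))
        )
    return cleaned

def load(rows, connection):
    """Load: write the cleaned rows into a warehouse-style table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    connection.commit()

# In-memory SQLite database used as a stand-in for the warehouse
warehouse = sqlite3.connect(":memory:")
load(transform(extract(source_rows)), warehouse)

# A simple reporting query over the loaded data
totals = warehouse.execute(
    "SELECT customer, SUM(amount) FROM fact_orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Real ETL pipelines typically add scheduling, incremental loads, and error handling, which this sketch leaves out.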

Data Models Types, Uses & Examples


A data model is a collection of concepts that can be used to describe the structure of a database. When it comes to data modeling, there are several different types of models that data analysts and data modelers can use, and each has its own strengths and weaknesses. Some of the most popular types of data models are the relational model, the dimensional model, and the hierarchical model. In this blog post, we will provide a brief overview of the different types of data models and when you might use each one, with the help of real-world examples.

The Relational Data Model

The …


Posted in Data.

Drivetrain Approach for Machine Learning


In this post, you will learn about a very popular approach or methodology called the Drivetrain approach, coined by Jeremy Howard. The approach provides steps for designing data products that deliver actionable outcomes while using one or more machine learning models. The approach is useful for data scientists and machine learning enthusiasts at all levels, and it should prove to be a great guide for data science architects, whose key responsibilities include designing data products. Without further ado, let’s do a deep dive.

Why Drivetrain Approach?

Before getting into the drivetrain approach and understanding the basic concepts, let’s understand why we need the drivetrain approach in the first place. …

Posted in Data Science, Machine Learning.

Machine Learning Models Evaluation Techniques


Machine learning is a powerful technique that can be used to develop predictive models for different types of data. It has become the backbone of many intelligent applications, and evaluating machine learning model performance at regular intervals is key to the success of such applications. A machine learning model’s performance depends on several factors, including the type of algorithm used and how well it was trained. In this blog post, we will discuss essential techniques for evaluating machine learning model performance, along with some best practices for working with machine learning models. The following are different techniques that can be used for evaluating machine learning …

Posted in Data Science, Machine Learning.

Machine Learning Programming Languages List


If you’re interested in pursuing a career in machine learning, you’ll need to have a firm grasp of at least one programming language. But with so many languages to choose from, which one should you learn? Here are three of the most popular machine learning programming languages, along with a brief overview of each.

Python

Python is a programming language with many features that make it well suited for machine learning. It has a large and active community of developers who have contributed a wide variety of libraries and tools. Python’s syntax is relatively simple and easy to learn, making it a good choice for people who are new to …

Posted in Machine Learning, Programming, Python, R.

NoSQL Databases List & Examples


With the proliferation of big data, there has been a corresponding increase in the number of NoSQL databases. For those who are new to the term, NoSQL databases are non-relational databases that are designed to handle large amounts of data. They are a newer alternative to traditional relational databases and are designed to provide more flexibility and scalability. In this blog post, we will take a look at some of the most popular NoSQL databases. NoSQL databases are often used for big data applications that require real-time analysis or for applications that need to handle a large number of concurrent users. While NoSQL databases can offer …


Posted in Big Data, Database, NoSQL.