Category Archives: Data Science

Python Pickle Example: What, Why, How

python pickle file example

Have you ever heard of the term “Python Pickle”? If not, don’t feel bad—it can be a confusing concept. However, it is a powerful tool that all data scientists, Python programmers, and web application developers should understand. In this article, we’ll break down what exactly pickling is, why it’s so important, and how to use it in your projects. What is Python Pickle? In its simplest form, pickling is the process of converting any object into a byte stream (a sequence of bytes). This byte stream can then be transmitted over a network or stored in a file for later use. It’s like putting the object into an envelope and …
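As a minimal sketch of the idea (the object and file name below are made up for illustration), the following pickles a Python dictionary to a file and loads it back:

```python
import pickle

# A hypothetical object to serialize -- any picklable Python object would work
model_config = {"learning_rate": 0.01, "n_estimators": 100, "features": ["age", "income"]}

# Pickling: convert the object into a byte stream and write it to a file
with open("model_config.pkl", "wb") as f:
    pickle.dump(model_config, f)

# Unpickling: read the byte stream back and reconstruct the original object
with open("model_config.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored == model_config)  # True
```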

Continue reading

Posted in Data Science, Machine Learning, Python.

Feature Importance & Random Forest – Python

Random forest for feature importance

In this post, you will learn how to use the Random Forest Classifier (RandomForestClassifier) to determine feature importance, using a Sklearn Python code example. This is useful for feature selection, i.e., finding the most important features when solving a classification machine learning problem. It is very important for data scientists to understand feature importance and feature selection techniques in order to use the most appropriate features for training machine learning models. Recall that other feature selection techniques include L-norm regularization and greedy search algorithms such as sequential backward / sequential forward selection. What & Why of Feature Importance? Feature importance is a key concept in machine learning that refers to the relative importance of each feature …
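As a rough sketch of what the post covers (the dataset and hyperparameters here are illustrative, not the post’s own example), Sklearn’s RandomForestClassifier exposes the feature_importances_ attribute after fitting:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Illustrative dataset; the full post walks through its own example
data = load_breast_cancer()
X, y = data.data, data.target

# Train the random forest classifier
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X, y)

# feature_importances_ holds the relative importance score of each feature
ranked = sorted(zip(data.feature_names, clf.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```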

Continue reading

Posted in Data Science, Machine Learning, Python.

Top 10 Basic Computer Science Topics to Learn

computer architecture - basic computer topics to learn

Computer science is an expansive field with a variety of areas that are worth exploring. Whether you’re just starting out or already have some experience in computer science, there are certain topics that every aspiring software engineer should understand. This blog post will cover the basic computer science topics that are essential for any software engineer or software programmer to know. Computer Architecture Computer architecture is a course of study that explores the fundamental elements of computer building and design. It’s an important field of study for software engineers to understand, since it provides basic principles and concepts related to hardware and software interactions. Computer architecture courses typically cover a …

Continue reading

Posted in Data Science, Software Engg.

Free Datasets for Machine Learning & Deep Learning

dataset publicly available free machine learning

Are you looking for free / popular datasets to use for your machine learning or deep learning project? Look no further! In this blog post, we will provide an overview of some of the best free datasets available for machine learning and deep learning. These datasets can be used to train and evaluate your models, and many of them contain a wealth of valuable information that can be used to address a wide range of real-world problems. So, let’s dive in and take a look at some of the top free datasets for machine learning and deep learning! Here is the list of free data sets for machine learning & …

Continue reading

Posted in Data Science, Deep Learning, Machine Learning.

Difference between Online & Batch Learning

online learning - machine learning system

In this post, you will learn about the concepts of, and differences between, online learning and batch (offline) learning, in relation to how machine learning models in production learn incrementally from a stream of incoming data or otherwise. It is one of the most important aspects of designing machine learning systems. Data science architects need a good understanding of when to go for online learning and when to go for batch or offline learning. Why online learning vs batch or offline learning? Before we get into the concepts of batch and online learning, let’s understand why we need different types of model training or learning …
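For a flavor of online learning (the data stream below is simulated, not taken from the post), Sklearn estimators such as SGDClassifier support incremental updates via partial_fit:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(42)
classes = np.array([0, 1])   # all classes must be declared up front for partial_fit
model = SGDClassifier()      # supports incremental (online) learning

# Each loop iteration mimics a new mini-batch arriving from a data stream
for _ in range(10):
    X_batch = rng.rand(32, 4)
    y_batch = (X_batch.sum(axis=1) > 2).astype(int)
    model.partial_fit(X_batch, y_batch, classes=classes)  # update the model incrementally
```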

Continue reading

Posted in Data Science, Machine Learning.

Large language models: Concepts & Examples

large language models concepts examples

Large language models (LLM) have been gaining traction in the world of natural language processing (NLP). These models are trained on large datasets, which contain hundreds of millions to billions of words. LLMs, as they are known, rely on complex algorithms that sift through large datasets and recognize patterns at the word level. This data helps the model better understand natural language and how it is used in context. Through this understanding, these models can generate more accurate results when processing text. Let’s take a deeper look into understanding large language models and why they are important. What are large language models (LLM) and how do they work? Large language …

Continue reading

Posted in Data Science, Machine Learning, NLP.

Most Common Machine Learning Tasks

common machine learning tasks

This article presents some of the most common machine learning tasks that one may come across while trying to solve machine learning problems. Also listed is a set of machine learning methods that could be used to resolve these tasks. Please feel free to comment/suggest if I missed mentioning one or more important points. You might want to check out the post on what machine learning is. Different aspects of machine learning concepts have been explained with the help of examples. Here is an excerpt from the page: Machine learning is about approximating mathematical functions (equations) representing real-world scenarios. These mathematical functions are also referred …

Continue reading

Posted in AI, Big Data, Data Science, Machine Learning.

Moving Average Method for Time-series forecasting

Moving average definition & examples

In this post, you will learn about the concepts of the moving average method in relation to time-series forecasting, along with Python examples of training a moving average model. The following are some of the topics covered in this post: What is the moving average method? Why use the moving average method? Python code example for the moving average method. What is the Moving Average method? The moving average is a statistical method used for forecasting long-term trends. The technique involves taking the average of a set of numbers within a given range (window) while sliding that range along the series. For example, let’s say …
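As a minimal sketch (the sales figures below are made up for illustration), a moving average can be computed in Python with pandas’ rolling window:

```python
import pandas as pd

# Illustrative monthly sales figures (invented numbers)
sales = pd.Series([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118])

# 3-period moving average: average each window of 3 consecutive values,
# sliding the window one step at a time along the series
moving_avg = sales.rolling(window=3).mean()
print(moving_avg)

# A simple (naive) forecast for the next period is the last window's average
forecast = sales.tail(3).mean()
print(f"Forecast for next period: {forecast:.1f}")
```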

Continue reading

Posted in Data Science, Machine Learning.

Gradient Boosting Algorithm: Concepts, Example

gradient boosting algorithm error vs iterations

If you are a data scientist or machine learning engineer, then you know that the Gradient Boosting Algorithm (GBA) is one of the most powerful algorithms for predicting outcomes from data. This algorithm has been proven to increase the accuracy of predictions and is becoming increasingly popular among data scientists. Let’s take a closer look at GBA and explore how it works with an example. What is a Gradient Boosting Algorithm? Gradient boosting is a machine learning technique used to build predictive models. It creates an ensemble of weak learners, meaning that it combines several smaller, simpler models in order to obtain a more accurate prediction than what an …
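A minimal sketch of the idea, using Sklearn’s GradientBoostingClassifier on synthetic data (the dataset and hyperparameters are assumptions for illustration, not the post’s own example):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic classification data, purely for illustration
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Each boosting stage fits a small tree to the errors of the ensemble built so far
gbc = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
gbc.fit(X_train, y_train)

print("Test accuracy:", accuracy_score(y_test, gbc.predict(X_test)))
```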

Continue reading

Posted in Data Science, Machine Learning.

Feature Scaling in Machine Learning: Python Examples

In this post, you will learn about a simple technique, namely feature scaling, along with Python code examples which you can use to improve machine learning models. The models will be trained using the Perceptron (single-layer neural network) classifier. First and foremost, let’s quickly understand what feature scaling is and why one needs it. What is Feature Scaling and Why does one need it? Feature scaling is a method used to standardize the range of independent variables or features of data. In data processing, it is also known as data normalization or standardization. Feature scaling is generally performed during the data pre-processing stage, before training models using machine learning algorithms. The goal is to …
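A minimal sketch of feature scaling, assuming the Iris dataset and Sklearn’s StandardScaler together with the Perceptron classifier mentioned above (the full post walks through its own example):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Perceptron

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

# Fit the scaler on the training data only, then apply the same transform to the test data
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)

# Train a single-layer Perceptron on the standardized features
clf = Perceptron(random_state=1)
clf.fit(X_train_std, y_train)
print("Test accuracy:", clf.score(X_test_std, y_test))
```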

Continue reading

Posted in AI, Data Science, Machine Learning.

Drivetrain Approach for Machine Learning

drivetrain approach for machine learning

In this post, you will learn about a very popular approach or methodology called the Drivetrain approach, coined by Jeremy Howard. The approach provides steps to design data products that deliver actionable outcomes while using one or more machine learning models. The approach is indeed very useful for data scientists / machine learning enthusiasts at all levels. However, it would prove to be a great guide for data science architects whose key responsibilities include designing data products. Without further ado, let’s do a deep dive. Why the Drivetrain Approach? Before getting into the drivetrain approach and understanding its basic concepts, let’s understand why the drivetrain approach is needed in the first place. …

Continue reading

Posted in Data Science, Machine Learning.

Machine Learning Models Evaluation Techniques

AUC-ROC curve

Machine learning is a powerful machine intelligence technique that can be used to develop predictive models for different types of data. It has become the backbone of many intelligent applications, and evaluating machine learning model performance at regular intervals is key to the success of such applications. A machine learning model’s performance depends on several factors, including the type of algorithm used, how well it was trained, and more. In this blog post, we will discuss essential techniques for evaluating machine learning model performance in order to provide you with some best practices when working with machine learning models. The following are different techniques that can be used for evaluating machine learning …
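As one hedged example of such a technique (the dataset and model below are placeholders, not the post’s own example), k-fold cross-validation with ROC AUC as the scoring metric can be run with Sklearn:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# 5-fold cross-validation, scoring each fold with the area under the ROC curve
model = LogisticRegression(max_iter=5000)
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print("ROC AUC per fold:", scores)
print("Mean ROC AUC:", scores.mean())
```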

Continue reading

Posted in Data Science, Machine Learning.

Data Preprocessing Steps in Machine Learning

data preprocessing in machine learning

Data preprocessing is an essential step in any machine learning project. By cleaning and preparing your data, you can ensure that your machine learning model is as accurate as possible. In this blog post, we’ll cover some of the important and most common data preprocessing steps that every data scientist should know. Replace/remove missing data Before building a machine learning model, it is important to preprocess the data and remove or replace any missing values. Missing data can cause problems with the model, such as biased results or inaccurate predictions. There are a few different ways to handle missing data, but the best approach depends on the situation. In some …
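A minimal sketch of handling missing data (the tiny DataFrame below is invented for illustration), showing both dropping rows and mean imputation with Sklearn’s SimpleImputer:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Small illustrative dataset with missing values (NaN)
df = pd.DataFrame({"age": [25, np.nan, 47, 31],
                   "income": [50000, 62000, np.nan, 58000]})

# Option 1: remove rows that contain missing values
df_dropped = df.dropna()

# Option 2: replace missing values with the column mean
imputer = SimpleImputer(strategy="mean")
df_imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
print(df_imputed)
```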

Continue reading

Posted in Data Science, Machine Learning.

Resume Screening using Machine Learning & NLP

resume screening and shortlisting using machine learning

In today’s job market, there are many qualified candidates vying for the same position. So, how do you weed out the applicants who are not a good fit for your company? One way to do this is by using machine learning and natural language processing (NLP) to screen resumes. By using machine learning and NLP to screen resumes, you can more efficiently identify candidates who have the skills and qualifications you are looking for. In this blog, we will learn different aspects of screening and selecting / shortlisting candidates for further processing using machine learning & NLP techniques.  Key Challenges for Resume Screening / Shortlisting Resume screening is the process …

Continue reading

Posted in Data Science.

Bagging vs Boosting Machine Learning Methods

boosting vs bagging differences examples

In machine learning, there are a variety of methods that can be used to improve the performance of your models. Two of the most popular methods are bagging and boosting. In this blog post, we’ll take a look at what these methods are and how they work with the help of examples. What is Bagging? Bagging, short for “bootstrap aggregating”, is a method that can be used to improve the accuracy of your machine learning models. The idea behind bagging is to train multiple models on different subsets of the data and then combine the predictions of those models. The data is split into a number of smaller datasets, or …
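A minimal sketch of bagging with Sklearn’s BaggingClassifier on synthetic data (the dataset and settings are assumptions, not the post’s own example); by default each of the 50 models is a decision tree trained on a bootstrap sample:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging: train many base models on bootstrap samples and combine their predictions
bagging = BaggingClassifier(n_estimators=50, random_state=0)
bagging.fit(X_train, y_train)
print("Bagging test accuracy:", bagging.score(X_test, y_test))
```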

Continue reading

Posted in Data Science, Machine Learning.

Generative vs Discriminative Models Examples

generative vs discriminative models

If you’re working in the field of machine learning, it’s important to understand the difference between generative and discriminative models. These two types of models are both used in supervised learning, but they approach the problem in different ways. In this blog post, we’ll take a look at what generative and discriminative models are, how they work, and some examples of each. What are Generative Models? Generative models are a type of machine learning algorithm that is used to generate new data samples based on a training set. For example, a generative model could be trained on a dataset of pictures of cats, and then used to generate new cat …
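A minimal sketch contrasting the two model types on the same data (the Iris dataset is an assumption for illustration): GaussianNB is a generative model, while LogisticRegression is discriminative:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB            # generative: models how the data is generated, P(X, y)
from sklearn.linear_model import LogisticRegression   # discriminative: models the decision boundary, P(y | X)

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, model in [("Generative (GaussianNB)", GaussianNB()),
                    ("Discriminative (LogisticRegression)", LogisticRegression(max_iter=1000))]:
    model.fit(X_train, y_train)
    print(name, "accuracy:", model.score(X_test, y_test))
```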

Continue reading

Posted in Data Science, Machine Learning.