Spend Analytics – 5 Ws of Spend Analysis

In this post, you will learn about the 5 Ws of spend analytics. If you are a procurement professional looking to understand spend analytics use cases, you may find this post useful. In simple terms, spend analytics is about extracting insights from spend across different procurement categories. What are we spending on? First and foremost, it is important to get visibility into which items we are spending on. This can be achieved using a dashboard. This form of analytics is also called descriptive analytics. Analyzing item spend can be termed item spend analytics. The items can be related to direct or indirect procurement. Indirect …

Python Scraper Code to Search Arxiv Latest Papers

In this post, you will learn about Python source code for searching Arxiv for relevant and recent machine learning and data science research papers. If you are looking for a faster way to research Arxiv papers without going to the Arxiv website, you may want to keep this piece of code handy. You can further automate the Arxiv search to get notified based on some logic. Without further ado, let’s get started. Step 1: Install Python Arxiv Library As a first step, install the Python Arxiv library using code such as the below in your Jupyter notebook or Google Colab instance: Step 2: Execute the …

Google News Search Python API Example

In this post, you will learn how to use the GoogleNews search Python library to retrieve news from Google News for the last N days. This would be very helpful for someone wanting to track new work or projects in machine learning, data science, deep learning, or any other field, including sports and politics. Without further ado, let's jump right in. You can log into Google Colab and practise the code. Step 1: First and foremost, let's install the GoogleNews Python library. Step 2: Instantiate the GoogleNews object. One can pass the language and period to instantiate the object. The parameter, period, represents the news …

Python – How to Create Dictionary using Pandas Series

In this post, you will learn about one of the fundamental Pandas data structures, the Series, and how it can be used as a dictionary. It will be useful for beginner data scientists who want to understand the Pandas Series object. A dictionary is a structure that maps arbitrary keys to a set of arbitrary values. A Pandas Series is a one-dimensional array of indexed data. It can be created using a list or an array. A Pandas Series can be thought of as a special case of a Python dictionary: it is a structure that maps typed keys to a set of typed values. Here are the three different ways in …
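The dictionary analogy can be illustrated with a minimal sketch. The variable names and the `a`/`b`/`c` data are placeholders chosen for illustration and do not come from the original post.

```python
import pandas as pd

# Placeholder data: dict keys become the Series index, dict values become the data.
scores = pd.Series({"a": 10, "b": 20, "c": 30})

# Dictionary-style operations work on a Series.
has_b = "b" in scores            # membership test runs against the index
b_value = scores["b"]            # key lookup
keys = list(scores.keys())       # index labels, in insertion order

# Unlike a plain dict, a Series also supports array-style slicing by label.
subset = scores["a":"b"]         # label slices include both endpoints

# Convert back to a regular dictionary when needed.
as_dict = scores.to_dict()
```

The label slice is the part a plain dictionary cannot do, which is why the Series is better described as a specialised dictionary than as a drop-in replacement for one.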

Support Vector Machine (SVM) Interview Questions – Set 1

This quiz consists of questions and answers on Support Vector Machine (SVM). It is a practice test (objective questions and answers) that can be useful when preparing for interviews. The questions in this and upcoming practice tests should be useful primarily for data scientists and machine learning interns, freshers, and beginners. The questions focus on the following areas: introduction to SVM; types of SVM such as maximum-margin classifier, soft-margin classifier, and support vector machine. Some of the key SVM concepts to understand while preparing for machine learning interviews are the following: SVM concepts and objective functions; SVM kernel functions and tricks; concepts of C and Gamma value; Scikit learn libraries for …

Free Online Books – Machine Learning with Python

This post lists free online books for machine learning with Python. These books cover topics related to machine learning, deep learning, and NLP. This post will be updated from time to time as I discover more books. Here are the titles of these books: Python Data Science Handbook; Building Machine Learning Systems with Python; Deep Learning with Python; Natural Language Processing with Python; Think Bayes; Scikit-learn tutorial – statistical learning for scientific data processing. Python Data Science Handbook covers topics such as the following: introduction to Numpy; data manipulation with Pandas; visualization with Matplotlib; machine learning topics (linear regression, SVM, random forest, principal component analysis, K-means clustering, Gaussian …

42 Free Online Books on Machine Learning & Data Science

This post presents a comprehensive list of 42 free books on machine learning that are available online for self-paced learning. This would be very helpful for data scientists starting to learn or gain expertise in machine learning / deep learning. Please feel free to comment or suggest if I have missed one or more important books that you like and would like to share. Also, sorry for the typos. The books are categorized under the following key areas: Pattern Recognition & Machine Learning; Probability & Statistics; Neural Networks & Deep Learning. List of 42 Online Free eBooks on Machine Learning Following is a list of 35 FREE online …

Great Site for Matrix Multiplication Demo

Here is a great website for a matrix multiplication demo. If you are a beginner data scientist, you will love this. Here is what the website looks like. It has just one page. It shows how multiplication happens given the different dimensions of the matrices. Here are a few other websites for understanding matrix multiplication concepts: Khan Academy – Matrix multiplication
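For readers who prefer code to a visual demo, here is a minimal pure-Python sketch of how the dimensions interact: an n x m matrix times an m x p matrix gives an n x p matrix. The function name `matmul` and the sample matrices are illustrative choices, not taken from the website mentioned above.

```python
def matmul(a, b):
    """Multiply an (n x m) matrix by an (m x p) matrix, both given as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("number of columns in a must equal number of rows in b")
    # Entry (i, j) is the dot product of row i of a and column j of b.
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


a = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
b = [[7, 8],
     [9, 10],
     [11, 12]]           # 3 x 2

product = matmul(a, b)   # 2 x 2 result: [[58, 64], [139, 154]]
```

In practice a library such as NumPy would be used for this; the nested loops are spelled out only to make the row-by-column rule visible.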

Different types of Machine Learning Problems

This post describes the most popular types of machine learning problems using multiple images. The following are the different types of machine learning problems: supervised learning, unsupervised learning, reinforcement learning, transfer learning, imitation learning, and meta-learning. In this post, the image shows supervised, unsupervised, and reinforcement learning. You may want to check the explanation in this YouTube lecture video. Unsupervised Learning Problems In unsupervised learning problems, the learning algorithm learns about the structure of the data from the given data set and generates fakes or insights. In the above diagram, you can see that what is given is the unlabeled dataset X. The unsupervised learning algorithm learns the structure of data …

Top 10+ Youtube AI / Machine Learning Courses

In this post, you get access to top free YouTube AI/machine learning courses. The courses are suitable for data scientists at all levels and cover the following areas: machine learning, deep learning, natural language processing (NLP), and reinforcement learning. Here are the details of the free machine learning / deep learning YouTube courses.

S.No | Title | Description | Type
1 | CS229: Machine Learning (Stanford) | Machine learning lectures by Andrew Ng; highly recommended if you are a beginner | Machine learning
2 | Applied machine learning (Cornell Tech CS 5787) | Covers all of the most important ML algorithms and how to apply them in practice. Includes 3 full lectures …

Difference between Online & Batch Learning

In this post, you will learn about the concepts of, and differences between, online and batch learning in relation to how machine learning models in production learn incrementally from a stream of incoming data. It is one of the most important aspects of designing machine learning systems. Data science architects need a good understanding of when to go for online learning and when to go for batch or offline learning. What is Batch Learning? Batch learning represents the training of machine learning models in a batch manner. The data accumulates over a period of time. The models then get trained with the accumulated data from time to …
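The contrast can be sketched with a toy one-parameter model, y = w * x. This is an illustrative example under stated assumptions, not the post's own code: the function names, the learning rate of 0.05, and the four data points are all made up for the sketch.

```python
def train_batch(samples):
    """Batch learning: fit w by least squares on the whole accumulated data set."""
    sxy = sum(x * y for x, y in samples)
    sxx = sum(x * x for x, _ in samples)
    return sxy / sxx


def update_online(w, x, y, lr=0.05):
    """Online learning: one gradient step on a single incoming sample."""
    return w + lr * (y - w * x) * x


# Toy data generated from y = 2 * x (assumption for illustration).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

# Batch: wait until all data is accumulated, then train once.
w_batch = train_batch(data)

# Online: update the model as each sample arrives, never storing the full data set.
w_online = 0.0
for x, y in data:
    w_online = update_online(w_online, x, y)
```

The batch fit recovers the slope exactly because it sees everything at once, while the online estimate moves toward it one sample at a time and can keep adapting as new data streams in.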

Scikit-learn vs Tensorflow – When to use What?

In this post, you will learn when to use Scikit-learn vs Tensorflow. For data scientists and machine learning enthusiasts, it is very important to understand the difference so that they can use these libraries appropriately while working on different business use cases. When to use Scikit-learn? Scikit-learn is a great entry point for beginner data scientists. It provides an efficient implementation of many machine learning algorithms. In addition, it is simple and easy to use. You can get started with Scikit-learn easily by using a Jupyter notebook. Scikit-learn can be used to solve different kinds of machine learning problems including some of the following: Classification (SVM, nearest neighbors, random …

Data Science Architect Interview Questions

In this post, you will learn about interview questions that can be asked if you are going for a data science architect job. A data science architect needs to have knowledge of both data science/machine learning and cloud architecture. In addition, it helps if the person is hands-on with programming languages such as Python & R. Without further ado, let’s get into some of the common questions right away. I will add further questions in the time to come. Q. How do you go about architecting a data science or machine learning solution for any business problem? Solving a business problem using a data science or machine learning based solution can …

Drivetrain Approach for Machine Learning

In this post, you will learn about a very popular methodology called the Drivetrain approach, coined by Jeremy Howard. The approach gives you a process for designing data products that deliver actionable outcomes while using one or more machine learning models. The approach is useful for data scientists and machine learning enthusiasts at all levels. In particular, it should prove to be a great guide for data science architects, whose key responsibilities include designing data products. Without further ado, let’s do a deep dive. Why drivetrain approach? Before getting into the drivetrain approach and its basic concepts, let's understand why drivetrain approach in the first …

Machine Learning – Training, Validation & Test Data Set

In this post, you will learn about the concepts of training, validation, and test data sets used for training machine learning models. The post is most suitable for data science beginners or those who would like a clear understanding of training, validation, and test data sets. The following topics will be covered: data split into training, validation, and test data sets; different model performance based on different data splits. Data Splits – Training, Validation & Test Data Sets You can split data into the following sets, and each data split configuration will result in machine learning models with different performance: Training data set: When you …
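A three-way split can be sketched in a few lines of standard-library Python. The helper name `split_dataset` and the 60/20/20 proportions are assumptions made for this illustration; the original post may use different tooling and ratios.

```python
import random


def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle items and cut them into training, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # whatever remains is the test set
    return train, val, test


# 100 placeholder records, split 60 / 20 / 20.
train, val, test = split_dataset(range(100))
```

The model is fitted on `train`, hyperparameters are compared on `val`, and `test` is held back for a final, unbiased performance estimate.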

Why use Random Seed in Machine Learning?

In this post, you will learn why and when we use random seed values while training machine learning models. This is a question most likely asked by beginner data scientists and machine learning enthusiasts. We use a random seed value while creating training and test data sets. The goal is to make sure we get the same training and validation data sets while we try different hyperparameters or machine learning algorithms, so that the performance of different models can be compared fairly. This is where the random seed value comes into the picture. Different Python libraries, such as scikit-learn, have different ways of assigning random seeds. While training machine learning models using Scikit-learn, …
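The effect of a seed can be shown with Python's built-in `random` module. This is a minimal sketch: the helper name `shuffled_indices` and the seed value 42 are arbitrary choices for illustration. In scikit-learn, the analogous control is the `random_state` argument accepted by functions such as `train_test_split`.

```python
import random


def shuffled_indices(n, seed):
    """Return the indices 0..n-1 shuffled by a generator seeded with `seed`."""
    rng = random.Random(seed)    # independent generator with an explicit seed
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


# Two runs with the same seed produce the same ordering,
# so any train/test split derived from it is identical across experiments.
run_1 = shuffled_indices(10, seed=42)
run_2 = shuffled_indices(10, seed=42)
same_split = run_1 == run_2
```

Because both runs shuffle identically, any difference in model performance between experiments can be attributed to the hyperparameters or algorithm rather than to a different data split.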

