# Category Archives: Data Science

## ML Engineer vs Data Scientist: Differences & Similarities

In today’s world, ML (machine learning) engineer and data scientist are two popular job positions. These positions overlap considerably, but there are also some key differences to be aware of. In this blog post, we will go over the details of ML engineers vs data scientists so you can decide which one is right for you! What does an ML engineer do? An ML engineer primarily designs and develops machine learning systems. Before getting into the roles & responsibilities of an ML engineer, let’s understand what a machine learning system is. A machine learning system can be defined as a system that comprises one or more …

## Week of Nov 1, 2021: Top 3 Machine Learning Tutorial Videos

Machine learning is a vast field, and it can be hard to know where to start. In this blog post, we’ll cover the top three free machine learning tutorial videos published on YouTube this week (week of Nov 1, 2021). These videos will help you get started with the basics of machine learning & deep learning, introduce you to some popular algorithms in use today, and give you an idea of what’s possible when building a model from scratch. Build a Machine Learning Project From Scratch with Python and Scikit-learn Let’s say you want to build a machine learning project from scratch. Maybe you’re not sure …

## Support Vector Machine (SVM) Interview Questions

Support Vector Machine (SVM) is a machine learning algorithm that can be used to classify data. SVM does this by maximizing the margin between two classes, where “margin” refers to the distance between the decision boundary and the closest training points of each class (the support vectors). SVM has been applied in many areas of computer science and beyond, including medical diagnosis software for tuberculosis detection, fraud detection systems, and more. This blog post consists of a quiz with questions and answers on SVM. This is a practice test (objective questions and answers) that can be useful when preparing for interviews. The questions in this and upcoming practice tests could prove to be useful, primarily, for data scientists or machine learning interns/ …
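To make the idea of a margin concrete, here is a minimal sketch for a linear SVM, assuming the usual formulation in which the decision boundary is `w·x + b = 0` and the support vectors lie on `w·x + b = ±1`. The weights below are hand-picked for illustration and are not learned from any data; the function names are hypothetical.

```python
import math

def distance_to_hyperplane(w, b, x):
    """Geometric distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin_width(w):
    """Distance between the two margin boundaries w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical weights and bias; a real SVM would learn these from training data.
w, b = [3.0, 4.0], -5.0

# The point (2, 0) satisfies w.x + b = 1, so it sits on the margin boundary.
print(margin_width(w))
print(distance_to_hyperplane(w, b, [2.0, 0.0]))
```

Under these assumptions, a point on the margin boundary lies at half the margin width from the decision boundary, which is why maximizing the margin amounts to minimizing the norm of `w`.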

## Machine Learning Examples from Daily Life

Machine learning is a powerful machine intelligence technique that can be used in a variety of settings to generate data insights. In this blog post, we will explore real-world machine learning / deep learning / AI examples from daily life. We’ll see how machine learning techniques have been successfully applied to solve real-life problems. The idea is to make you aware of how machine learning and data science applications are everywhere. What are some real-world examples of machine learning from daily life? Here are some real-world examples of machine learning that we use in our daily life: Best driving directions (Google Maps): A bunch of machine learning / deep …

## 8-Month Data Science Program from IIT Madras

Are you looking for a new job or to build your career in the field of data science? Data science is one of the most sought-after skills and career paths in India right now. It’s not just about crunching numbers anymore – it’s an exciting, dynamic field that requires creativity and critical thinking. With data science, you can solve problems and make a real impact on society. And with IIT Madras offering diplomas in data science, there has never been a better time to get started! IIT Madras has launched its diploma program in Data Science for college students, working professionals and job seekers who aim to …

## Stock Price Prediction using Machine Learning Techniques

In the past few decades, many advances have been made in the field of data analytics. Researchers can now use analytical predictive models to try to predict stock prices more accurately. These predictive techniques utilize data from previous stock price movements and look for patterns that could indicate future stock price changes in the market. The use of these machine learning techniques can help investors make better-informed decisions, with the aim of maximizing returns and minimizing losses. In this blog post, you will learn about some of the popular machine learning techniques in relation to making stock price movement (direction of stock price) predictions and …

## Difference between Parametric vs Non-Parametric Models

Machine learning models can be parametric or non-parametric. Parametric models assume a fixed functional form with a fixed number of parameters that are estimated from the data, while non-parametric models make no such assumption about the form of the function, so their complexity can grow with the amount of training data. This makes non-parametric models more flexible, though not necessarily more accurate. This blog post discusses parametric vs non-parametric machine learning models with examples along with the key differences. What are parametric and non-parametric models? Training machine learning models is about finding a function approximation built using input or predictor variables, and whose output represents the response variable. It is called function approximation because there is always an error in relation to the …
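The contrast can be sketched with two tiny regressors written in plain Python. This is an illustrative sketch, not code from the post: `fit_line` and `knn_predict` are hypothetical helper names, and the data is made up. The straight-line fit is parametric because it always boils the data down to two numbers (slope and intercept); the k-nearest-neighbours predictor is non-parametric because it keeps the whole training set around and consults it at prediction time.

```python
def fit_line(xs, ys):
    """Parametric: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def knn_predict(xs, ys, x_new, k=2):
    """Non-parametric: average the targets of the k closest training points."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x_new))[:k]
    return sum(y for _, y in nearest) / k

# Made-up training data for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

slope, intercept = fit_line(xs, ys)       # two parameters, whatever the data size
print(slope * 2.4 + intercept)            # prediction from the fitted line
print(knn_predict(xs, ys, 2.4, k=2))      # prediction from stored training data
```

Once the line is fitted, the training data can be thrown away; the k-NN predictor cannot do that, which is the practical meaning of "complexity grows with the data".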

## Overfitting & Underfitting Concepts & Interview Questions

Machine learning models are built to learn from training data and make predictions on new, unseen data. A machine learning model is said to overfit when it learns patterns that exist only in the training set: it predicts the training data with high accuracy but performs poorly on unseen data. On the other hand, a machine learning model underfits if it fails to capture the underlying relationship between variables and so performs poorly on both the training and test data sets. In this post, you will learn about some of the key concepts of overfitting and underfitting in relation to machine learning models. In addition, you will also get a chance to test your understanding by attempting the quiz. The …
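A common way to spot the two failure modes is to compare the error on the training set with the error on held-out data. The sketch below uses made-up toy data and two deliberately extreme, hypothetical models: a memorizer (1-nearest-neighbour lookup) that reproduces the training labels exactly, and a constant predictor that ignores the input. It is only meant to show the diagnostic pattern, not a realistic workflow.

```python
def mse(model, data):
    """Mean squared error of a prediction function over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Hypothetical toy data: y is roughly x, with a little noise in the training set.
train = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
test = [(1.4, 1.4), (2.6, 2.6), (3.4, 3.4)]

def memorizer(x):
    """1-nearest-neighbour lookup: reproduces the training labels, noise included."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

mean_y = sum(y for _, y in train) / len(train)

def constant(x):
    """Always predicts the training mean, ignoring x entirely."""
    return mean_y

for name, model in [("memorizer", memorizer), ("constant", constant)]:
    print(name,
          "train MSE:", round(mse(model, train), 3),
          "test MSE:", round(mse(model, test), 3))
```

The memorizer has zero training error but a non-zero test error, which is the gap associated with overfitting; the constant predictor has a high error on both sets, which is the signature of underfitting.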

## Data Readiness Levels Assessment: Concepts

Data readiness levels (DRLs) and related assessments are an important part of data analytics. The concept describes a series of stages that represent the quality and maturity of data. Data science is becoming increasingly popular, but not all companies have the right level of data readiness for this type of work. Performing a data readiness level assessment is important because it gives an insight into the quality and quantity of your current datasets and helps determine the future success of the data analytics project. This blog post will explain what data readiness levels are and why assessment tests are important in relation to them. What are data readiness levels? Data readiness …

## Data Science / AI Team Structure – Roles & Responsibilities

Setting up a successful artificial intelligence (AI) / data science or advanced analytics practice or center of excellence (CoE) is key to the success of AI in your organization. To set up a successful data science CoE, you need a well-organized data science team with clearly defined roles & responsibilities. Are you planning to set up an AI or data science team in your organization and looking for ideas around data science team structure and related roles and responsibilities? In this post, you will learn about some of the following aspects related to building a data science/machine learning team. Focus areas Roles & responsibilities Data Science Team – Focus …

## Clinical Trials & Predictive Analytics Use Cases

Analytics plays a big role in modeling clinical trials, and predictive analytics is one such technique that has been embraced by clinical researchers. Machine learning algorithms can be applied at various stages in the drug discovery process – from early compound selection to clinical trial simulation. Data scientists have been applying machine learning algorithms to clinical trial data in order to identify predictive patterns and correlations between clinical outcomes, patient demographics, drug response phenotypes, medical history, and genetic information. Predictive analytics has the potential to enhance clinical research by helping accelerate clinical trials through predictive modeling of clinical outcome probability for better treatment decisions with reduced clinical trial costs. In …

## Local & Global Minima Explained with Examples

Optimization problems containing many local minima remain a critical problem in a variety of domains, including operations research, informatics, and material design. Efficient global optimization remains a problem of general research interest, with applications to a range of fields including operations design, network analysis, and bioinformatics. Within the fields of chemical physics and material design, efficient global optimization is particularly important for finding low potential energy configurations of isolated groups of atoms (clusters) and periodic systems (crystals). In the case of machine learning (ML) algorithms, there is a need for optimising (minimising) the cost or loss function. In order to become very good at finding solutions to optimisation problems (relating to minimising …
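A small sketch can show why local minima matter when minimising a loss with gradient descent. The function `f(x) = x^4 - 3x^2 + x` is chosen here purely for illustration (it is not from the post): it has a global minimum near `x ≈ -1.30` and a shallower local minimum near `x ≈ 1.13`. The learning rate and step count are arbitrary assumptions.

```python
def f(x):
    """Illustrative non-convex function with two minima."""
    return x**4 - 3 * x**2 + x

def grad_f(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, learning_rate=0.01, steps=1000):
    """Plain gradient descent from starting point x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad_f(x)
    return x

# The minimum that gets found depends on where the search starts.
x_right = gradient_descent(2.0)    # settles in the local minimum
x_left = gradient_descent(-2.0)    # settles in the global minimum
print(x_right, f(x_right))
print(x_left, f(x_left))
```

Both runs converge to a point where the gradient is zero, but only the run started on the left reaches the lower value of `f`. Plain gradient descent has no way of knowing that a better minimum exists elsewhere.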

## Binomial Distribution Explained with Examples

The binomial distribution is a probability distribution that applies to binomial experiments. It describes the number of successes in a fixed number of independent trials, each with the same probability of success. The binomial distribution may be imagined as the probability distribution of the number of heads that appear in an experiment comprising a fixed number of coin flips. In this blog post, we will learn binomial distribution with the help of examples. If you are an aspiring data scientist looking forward to learning/understanding the binomial distribution in a better manner, this post might be very helpful. What is a Binomial Distribution? The binomial distribution is a discrete probability distribution that represents the probabilities of binomial random …
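The coin-flip picture translates directly into the standard binomial probability mass function, P(X = k) = C(n, k) · p^k · (1 − p)^(n − k). Here is a minimal sketch using only the Python standard library (`math.comb` needs Python 3.8 or later); the function name `binomial_pmf` is just a label chosen for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 5 heads in 10 flips of a fair coin.
print(binomial_pmf(5, 10, 0.5))  # 252 / 1024 = 0.24609375

# The probabilities over all possible outcomes k = 0..n sum to 1
# (up to floating-point rounding).
print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))
```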

## Python – Replace Missing Values with Mean, Median & Mode

Missing values are common in real-world problems when the data is aggregated over long time stretches from disparate sources, and reliable machine learning modeling demands careful handling of missing data. One strategy is imputing the missing values, and a wide variety of algorithms exist, spanning simple statistics (mean, median, mode), matrix factorization methods like SVD, statistical models like Kalman filters, and deep learning methods. Missing value imputation or replacing techniques help machine learning models learn from incomplete data. There are three main missing value imputation techniques – mean, median and mode. Mean is the average of all values in a set, median is the middle number in …
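As a rough illustration of the three simple strategies, the sketch below fills missing entries (represented as `None`) using Python’s standard `statistics` module. It deliberately avoids any particular data-frame library; the `impute` helper and the sample values are hypothetical.

```python
import statistics

def impute(values, strategy="mean"):
    """Replace None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "mode":
        fill = statistics.mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]

# Made-up column with two missing entries.
ages = [25.0, None, 30.0, 25.0, None, 40.0]

print(impute(ages, "mean"))    # missing entries become 30.0
print(impute(ages, "median"))  # missing entries become 27.5
print(impute(ages, "mode"))    # missing entries become 25.0
```

The three strategies give different fill values even on this tiny column, which is why the choice usually depends on the data: the median is less sensitive to outliers than the mean, and the mode is the natural choice for categorical columns.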

## Building Machine Learning Models & Dev Challenges

The machine learning and AI implementation industry is booming. The demand for machine learning models has never been higher, but the challenges of machine learning development and deployment have also increased. In this post, we will discuss a few common machine learning development and deployment challenges. In future posts, we will learn about solutions to overcome these challenges. This blog post will help you learn and understand some of the key challenges that you may face if you are planning to start a machine learning practice in your organization. These challenges are also very much relevant if you have machine learning engineers and data scientists working across different offices/locations on …

## Fixed vs Random vs Mixed Effects Models – Examples

Have you ever wondered what fixed effects, random effects and mixed effects models are? Or, more importantly, how they differ from one another? In this post, you will learn about the concepts of fixed and random effects models along with when to use fixed effects models and when to go for fixed + random effects (mixed) models. The concepts will be explained with examples. As a data scientist, you should have a good understanding of these concepts, as it will help you build better linear models such as general linear mixed models or generalized linear mixed models (GLMM). What are fixed, random & mixed effects models? First, we will take a real-world example and try to understand …