Tag Archives: Data Science

Kruskal Wallis H Test Formula, Python Example

Ever wondered how to find out if different groups of people have different preferences? Maybe you’re a marketer trying to understand if different age groups prefer different features in a smartphone. Or perhaps you’re a public policy researcher trying to determine if different neighborhoods are equally satisfied with their local services. How do you go about answering these questions, especially when the data doesn’t follow the typical bell-shaped curve of a normal distribution? The solution lies in the Kruskal-Wallis H test! This non-parametric test compares more than two independent groups, and it comes in really handy when the data is not bell-shaped or not …

Posted in Data Science, Python, statistics.
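As a rough illustration of the Kruskal-Wallis H test teased above, the sketch below computes the H statistic using only the standard library. The function names `average_ranks` and `kruskal_wallis_h` and the sample scores are hypothetical and not taken from the post. The sketch also leaves out the tie correction and the p-value that a full implementation (for example `scipy.stats.kruskal`) would provide.

```python
def average_ranks(values):
    """Rank values from 1..N; tied values share the mean of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def kruskal_wallis_h(*groups):
    """Return the H statistic (no tie correction) for two or more groups."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Accumulate (rank sum)^2 / group size, walking through the pooled ranks.
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)

    return 12 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)


# Hypothetical satisfaction scores for three age groups (illustrative only).
young = [1, 2, 3]
middle = [4, 5, 6]
senior = [7, 8, 9]
print(round(kruskal_wallis_h(young, middle, senior), 3))  # 7.2
```

A larger H means the groups' rank sums differ more. Under the null hypothesis, H is approximately chi-square distributed with k - 1 degrees of freedom for k groups.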

Weighted Regression Model Python Examples

Have you ever wondered how regression models can be enhanced to provide more accurate predictions, even in the presence of outliers or data points with varying significance? Enter weighted regression models, an approach that assigns a weight to each data point so that more reliable or more important points have a larger influence on the fit. In this blog post, we will learn the concepts of weighted regression models through examples and a Python implementation. Traditional linear regression is a widely used technique, but it may struggle when faced with outliers or situations where some data points carry more weight than others. However, weighted regression models help overcome these …

Posted in Data Science, Machine Learning, Python.
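To make the idea of weighting data points concrete, here is a minimal sketch of weighted least squares for a single predictor, written with the standard library only. The function name `weighted_linear_fit`, the data, and the weights are assumptions made for illustration and do not come from the post.

```python
def weighted_linear_fit(x, y, weights):
    """Fit y = intercept + slope * x by minimizing sum(w * residual**2)."""
    total_weight = sum(weights)
    x_mean = sum(w * xi for w, xi in zip(weights, x)) / total_weight
    y_mean = sum(w * yi for w, yi in zip(weights, y)) / total_weight
    s_xy = sum(w * (xi - x_mean) * (yi - y_mean)
               for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_mean) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical data: the first four points follow y = 2x + 1, the last is an outlier.
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 100]

equal_weights = weighted_linear_fit(x, y, [1, 1, 1, 1, 1])
outlier_ignored = weighted_linear_fit(x, y, [1, 1, 1, 1, 0])

print(equal_weights)    # slope is pulled far above 2 by the outlier
print(outlier_ignored)  # (2.0, 1.0)
```

Setting a weight to zero removes a point entirely. In practice, weights are usually chosen between these extremes, for example inversely proportional to each point's error variance.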

Clinical Trials & Statistics Use Cases: Examples

Are you a statistician, data scientist, or business analyst working in the field of clinical trials? Are you curious about how statistical analyses play a pivotal role in unlocking actionable insights and driving critical decisions in drug development? If so, in this blog, we will learn about various use cases where clinical trials and statistics intersect. Clinical trials are the backbone of evidence-based medicine, paving the way for the discovery and development of innovative therapies that can improve patient outcomes. Within this realm, statistics allows researchers and analysts to make sense of complex data, evaluate treatment efficacy, assess safety profiles, and optimize trial design. In this …

Posted in Clinical Trials, Data Science, Drug Discovery, Pharma, statistics.

Spearman Correlation Coefficient: Formula, Examples

Have you ever wondered how you might determine the relationship between two sets of data that aren’t necessarily linear, or perhaps don’t adhere to the assumptions of other correlation measures? Enter the Spearman rank correlation coefficient, a non-parametric statistic that offers robust insights into the monotonic relationship between two variables, which makes it well suited to ranked variables or to exploring potential relationships in a new dataset. In this blog post, we will learn the concepts of the Spearman correlation coefficient with the help of Python code examples. Understanding the concept can prove to be very helpful for data scientists. Whether you’re exploring associations in marketing data, results from a customer satisfaction …

Posted in Data Science, Python, statistics.
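The Spearman coefficient described above is the Pearson correlation computed on ranks instead of raw values. The following sketch shows that idea with the standard library only. The helper names (`average_ranks`, `pearson`, `spearman_rho`) and the data are hypothetical and not taken from the post.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..N; tied values share the mean of their positions."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y grows with x, but not linearly (y = x squared).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman_rho(x, y))  # 1.0
```

Because only the ordering matters, any strictly increasing relationship gives a coefficient of 1, even when the relationship is far from a straight line.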

Heteroskedasticity in Regression Models: Examples

Have you ever encountered data that exhibits varying patterns of dispersion and wondered how it might impact your regression models? This varying dispersion is the essence of heteroskedasticity: the phenomenon where the spread of the residuals (errors) in a regression model changes across different values of the independent variables. As data scientists, understanding the concept of heteroskedasticity is crucial for robust and accurate analyses. In this blog, we delve into heteroskedasticity in regression models and explore its implications through real-world examples. What’s heteroskedasticity and why learn this concept? Heteroskedasticity refers to a statistical phenomenon observed in regression analysis, …

Posted in Data Science, Machine Learning, Python.

Loan Eligibility / Approval & Machine Learning: Examples

It is no secret that the loan industry is a multi-billion dollar industry. Lenders make money by charging interest on loans, and borrowers want to get the best loan terms possible. In order to qualify for a loan, borrowers are typically required to provide information about their income, assets, and credit score. This process can be time-consuming and frustrating for both lenders and borrowers. In this blog post, we will discuss how AI / machine learning can be used to predict loan eligibility. As data scientists, it is of great importance to understand some of the challenges in relation to loan eligibility and how machine learning models can be built …

Posted in AI, Banking, Finance, Machine Learning.

Credit Risk Modeling & Machine Learning Use Cases

Have you ever wondered how banks and financial institutions decide who to lend money to, or how much to lend? The secret lies in credit risk modeling, an approach that evaluates the likelihood of a borrower defaulting on their loan. Through in-depth analysis of historical data and borrowers’ credit behavior, these models play a pivotal role in guiding lending decisions, managing risks, and ultimately, driving profitability. In the face of growing financial complexities, traditional methods are often insufficient. That’s where machine learning comes into play, helping to better anticipate credit risk. By automating the identification of patterns within data, patterns that often go unnoticed by human analysis, machine learning …

Posted in Data Science, Machine Learning.

Matplotlib Bar Chart Python / Pandas Examples

Are you looking to learn how to create bar charts (also called bar plots or bar graphs) using the combination of Matplotlib and Pandas in Python? Bar charts are one of the most commonly used visualizations in data analysis, enabling us to present categorical data in a visually appealing and intuitive manner. Whether you’re a beginner data scientist or an intermediate-level practitioner seeking to enhance your visualization skills, this blog will provide you with practical examples and hands-on guidance to create compelling bar charts using the Matplotlib library in Python. You will also learn how to leverage the data manipulation capabilities of Pandas to prepare the data for visualization, …

Posted in Data Science, Python.

One-hot Encoding Concepts & Python Examples

Have you ever encountered categorical variables in your data analysis or machine learning projects? These variables represent discrete qualities or characteristics, such as colors, genders, or types of products. While numerical variables can be directly used as inputs for machine learning algorithms, categorical variables require a different approach. One common technique used to convert categorical variables into a numerical representation is called one-hot encoding, which is closely related to dummy encoding. When working with machine learning algorithms, categorical variables need to be transformed into a numerical representation to be effectively used as inputs. This is where one-hot encoding comes to the rescue. In this post, you will learn about one-hot encoding concepts and …

Posted in Data Science, Machine Learning, Python.
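As a small illustration of the conversion described above, the sketch below one-hot encodes a list of category labels in plain Python. The function name `one_hot_encode` and the color values are hypothetical. Real projects would more likely use a library encoder, which this sketch does not try to reproduce.

```python
def one_hot_encode(values):
    """Return the sorted categories and one 0/1 row per input value."""
    categories = sorted(set(values))
    column_of = {category: i for i, category in enumerate(categories)}

    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[column_of[value]] = 1  # exactly one "hot" column per row
        encoded.append(row)
    return categories, encoded


# Hypothetical categorical feature (illustrative only).
colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Each category gets its own column, so no artificial ordering is imposed on the labels. Dummy encoding differs in that it drops one of the columns to avoid redundancy.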

Ridge Regression Concepts & Python example

Ridge regression is a type of linear regression that penalizes large coefficients. This technique can be used to reduce the effects of multicollinearity in linear regression, which results from high correlations among the predictor (independent) variables. In this tutorial, we will explain ridge regression with a Python example. What is Ridge Regression? Ridge regression is a powerful technique in machine learning that addresses the issue of overfitting in linear models. In linear regression, we aim to model the relationship between a response variable and one or more predictor variables. However, when there are multiple variables that are highly correlated, the model can become too complex and …

Posted in Data Science, Machine Learning, Python.
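To show how the penalty shrinks a coefficient, here is a minimal sketch of ridge regression for a single predictor, using the closed-form solution with an unpenalized intercept. The function name `ridge_fit_1d`, the penalty symbol `alpha`, and the data are assumptions made for illustration, and the post's own example may differ.

```python
def ridge_fit_1d(x, y, alpha):
    """Minimize sum((y - b0 - b1*x)**2) + alpha * b1**2 for one predictor."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)

    # The penalty adds alpha to the denominator, shrinking the slope toward 0.
    slope = s_xy / (s_xx + alpha)
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical data lying exactly on y = 2x.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
print(ridge_fit_1d(x, y, alpha=0))  # (2.0, 0.0), ordinary least squares
print(ridge_fit_1d(x, y, alpha=5))  # (1.0, 2.5), slope shrunk by the penalty
```

With `alpha=0` the result matches ordinary least squares. Larger values of `alpha` pull the slope further toward zero.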

Machine Learning NPTEL Online Courses List 2023

Machine learning is a rapidly evolving field that has gained immense popularity in recent years. As technology continues to advance, the demand for professionals with expertise in machine learning continues to soar. If you’re interested in diving deep into the world of machine learning or looking to enhance your existing knowledge, the NPTEL courses are an excellent avenue to explore. The National Programme on Technology Enhanced Learning (NPTEL) is a joint initiative by the Indian Institutes of Technology (IITs) and the Indian Institute of Science (IISc). It offers a wide range of online courses across various disciplines, including computer science and engineering. In this blog, we will …

Posted in AI, Career Planning, Data Science, Machine Learning, Online Courses.

Online US Degree Courses & Programs in AI / Machine Learning

Data science and AI / machine learning have emerged as transformative fields, revolutionizing industries and shaping the future of technology. As the demand for professionals skilled in machine learning continues to rise, top universities in the United States have recognized the need to offer online degree courses and programs in this dynamic field. Through these online offerings, students can now access world-class education and earn prestigious degrees from the comfort of their own homes, while benefiting from the expertise of renowned faculty members. In this blog post, we present a curated list of leading US universities that provide online degree courses and programs in machine learning. Whether you …

Posted in Career Planning, Online Courses.

Recommender Systems in Machine Learning: Examples

Recommender systems are used in machine learning to predict the ratings or preferences a given user would assign to items. They are commonly used in e-commerce applications to suggest items that a user may be interested in. One common example of a recommender system is Netflix, which uses one to suggest movies and TV shows that a user may want to watch. The algorithm looks at past ratings and preferences to make suggestions. In this blog post, you will learn about recommender systems and some of the different types of recommender systems with the help of examples. Recommender systems make use of machine learning to predict the ratings or …

Posted in Data Science, Machine Learning.

Binomial Distribution Explained with Examples

Have you ever wondered how to predict the number of successes in a series of independent trials? Or perhaps you’ve been curious about the probability of achieving a specific outcome in a sequence of yes-or-no questions. If so, we are essentially talking about the binomial distribution. It’s important for data scientists to understand this concept, as binomial distributions are used often in business applications. The binomial distribution is a discrete probability distribution that applies to binomial experiments (experiments with binary outcomes). It describes the number of successes in a fixed number of trials. Citing a simple yet real-life example, the binomial distribution may be imagined as the probability distribution of a number …

Posted in AI, Data Science, Machine Learning, statistics.
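The number of successes in a fixed number of independent trials can be sketched directly from the binomial probability mass function, P(X = k) = C(n, k) p^k (1 - p)^(n - k). The function name `binomial_pmf` and the coin-toss setup below are illustrative assumptions and are not taken from the post.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Hypothetical experiment: number of heads in 4 tosses of a fair coin.
probabilities = [binomial_pmf(k, 4, 0.5) for k in range(5)]
print(probabilities)  # [0.0625, 0.25, 0.375, 0.25, 0.0625]
```

The probabilities over all possible values of k sum to 1, and for a fair coin the distribution is symmetric around n / 2.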

Difference between Data Science & Data Analytics

What’s the difference between data science and data analytics? Many people use these terms interchangeably, but there is a big distinction between the two fields. Data science is more focused on understanding and deriving insights from data by leveraging statistical and machine learning methods, while data analytics is an overarching term for solving problems with analytical techniques applied to data. The two terms are related in a way. In this blog post, we’ll explore the differences between data science and data analytics in greater detail, with examples of each. The following are key topics in relation to the difference between data science and data analytics: Different forms/purposes Different techniques …

Posted in Data analytics, Data Science.

Hold-out Method for Training Machine Learning Models

The hold-out method for training machine learning models is a technique that involves splitting the data into different sets: one set for training, and other sets for validation and testing. The hold-out method is used to check how well a machine learning model will perform on new, unseen data. In this post, you will learn about the hold-out method used during the process of training a machine learning model. Do check out my post on “What is machine learning? Concepts & examples” for a detailed understanding of different aspects related to the basics of machine learning. Also, check out a related post on “What is data science?” When evaluating …

Posted in Data Science, Machine Learning.
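The split into training, validation, and test sets described above can be sketched with the standard library as follows. The function name `hold_out_split`, the 70/15/15 proportions, and the fixed seed are assumptions chosen for illustration. The post itself may use different proportions or a library helper.

```python
import random


def hold_out_split(records, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle records and split them into train, validation, and test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held out for testing
    return train, validation, test


# Hypothetical dataset of 100 record ids.
train, validation, test = hold_out_split(range(100))
print(len(train), len(validation), len(test))  # 70 15 15
```

The model is fit on the training set, tuned against the validation set, and evaluated once on the test set, which it never saw during training or tuning.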