Author Archives: Ajitesh Kumar
Lime Machine Learning Python Example
Now that core businesses have started relying on machine learning (ML) model predictions, interpreting complex models has become a necessary requirement of AI governance (responsible AI). Data scientists are often asked to explain the inner workings of machine learning models so that others can understand how decisions are made. The problem? Many of these models are “black boxes”, delivering predictions without any comprehensible reasoning. This lack of transparency (especially in healthcare and finance use cases) can lead to mistrust in model predictions and inhibit the practical application of machine learning in fields that require a high degree of interpretability. It could lead to erroneous decision-making, or worse, legal and …
Boston Housing Dataset Linear Regression: Predicting House Prices
Predicting house prices accurately is crucial in the real estate industry. However, it can be challenging to determine the factors that significantly impact house prices. Without a clear understanding of these factors, accurate predictions are difficult to achieve. The Boston Housing Dataset addresses this problem by providing a comprehensive set of variables that influence house prices in the Boston area. However, effectively utilizing this dataset and building robust predictive models require appropriate techniques and evaluation methods. In this blog, we will provide an overview of the Boston Housing Dataset and explore linear regression, LASSO, and Ridge regression as potential models for predicting house prices. Each model has its unique properties …
ChatGPT Cheat Sheet for Data Scientists
With the explosion of data being generated, data scientists are facing increased pressure to analyze and interpret large amounts of text data effectively. However, this can be a challenging task, especially when dealing with unstructured data. Additionally, data scientists often spend a significant amount of time manually generating text and answering complex questions. Welcome ChatGPT! ChatGPT offers a powerful solution to these challenges. By learning different ChatGPT prompts, data scientists can become far more productive at generating relevant insights, answering complex questions, and performing machine learning tasks such as data preprocessing, hypothesis testing, training models, etc. In this blog, I will provide …
How does Dall-E 2 Work? Concepts, Examples
Have you ever wondered how generative AI converts words into images? Or how generative AI models create a picture of something you’ve only described in words? Creating high-quality images from textual descriptions has long been a challenge for artificial intelligence (AI) researchers. That’s where DALL-E and DALL-E 2 come in. In this blog, we will look into the details of DALL-E 2. Developed by OpenAI, DALL-E 2 is a cutting-edge AI model that can generate highly realistic images from textual descriptions. So how does DALL-E 2 work, and what makes it so special? In this blog post, we’ll explore the key concepts and techniques behind DALL-E 2, including …
Facebook Responsible AI: Lessons, Examples
As technology continues to advance, it’s important that we prioritize ethical considerations and ensure that the development and deployment of AI technologies are responsible and fair. Meta (formerly known as Facebook) recognizes the importance of responsible AI and has taken several steps to ensure that its AI systems are developed and deployed in an ethical and fair manner. In this blog post, we’ll be exploring the latest responsible AI updates from Meta, which every company should take into consideration when developing and implementing their own AI strategies and systems. I will keep the blog short and crisp. If you want more detail, visit this page. Use Varied Datasets & Robust …
Python Tesseract PDF & OCR Example
Have you ever needed to extract text from an image or a PDF file? If so, you’re in luck! Python can work with Tesseract, an engine that performs Optical Character Recognition (OCR) to extract text from images and PDFs. In this blog, I will share sample Python code that you can use with Tesseract to extract text from images and PDFs. As a data scientist, it can be very helpful to be able to extract text from images or PDFs, especially when working with large amounts of data found in receipts, invoices, etc. Tesseract is an OCR engine widely used in the industry, known for its accuracy …
ChatGPT Prompt to get Datasets for Machine Learning
As the field of machine learning continues to expand, having access to high-quality datasets has become increasingly important. Datasets are the foundation of any machine learning project and play a crucial role in determining the accuracy and effectiveness of the resulting model. In this blog post, we will learn about a template ChatGPT prompt that can be used to gather a variety of datasets for different types of machine learning tasks. As data scientists, it is recommended that we use a systematic approach to identify and select the right dataset for our machine learning project. This involves considering the specific requirements of our project, such as the …
Gaussian Mixture Models: What are they & when to use?
In machine learning and data analysis, it is often necessary to identify patterns and clusters within large sets of data. However, traditional clustering algorithms such as k-means clustering have limitations when it comes to identifying clusters with different shapes and sizes. This is where Gaussian mixture models (GMMs) come in. But what exactly are GMMs and when should you use them? Gaussian mixture models (GMMs) are a type of machine learning algorithm. They are used to group data into clusters based on probability distributions. Gaussian mixture models can be used in many different areas, including finance, marketing and so much more! In this blog, an introduction to Gaussian …
Python: Convert JSON to CSV Example
Have you ever wondered how to convert JSON data to CSV using Python? JSON (JavaScript Object Notation) is a popular data format used to exchange data between servers and web applications. However, sometimes it’s necessary to convert this data into another format, such as CSV (Comma Separated Values). CSV is a simple text format that is commonly used to store and exchange tabular data. In this blog post, sample Python code is provided for converting JSON to CSV. The code uses the json and csv modules to read and write data. But before going forward with the code, let’s take a look …
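As a rough illustration of the json-and-csv approach the excerpt above mentions, here is a minimal sketch. It is not the code from the post itself: the function name `json_to_csv` and the sample records are hypothetical, and it assumes the JSON input is an array of flat (non-nested) objects.

```python
import csv
import io
import json


def json_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text (hypothetical helper)."""
    records = json.loads(json_text)

    # Collect column names in first-seen order so every key gets a column.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    # Write to an in-memory buffer; a real script would open a file instead.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record become empty cells
    return buffer.getvalue()


# Made-up sample data, for illustration only.
sample = '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25, "city": "Pune"}]'
csv_text = json_to_csv(sample)
print(csv_text)
```

Nested JSON objects would need to be flattened before this approach works, since `csv.DictWriter` writes one cell per top-level key.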
Seaborn: Multiple Line Plots with Markers, Legend
Do you want to learn how to create visually stunning and informative line plots that will captivate your audience by providing the most apt information? Do you need to create multiple line plots in the same figure representing sales of different products across different months in a year? Are you looking for ready-to-use Python code with the Seaborn library for creating line plots? If yes, you are in the right place. In this blog post, we’ll explore how to create multiple line plots with Seaborn, a powerful data visualization library built on top of Matplotlib. I will also show how to add markers to the line plots to make …
ChatGPT for Data Science Projects – Examples
Data science is all about turning raw data into actionable insights and outcomes that drive value for your organization. But as any data science professional knows, coming up with new, innovative ideas for your projects is only half the battle. The real challenge is finding a way to turn those ideas into results that can be used to drive business success by doing proper data analysis and building machine learning models using the most appropriate algorithms. Unfortunately, many data science professionals struggle with this second step, which can lead to frustration, wasted time and resources, and missed opportunities. That’s where ChatGPT comes in. As a language model trained by OpenAI, ChatGPT …
Hypothesis Testing in Business: Examples
Are you a product manager or data scientist looking for ways to identify and use the most appropriate hypothesis tests for understanding business problems and creating solutions for data-driven decision making? Hypothesis testing is a powerful statistical technique that can help you understand problems during exploratory data analysis (EDA) and identify the most appropriate hypotheses and analytical solutions. In this blog, we will discuss hypothesis testing with examples from business. We’ll also give you tips on how to use it effectively in your own problem-solving journey. With this knowledge, you’ll be able to confidently create hypotheses, run experiments, and analyze the results to derive meaningful conclusions. So let’s get started! Before going …
NLP: Huggingface Transformers Code Examples
Do you want to build cutting-edge NLP models? Have you heard of Huggingface Transformers? Huggingface Transformers is a popular open-source library for NLP, which provides pre-trained machine learning models and tools to build custom NLP models. These models are based on the Transformer architecture, which has revolutionized the field of NLP by enabling state-of-the-art performance on a range of NLP tasks. In this blog post, I will provide Python code examples for using Huggingface Transformers for various NLP tasks such as text classification (sentiment analysis), named entity recognition, question answering, text summarization, and text generation. I used Google Colab for testing my code. Before getting started, get set up with transformers …
Sklearn Algorithms Cheat Sheet with Examples
The Sklearn library, short for Scikit-learn, is one of the most popular and widely used libraries for machine learning in Python. It offers a comprehensive set of tools for data analysis, preprocessing, model selection, and evaluation. As a beginner data scientist, it can be overwhelming to navigate the various algorithms and functions within Sklearn. This is where the Sklearn Algorithms Cheat Sheet comes in handy. This cheat sheet provides a quick reference guide for beginners to easily understand and select the appropriate algorithm for their specific task. In this cheat sheet, I have compiled a list of common supervised and unsupervised learning algorithms, along with their Sklearn classes and example use …
Andrew Ng & OpenAI ChatGPT Prompt Engineering Course
Renowned artificial intelligence (AI) experts, Andrew Ng from DeepLearning.ai and Isa Fulford from OpenAI, have teamed up to offer an exciting new course on prompt engineering, titled “ChatGPT Prompt Engineering for Developers”. The course, which is completely free, aims to help developers better understand prompt design and implementation for various use cases. The ChatGPT Prompt Engineering course is specifically tailored for developers, including data scientists, who wish to learn more about designing prompts for different tasks including software development (coding), marketing, creating product reviews and descriptions, writing essays, summarizing text, etc. It includes several important topics such as summarizing, inferring, transforming, expanding and chatbot building. These skills are essential …