Author Archives: Ajitesh Kumar

Ajitesh Kumar

I have recently been working in the area of data analytics, including data science and machine learning / deep learning. I am also passionate about different technologies, including programming languages such as Java/JEE, JavaScript, Python, R, and Julia, and technologies such as blockchain, mobile computing, cloud-native technologies, application security, cloud computing platforms, and big data. I would love to connect with you on LinkedIn. Check out my latest book, First Principles Thinking: Building winning products using first principles thinking.

Key Architectural Components of a Data Lake

Data lakes are data storage systems that allow data to be stored, managed, and accessed in a way that is cost-effective and scalable. They can provide a significant competitive advantage for any organization by enabling data-driven decision-making, but they also come with challenges in architecture design. In this blog post, we will explore the different architectural components of a data lake. Before learning about these components, let's quickly recall what a data lake is. What is a data lake? A data lake is a data storage system that allows data to be stored, managed, and accessed in a way that is cost-effective and …

Posted in Architecture, Data analytics, Data lake.

14 Python AutoML Frameworks Data Scientists Can Use

In this post, you will learn about Automated Machine Learning (AutoML) frameworks for Python that you can use to train machine learning models. For data scientists, especially beginners, who are unfamiliar with AutoML: it is a set of tools designed to make the process of generating machine learning models automated, user-friendly, and less time-consuming. The goal of AutoML is not just to make life easier for machine learning (ML) developers but also to democratize access to model development. What is AutoML? AutoML refers to automating some or all steps of building machine learning models, including selection and configuration of training data, tuning the performance metric(s), selecting/constructing features, training multiple models, evaluating …

Posted in Data Science, Machine Learning, Python.

Data Analytics – Different Career Options / Opportunities

Data analytics spans a wide range of career options, from data scientist to data engineer. Data scientists are often interested in what they can do with the data that is analyzed, while data engineers are more focused on building the pipelines and infrastructure that make that data available. Whether you're looking for a career as a data scientist, data analyst, ML engineer, or AI researcher, there's something for everyone! In this blog post, we will look at the different types of jobs and careers available to those interested in data analytics and data science. What are some of the career paths in data analytics? Here are different career paths for those interested in a data analytics career: Data Scientists: …

Posted in AI, Career Planning, Data analytics, data engineering, Data Science, Machine Learning.

Using Theory of Change to Design Data-driven Solutions

Have you ever wanted to design a solution for an issue but weren't sure how to do it? One framework that can help is the theory of change. The theory of change provides a framework for designing solutions by focusing on the steps needed to achieve desired outcomes or results. It also helps identify what needs to happen for the solution to be implemented successfully and for the desired outcomes to be realized. The theory of change, when combined with data-driven decision making, can result in great impact. In order to design solutions that have an impact and are sustainable, it is important to understand the theory of change as well …

Posted in Data analytics, Data Science.

Top 50 Interview Questions for Beginner Data Scientists

What interview questions should a beginner data scientist prepare for? This is an important question that many interviewees have. If you are going for a data scientist interview and don't know what interview questions you will be asked, this blog post has some of the common interview questions that will help you excel in your interview. These interview questions are well suited to beginners because they cover basic topics in data science and machine learning and how they work. We hope this list helps! What is the difference between AI, machine learning, and deep learning? Do you know how machine learning works? How is machine learning different from statistical modeling techniques like linear …

Posted in Data Science, Interview questions, Machine Learning.

How to Create & Detect Deepfakes Using Deep Learning

Deepfakes are becoming a more common occurrence in today's world. What is a deepfake and how can you create one using deep learning? This blog post will help data scientists learn techniques for creating and detecting deepfakes, so they can stay ahead of this technology. A deepfake is a video or audio clip that alters reality by changing the way something appears. For example, someone could place your face onto someone else's body in a video to make it seem like you were there when you really weren't. There are many ways to detect whether a photo has been manipulated with software such as Photoshop or GIMP. What is a deepfake? …

Posted in Data Science, Deep Learning, Machine Learning.

50+ Machine learning & Deep learning YouTube Courses

In this post, you get access to a curated list of 50+ YouTube courses on machine learning, deep learning, NLP, optimization, computer vision, statistical learning, etc. You may want to bookmark this page for quick reference and access to these courses. This page will be updated from time to time. Enjoy learning!

Course title | Course type | URL
MIT 6.S192: Deep Learning for Art, Aesthetics, and Creativity | Deep learning | https://www.youtube.com/playlist?list=PLCpMvp7ftsnIbNwRnQJbDNRqO6qiN3EyH
AutoML – Automated Machine Learning | AutoML | https://ki-campus.org/courses/automl-luh2021
Probabilistic Machine Learning | Machine learning | https://www.youtube.com/playlist?list=PL05umP7R6ij1tHaOFY96m5uX3J21a6yNd
Geometric Deep Learning | Geometric deep learning | https://www.youtube.com/playlist?list=PLn2-dEmQeTfQ8YVuHBOvAhUlnIPYxkeu3
CS224W: Machine Learning with Graphs | Machine learning | https://www.youtube.com/playlist?list=PLoROMvodv4rPLKxIpqhjhPgdQy7imNkDn
MIT 6.S897 Machine Learning for Healthcare | Machine learning | https://www.youtube.com/playlist?list=PLUl4u3cNGP60B0PQXVQyGNdCyCTDU1Q5j
Deep Learning and Combinatorial Optimization | Deep …

Posted in Career Planning, Data Science, Deep Learning, Machine Learning, Tutorials.

Online AI News from Top Global Universities – List

In this post, you will get access to a list of web pages carrying the latest news related to artificial intelligence from top universities across the globe. This page will be updated from time to time to include new pages from different universities. These URLs will be very useful for machine learning / data science enthusiasts who want to keep tabs on current news and events in the field of artificial intelligence. MIT; Stanford; Stanford university – Human-centered AI (HAI); Stanford university – Center for AI in medicine and imaging; Stanford AI research and ideas; Harvard university; JHU Malone center for Engg. in healthcare; Yale university; Princeton university …

Posted in AI, Data Science.

MOSAIKS for creating Climate Change Models

In this post, you will learn about the framework MOSAIKS (Multi-Task Observation using Satellite Imagery & Kitchen Sinks), which can be used to create machine learning linear regression models for climate change. Here is a list of a few prediction use cases that have already been tested with MOSAIKS and found to have high model performance: forest cover, elevation, population density, nighttime lights, income, road length, housing price, crop yields, and poverty mapping. What is MOSAIKS? MOSAIKS provides a set of features created from a satellite imagery dataset. We are talking about 90TB of data gathered per day from 700+ satellites. These features can be combined with machine learning algorithms to address global …

Posted in AI, Climate Change, Data Science, Machine Learning.

Machine Learning for predicting Ice Shelves Vulnerability

In this post, you will learn about the use of machine learning for predicting ice shelf vulnerability. Before getting into the details, let's understand what ice shelf vulnerability is and how it is impacting global warming / climate change. What are ice shelves? Ice shelves are permanent floating sheets of ice that connect to a landmass. Most of the world's ice shelves hug the coast of Antarctica. Ice from enormous ice sheets slowly oozes into the sea through glaciers and ice streams. If the ocean is cold enough, that newly arrived ice doesn't melt right away. Instead, it may float on the surface and grow larger as glacial ice behind it continues to flow into the …

Posted in Climate Change, Data Science, Machine Learning.

Top Data Sources for Climate Change Research

In this post, you will learn about the top online data sources where you can learn about and get data for research on climate change. Vitalflux is committing itself to AI and climate change research for the next 15 years. You will get to learn about climate change and how data science / machine learning can be leveraged to tackle climate change in the time to come. Without further ado, let's list the data sources related to climate change research: United Kingdom's Met Office Hadley Centre: Researchers at the Met Office Hadley Centre produce and maintain a range of gridded datasets of meteorological variables for use in climate monitoring and climate …

Posted in Climate Change.

Python Scraper for GoogleNews, Twitter, Reddit & Arxiv

In this post, you will get the Python code for scraping the latest news about any topic from Google News, Twitter, Reddit and Arxiv. This could prove very useful for data scientists and machine learning enthusiasts who want to keep track of the latest developments in the field of artificial intelligence. If you are doing research work, these pieces of code would prove very handy for quickly accessing the information. The code in this post has been worked out in a Google Colab notebook. First and foremost, import the necessary Python libraries such as the following for GoogleNews, Twitter and Arxiv. Python Code for mining GoogleNews: Here …

Posted in Data Science, Python.

Reddit Scraper Code using Python & Reddit API

In this post, you will get a Python code sample with which you can search Reddit for posts in a specific subreddit, including hot posts. The Reddit API is used in the Python code. This code will be helpful if you quickly want to scrape Reddit for popular posts in the field of machine learning (subreddit – r/machinelearning), data science (subreddit – r/datascience), deep learning (subreddit – r/deeplearning), etc. There are two steps to follow to scrape Reddit for popular posts in any specific subreddit: (1) Python code for authentication and authorization, and (2) Python code for retrieving the popular posts. Check the Reddit API documentation page to learn about the Reddit APIs. Python code for …
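
The two steps mentioned above could be sketched with only the Python standard library. This is a minimal sketch under stated assumptions, not the code from the original post: it assumes a Reddit "script" app that uses the OAuth2 password grant, and the client id, client secret, username, password and User-Agent string are all placeholders. The requests are only built, not sent, so the snippet runs without credentials.

```python
import base64
import urllib.parse
import urllib.request

TOKEN_URL = "https://www.reddit.com/api/v1/access_token"
API_BASE = "https://oauth.reddit.com"
USER_AGENT = "example-scraper/0.1"  # placeholder; Reddit expects a descriptive User-Agent


def build_token_request(client_id, client_secret, username, password):
    """Step 1: build the token request (HTTP basic auth + password grant)."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urllib.parse.urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Authorization": f"Basic {basic}", "User-Agent": USER_AGENT},
        method="POST",
    )


def build_hot_posts_request(token, subreddit, limit=10):
    """Step 2: build the request for the hot posts of one subreddit."""
    query = urllib.parse.urlencode({"limit": limit})
    return urllib.request.Request(
        f"{API_BASE}/r/{subreddit}/hot?{query}",
        headers={"Authorization": f"bearer {token}", "User-Agent": USER_AGENT},
    )


def extract_titles(listing):
    """Pull post titles out of a decoded listing response (a dict)."""
    return [child["data"]["title"] for child in listing["data"]["children"]]
```

To actually call the API, each request would be passed to `urllib.request.urlopen` and the JSON body decoded with `json.load`; the token response carries the token under the `access_token` key.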

Posted in Python.

Mining Twitter Data – Python Code Example

In this post, you will learn how to get started with mining Twitter data. This will be very helpful if you would like to build machine learning models based on NLP techniques. The Python source code used in this post was worked out in a Jupyter notebook. The following are key aspects of getting started with the Python Twitter APIs: setting up a Twitter dev app and the Python Twitter package; establishing a connection with Twitter; Twitter API examples (location-based trends, user timeline, etc.); and searching Twitter by hashtags. Setup Twitter Dev App & Python Twitter Package: In this section, you will learn about the following two key aspects before you get started with …

Posted in Data Mining, Python.

Spend Analytics – 5 Ws of Spend Analysis

In this post, you will learn about the 5 Ws of spend analytics. If you are a procurement professional looking to understand use cases related to spend analytics, you may find this post very useful. In simple words, spend analytics is about extracting insights from spend in different procurement categories. What are we spending on? First and foremost, it is important to get visibility into what items we are spending on. This can be achieved using a dashboard. This form of analytics is also called descriptive analytics. Analyzing item spend can be termed item spend analytics. The items can be related to direct or indirect procurement. Indirect …
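
The "what are we spending on" question could be answered with a simple descriptive aggregation like the one below. This is only an illustrative sketch: the `purchase_orders` records, their field names (`item`, `category`, `amount`) and the figures are hypothetical and not taken from the post.

```python
from collections import defaultdict

# Hypothetical purchase-order records; field names and values are made up.
purchase_orders = [
    {"item": "Laptops", "category": "indirect", "amount": 12000.0},
    {"item": "Steel", "category": "direct", "amount": 30000.0},
    {"item": "Laptops", "category": "indirect", "amount": 8000.0},
    {"item": "Office supplies", "category": "indirect", "amount": 1500.0},
]


def spend_by(records, key):
    """Total the spend per value of `key`, largest total first."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["amount"]
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


item_spend = spend_by(purchase_orders, "item")          # item spend analytics
category_spend = spend_by(purchase_orders, "category")  # direct vs. indirect
```

The same totals are what a dashboard would typically chart; grouping by `category` instead of `item` gives the direct versus indirect split.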

Posted in Analytics, Data Science, Procurement.

Python Scraper Code to Search Arxiv Latest Papers

In this post, you will learn about Python source code for searching Arxiv for relevant and recent machine learning and data science research papers. If you are looking for a faster way to research Arxiv papers without going to the Arxiv website, you may want to keep this piece of code handy. You can further automate the Arxiv search to get notified based on some logic. Without further ado, let's get started. Step 1: Install the Python Arxiv library. As a first step, install the Python Arxiv library using code such as the following in your Jupyter notebook or Google Colab instance: Step 2: Execute the …
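
As a rough illustration of searching for the latest papers, the sketch below talks to the public arXiv HTTP API directly instead of using the Python Arxiv library that the post installs, so it is a swapped-in technique and not the post's code. The function names `build_query_url` and `parse_feed` are hypothetical helpers; the network call itself is left out so the snippet runs offline.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}  # the API returns an Atom feed


def build_query_url(terms, max_results=5):
    """Build a query URL that asks for the most recently submitted papers."""
    params = {
        "search_query": f"all:{terms}",
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_feed(atom_xml):
    """Extract title, link and publication date from an Atom response."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        papers.append({
            "title": " ".join(title.split()),  # collapse line-wrapped titles
            "link": entry.findtext("atom:id", default="", namespaces=ATOM),
            "published": entry.findtext("atom:published", default="", namespaces=ATOM),
        })
    return papers
```

To run a live search, the URL from `build_query_url("machine learning")` would be fetched with `urllib.request.urlopen(...).read()` and the result passed to `parse_feed`.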

Posted in Python.