If you’re a data scientist, data analyst, or Python programmer, data visualization is a key part of your job. And what better way to visualize all that juicy data than with a scatter plot? Matplotlib is your trusty Python library for creating charts and graphs, and in this blog we’ll show you how to use it to create scatter plots, with worked examples. So dig into your data set, get coding, and see what insights you can uncover!
A scatter plot is a type of data visualization that shows the relationship between two variables. Each data point is drawn at the position given by its X-axis and Y-axis values. The X-axis usually represents an independent variable, while the Y-axis represents either a second independent variable or the dependent variable. Scatter plots are used in data science and statistics to show how data points are distributed and to identify trends and patterns.
The following is a sample scatter plot showing three different classes (species) from the Iris flower data set. The X-axis represents the sepal length attribute and the Y-axis represents the sepal width attribute.
The following is a simple scatter plot created using the Matplotlib library.
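A plot of this kind could be produced roughly as follows. This is a minimal sketch, not the code behind the figure: the function name plot_species is hypothetical, and the handful of measurements are illustrative stand-ins typed in by hand rather than the full Iris data set. It colors the points in a single scatter call by passing integer class codes to the c parameter.

```python
import numpy as np
from matplotlib import pyplot as plt


def plot_species(sepal_length, sepal_width, species_codes, species_names):
    """Draw one scatter plot, colored by integer class code (hypothetical helper)."""
    fig, ax = plt.subplots()
    # c= maps each integer code to a color from the chosen colormap
    points = ax.scatter(sepal_length, sepal_width, c=species_codes, cmap="viridis")
    # legend_elements() returns one handle per distinct code, in ascending order
    handles, _ = points.legend_elements()
    ax.legend(handles, species_names, title="Species")
    ax.set_xlabel("Sepal length (cm)")
    ax.set_ylabel("Sepal width (cm)")
    return fig, ax


# Illustrative sample only; load the full data set for real analysis
sepal_length = np.array([5.1, 4.9, 4.7, 7.0, 6.4, 6.9, 6.3, 5.8, 7.1])
sepal_width = np.array([3.5, 3.0, 3.2, 3.2, 3.2, 3.1, 3.3, 2.7, 3.0])
species_codes = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
species_names = ["setosa", "versicolor", "virginica"]

fig, ax = plot_species(sepal_length, sepal_width, species_codes, species_names)

if __name__ == "__main__":
    plt.show()
```

Passing class codes through c keeps everything in one scatter call, which is convenient when the classes are already encoded as integers.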
from matplotlib import pyplot as plt
import numpy as np
X = np.array([1, 2, 3, 4, 5, 6, 7])
Y = X
plt.figure()
plt.scatter(X, Y)
plt.show()
Here is the plot created by the above code:
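The example above places every point exactly on a line. As a rough sketch of how a scatter plot helps reveal a trend in noisier data, the snippet below generates synthetic points around an assumed line (slope 2, intercept 1, both chosen only for illustration) and overlays a straight-line fit computed with np.polyfit. The data and variable names are made up for this example.

```python
import numpy as np
from matplotlib import pyplot as plt

# Synthetic data: a linear relationship plus random noise (illustrative only)
rng = np.random.default_rng(seed=0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=2.0, size=x.size)

# Least-squares straight line; polyfit returns the highest-degree coefficient first
slope, intercept = np.polyfit(x, y, deg=1)

fig, ax = plt.subplots()
ax.scatter(x, y, label="observations")
ax.plot(x, slope * x + intercept, color="red", label="fitted trend")
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.legend()

if __name__ == "__main__":
    plt.show()
```

The points scatter around the red line, and the fitted slope should land close to the value used to generate the data.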
Here is another example showing how a scatter plot can be used to visualize a data set whose points belong to different classes.
import pandas as pd
df2 = pd.read_csv('/Users/apple/Downloads/user knowledge level - Sheet1.csv')
df2.head()
The code below draws a scatter plot of the classes very_low and Low, using the features STG and SCG as the X and Y axes. Note how the scatter function is invoked once per class, each time with only the data points that satisfy the given condition. You can invoke scatter as many times as needed to plot different groups of data points on the same axes. This is useful when you are dealing with a classification machine learning problem whose data points carry different labels / classes.
plt.scatter(df2['STG'][(df2.UNS == 'very_low') | (df2.UNS == 'Very Low')],
            df2['SCG'][(df2.UNS == 'very_low') | (df2.UNS == 'Very Low')],
            marker='D',
            color='red',
            label='Very Low')
plt.scatter(df2['STG'][df2.UNS == 'Low'],
            df2['SCG'][df2.UNS == 'Low'],
            marker='o',
            color='blue',
            label='Low')
plt.xlabel('STG')
plt.ylabel('SCG')
plt.legend()
plt.show()
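When there are many classes, writing one scatter call per class by hand gets repetitive. The sketch below shows one way to loop over the classes with pandas groupby instead. The helper name scatter_by_class is hypothetical, and the small DataFrame is made-up stand-in data that only borrows the STG, SCG and UNS column names from the example above.

```python
import pandas as pd
from matplotlib import pyplot as plt


def scatter_by_class(df, x_col, y_col, class_col):
    """Call scatter once per class so each class gets its own color and legend entry."""
    fig, ax = plt.subplots()
    for class_name, group in df.groupby(class_col):
        ax.scatter(group[x_col], group[y_col], label=class_name)
    ax.set_xlabel(x_col)
    ax.set_ylabel(y_col)
    ax.legend()
    return fig, ax


# Made-up rows for illustration; replace with the real data set
sample = pd.DataFrame({
    "STG": [0.00, 0.08, 0.06, 0.10, 0.20, 0.15],
    "SCG": [0.00, 0.08, 0.06, 0.10, 0.14, 0.02],
    "UNS": ["very_low", "very_low", "Low", "Low", "Middle", "High"],
})

fig, ax = scatter_by_class(sample, "STG", "SCG", "UNS")

if __name__ == "__main__":
    plt.show()
```

Matplotlib picks a new color from its default cycle on each scatter call, so no explicit color list is needed here.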
A similar scatter plot can be created with a single call to the category_scatter function from the mlxtend Python package authored by Dr. Sebastian Raschka. Note that it plots every class found in the label column, not just the two selected above. Here is the code:
from mlxtend.plotting import category_scatter
# Merge the two spellings of the same class into a single label
df2['UNS'] = np.where(df2['UNS'] == 'Very Low', 'very_low', df2['UNS'])
fig = category_scatter(x='STG', y='SCG', label_col='UNS',
                       data=df2, legend_loc='upper right')
That’s all for now on scatter plots. If you have any questions, please don’t hesitate to let us know in the comments section below. We love hearing from our readers and we try to answer every question as best we can. And if you want to learn more about data visualization with Python programming, be sure to check out our other tutorials. Thanks for reading! Happy plotting!