Multidimensional graphics in Python - from three-dimensional to six-dimensional
Introduction
Visualization is an important part of data analysis, and being able to look at several dimensions at once makes the task easier. In this tutorial we will build plots with up to six dimensions.
Plotly is an open-source Python library for a wide range of visualizations, and it offers much more customization than the well-known matplotlib and seaborn. It is installed in the usual way: pip install plotly. We will use it to draw the plots.
Let's prepare the data
For the visualization we use a simple car dataset from UCI (University of California, Irvine; translator's note), which contains 26 characteristics for 205 cars (205 rows by 26 columns). To visualize six dimensions, we take the following six parameters.
Only 4 of the 205 rows are shown here.
Load the data from the CSV file using pandas.
import pandas as pd
data = pd.read_csv("cars.csv")
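Before plotting, it can help to confirm that the file really contains the six parameters used in this tutorial. The following is a minimal sketch: the column names are the ones mentioned in the article, while the names SIX_COLUMNS and missing_columns and the sample header are my own illustrative assumptions, not part of the repository code.

```python
# The six parameters named in this article; the list name is illustrative.
SIX_COLUMNS = [
    "curb-weight", "price", "horsepower",
    "city-mpg", "engine-size", "num-of-doors",
]


def missing_columns(columns):
    """Return the tutorial's column names that are absent from `columns`."""
    return [name for name in SIX_COLUMNS if name not in columns]


# With the DataFrame loaded above one would call missing_columns(data.columns)
# and then select data[SIX_COLUMNS]. A hypothetical header stands in for it here.
header = ["make", "num-of-doors", "curb-weight", "engine-size",
          "horsepower", "city-mpg"]
absent = missing_columns(header)
print(absent)  # ['price']
```

An empty result means all six columns are present and data[SIX_COLUMNS] can be taken safely.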
With the data ready, let's start with two dimensions.
Two-dimensional scatterplot
A scatterplot is a very simple and common plot. Of the 6 parameters, price and curb-weight are used below as Y and X, respectively.
# Import the required modules
import plotly
import plotly.graph_objs as go
# Create the figure (a scatter trace)
fig1 = go.Scatter(x=data['curb-weight'],
                  y=data['price'],
                  mode='markers')
# Create the layout
mylayout = go.Layout(xaxis=dict(title="curb-weight"),
                     yaxis=dict(title="price"))
# Build the plot and save it as HTML
plotly.offline.plot({"data": [fig1],
                     "layout": mylayout},
                    auto_open=True)
In plotly the process differs slightly from Matplotlib. We create a layout and a figure and pass them to the offline.plot function, which saves the result as an HTML file in the current working directory. Below is a screenshot of the result. A link to the GitHub repository with ready-made interactive HTML plots is at the end of the article.
Two-dimensional scatterplot
3D scatter plot
We can put a third parameter, horsepower, on the Z axis. Plotly provides the Scatter3d class for building interactive 3D plots.
3D graph
Instead of pasting the code here every time, I added it to the repository.
(It is most convenient to keep the relevant code open in an adjacent tab while reading; translator's note.)
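For readers without the repository open, here is a minimal sketch of what a 3D scatter plot could look like. It is not the repository code. It builds the trace as a plain dictionary, which Plotly accepts in place of a go.Scatter3d object, and imports plotly only inside the saving function. The function names, the file name and the three sample rows are hypothetical.

```python
def scatter3d_trace(x, y, z, marker=None):
    """Build a 3D scatter trace as a plain dict.

    go.Scatter3d(x=..., y=..., z=..., mode="markers") is the equivalent object.
    """
    trace = dict(type="scatter3d", x=list(x), y=list(y), z=list(z),
                 mode="markers")
    if marker is not None:
        trace["marker"] = dict(marker)
    return trace


def scene_layout(x_title, y_title, z_title):
    """3D axes are configured under layout.scene, not layout.xaxis/yaxis."""
    return dict(scene=dict(
        xaxis=dict(title=dict(text=x_title)),
        yaxis=dict(title=dict(text=y_title)),
        zaxis=dict(title=dict(text=z_title)),
    ))


def save_html(trace, layout, filename="scatter3d.html"):
    """Write the figure to an HTML file; requires plotly to be installed."""
    from plotly.offline import plot  # lazy import: the builders need no plotly
    return plot({"data": [trace], "layout": layout},
                filename=filename, auto_open=False)


# Hypothetical sample rows standing in for data['curb-weight'],
# data['price'] and data['horsepower'].
weight = [2548, 2823, 2337]
price = [13495, 16500, 13950]
horsepower = [111, 154, 102]

trace = scatter3d_trace(weight, price, horsepower)
layout = scene_layout("curb-weight", "price", "horsepower")
# save_html(trace, layout) would then write scatter3d.html.
```

The main difference from the two-dimensional case is that the axis titles move into the scene part of the layout.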
Adding a fourth dimension
We cannot plot more than three spatial dimensions directly, but there is a workaround: we can emulate depth to visualize higher dimensions using color, size, or shape.
Here, along with the three previous features, we use city fuel economy, city-mpg, as the fourth dimension. It is passed as the marker color (the markercolor value) of Scatter3d. A lighter marker shade means lower mileage.
It is immediately evident that the higher the price, the horsepower, and the curb weight, the lower the mileage.
4D graph with shades as the 4th dimension
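As a rough illustration of the color encoding, the sketch below builds the marker settings that color each point by a numeric value. The helper name color_marker, the choice of the Viridis colorscale and the sample values are my assumptions, and which end of the scale looks lighter depends on the colorscale chosen (reversescale flips it).

```python
def color_marker(values, colorscale="Viridis", reversescale=False,
                 bar_title="city-mpg"):
    """Build marker settings that color each point by a numeric value."""
    return dict(
        color=list(values),          # one number per point
        colorscale=colorscale,       # a named Plotly colorscale
        reversescale=reversescale,   # flip which end of the scale is light
        showscale=True,              # draw the color bar next to the plot
        colorbar=dict(title=dict(text=bar_title)),
    )


# Hypothetical sample values standing in for data['city-mpg'].
city_mpg = [21, 19, 24]
marker = color_marker(city_mpg)

# The dict is then passed to the trace, for example
# go.Scatter3d(x=..., y=..., z=..., mode="markers", marker=marker).
```

Showing the color bar matters here: without it the reader cannot tell which shade corresponds to which mileage.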
Adding a fifth dimension
Marker size can be used to visualize the fifth dimension. We pass the engine-size feature (engine size) as the marker size (the markersize value) of Scatter3d.
Observations: engine size is related to some of the previous parameters. The higher the price, the larger the engine. Likewise, the lower the mileage, the larger the engine.
5D plot with marker size as the fifth dimension (engine size)
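Raw feature values are rarely sensible marker sizes, so some rescaling is usually needed. The sketch below shows one possible approach, a linear rescaling into a pixel range. The function name scale_sizes, the 4 to 20 pixel range and the sample values are my assumptions, and the repository code may scale the sizes differently.

```python
def scale_sizes(values, min_size=4.0, max_size=20.0):
    """Linearly rescale raw values into a marker-size range (in pixels)."""
    values = [float(v) for v in values]
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: use the middle of the range for every point.
        return [(min_size + max_size) / 2] * len(values)
    span = hi - lo
    return [min_size + (v - lo) / span * (max_size - min_size)
            for v in values]


# Hypothetical sample values standing in for data['engine-size'].
engine_size = [130, 152, 109]
sizes = scale_sizes(engine_size)

# The result goes into the marker settings of the trace,
# for example marker=dict(size=sizes, color=...).
marker = dict(size=sizes)
```

The smallest engine gets the minimum size and the largest gets the maximum, so differences stay visible without any marker swamping the plot.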
Adding the Sixth Dimension
The shape of the marker is great for visualizing categories. Plotly offers a choice of 10 different shapes for 3D plots (asterisk, circle, square, etc.), so up to 10 distinct values can be encoded as a shape.
We have the num-of-doors feature, which contains integers: the number of doors (2 or 4). We map these values to shapes: a square for 4 doors and a circle for 2 doors, passed as the marker symbol (the markersymbol value) of Scatter3d.
Observations: it looks as though all the cheapest cars have 4 doors (circles). Studying the plot further yields more hypotheses and conclusions.
6D graph with marker shape as the sixth dimension (number of doors)
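The mapping from door count to marker shape can be sketched as a small lookup. The names DOOR_SYMBOLS and doors_to_symbols, the fallback shape for unexpected values and the sample values are my assumptions. "circle", "square" and "diamond" are symbol names that Scatter3d accepts.

```python
# Door count -> marker symbol, as described in the text above.
DOOR_SYMBOLS = {2: "circle", 4: "square"}


def doors_to_symbols(doors, default="diamond"):
    """Map each door count to a symbol name; unknown values get `default`."""
    return [DOOR_SYMBOLS.get(d, default) for d in doors]


# Hypothetical sample values standing in for data['num-of-doors'].
num_of_doors = [2, 4, 4, 2]
symbols = doors_to_symbols(num_of_doors)
print(symbols)  # ['circle', 'square', 'square', 'circle']

# The list goes into the marker settings of the trace,
# for example marker=dict(symbol=symbols, size=..., color=...).
marker = dict(symbol=symbols)
```

Using a fallback shape rather than raising an error keeps rows with a missing or unexpected door count visible in the plot.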
Can we add more dimensions?
Of course we can! Markers have more properties, such as opacity and gradients, that can be used. But the more dimensions we add, the harder it is to keep them all in mind.
Source
Python code and interactive plots for all the figures are available on GitHub.