Creating animated bar charts in R
Animated bar charts that you can embed directly in a post on any site are becoming increasingly popular. They clearly show how a measure changes over time. Let's see how to create them using R and a few general-purpose packages.
Packages
We need packages in R:
- ggplot2
- gganimate
These two are essential. In addition, tidyverse, janitor, and scales are used to manipulate the data, clean the column names, and format numbers, respectively.
Data
The original data set used in this project was downloaded from the World Bank website (WorldBank Data). The same data, in ready-to-use form, can be downloaded from the project folder.
What does it contain? The GDP of most countries over several years (from 2000 to 2017).
Data processing
The code below brings the data into the required format. It converts the year columns to numeric, reshapes the data from wide to long with the gather() function, and cleans the column names. The result is saved to gdp_tidy.csv for later use.
library(tidyverse)
library(janitor)
gdp <- read_csv("./data/GDP_Data.csv")
# Select the required columns
gdp <- gdp %>% select(3:15)
# Keep only the country rows
gdp <- gdp[1:217, ]
gdp_tidy <- gdp %>%
  # Convert the year columns to numeric
  mutate_at(vars(contains("YR")), as.numeric) %>%
  # Reshape from wide to long: one row per country and year
  gather(year, value, 3:13) %>%
  janitor::clean_names() %>%
  # Keep the four-digit year from the start of the column label
  mutate(year = as.numeric(stringr::str_sub(year, 1, 4)))
write_csv(gdp_tidy, "./data/gdp_tidy.csv")
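To make the reshaping step easier to follow, here is a minimal sketch on a made-up table. The country names, the values, and the ".." marker for missing data are assumptions for illustration, not taken from the real file; the year column labels are only assumed to resemble the ones the code above relies on.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical wide table: one row per country, one column per year
toy_wide <- tibble::tibble(
  country_name = c("Aland", "Borduria"),
  country_code = c("ALD", "BRD"),
  `2016 [YR2016]` = c("100", ".."),
  `2017 [YR2017]` = c("110", "55")
)

toy_tidy <- toy_wide %>%
  # Non-numeric entries such as ".." become NA
  mutate_at(vars(contains("YR")), ~ suppressWarnings(as.numeric(.x))) %>%
  # Wide to long: the year columns collapse into year/value pairs
  gather(year, value, 3:4) %>%
  # Keep the four-digit year from the start of the label
  mutate(year = as.numeric(str_sub(year, 1, 4)))

print(toy_tidy)
```

Each country now has one row per year, which is the shape the ranking and plotting steps expect.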
Animated Bar Graphs
Creating them takes two stages:
- Building a complete set of static bar charts using ggplot2.
- Animating the static charts with the desired parameters using gganimate.
The final step is rendering the animation in the desired format, such as GIF or MP4.
Loading libraries
library(tidyverse)
library(gganimate)
Data management
At this step, filter the data to keep the top 10 countries for each year, and add a few columns that will be used for the bar labels.
gdp_tidy <- read_csv("./data/gdp_tidy.csv")
gdp_formatted <- gdp_tidy %>%
  group_by(year) %>%
  # Rank countries within each year; the largest GDP gets rank 1
  mutate(rank = rank(-value),
         Value_rel = value/value[rank==1],
         Value_lbl = paste0(" ",round(value/1e9))) %>%
  group_by(country_name) %>%
  filter(rank <=10) %>%
  ungroup()
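The ranking logic can be checked on a small made-up table. This is only a sketch: the three countries and their values are hypothetical, and it keeps the top 2 per year instead of the top 10.

```r
library(dplyr)

# Hypothetical data: three countries over two years
toy <- tibble::tibble(
  year = c(2016, 2016, 2016, 2017, 2017, 2017),
  country_name = rep(c("A", "B", "C"), times = 2),
  value = c(3e9, 1e9, 2e9, 1e9, 4e9, 2e9)
)

toy_top2 <- toy %>%
  group_by(year) %>%
  # Negating value makes the largest value rank first
  mutate(rank = rank(-value)) %>%
  filter(rank <= 2) %>%
  ungroup()

print(toy_top2)
```

Because the ranking is done per year, a country can enter or leave the top group from one year to the next, which is what produces the overtaking effect in the animation.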
Building Static Bar Charts
Now that the data is in the desired format, we can draw the static bar charts. The basic content is the top 10 countries by GDP for the selected period. We build a chart for each year.
staticplot = ggplot(gdp_formatted, aes(rank, group = country_name,
                    fill = as.factor(country_name), color = as.factor(country_name))) +
  # Bars are drawn as tiles centered at value/2 with height = value
  geom_tile(aes(y = value/2,
                height = value,
                width = 0.9), alpha = 0.8, color = NA) +
  # Country name to the left of each bar, value label to the right
  geom_text(aes(y = 0, label = paste(country_name, " ")), vjust = 0.2, hjust = 1) +
  geom_text(aes(y=value,label = Value_lbl, hjust=0)) +
  coord_flip(clip = "off", expand = FALSE) +
  scale_y_continuous(labels = scales::comma) +
  scale_x_reverse() +
  # "none" replaces the deprecated FALSE for hiding guides
  guides(color = "none", fill = "none") +
  theme(axis.line=element_blank(),
        axis.text.x=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks=element_blank(),
        axis.title.x=element_blank(),
        axis.title.y=element_blank(),
        legend.position="none",
        panel.background=element_blank(),
        panel.border=element_blank(),
        panel.grid.major=element_blank(),
        panel.grid.minor=element_blank(),
        panel.grid.major.x = element_line( size=.1, color="grey" ),
        panel.grid.minor.x = element_line( size=.1, color="grey" ),
        plot.title=element_text(size=25, hjust=0.5, face="bold", colour="grey", vjust=-1),
        plot.subtitle=element_text(size=18, hjust=0.5, face="italic", color="grey"),
        plot.caption =element_text(size=8, hjust=0.5, face="italic", color="grey"),
        plot.background=element_blank(),
        plot.margin = margin(2,2, 2, 4, "cm"))
Building plots with ggplot2 is quite simple. As the code above shows, the key points are in the theme() function. They are needed so that all elements animate without problems, and some elements can be hidden if necessary. In this example, only the vertical grid lines are drawn, while the legend, the axis titles, and a few other components are removed from the plot.
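The geom_tile() call in the plot code may look unusual for a bar chart. A tile is drawn centered on its y value and extends half of its height in each direction, so y = value/2 combined with height = value gives a bar that runs from 0 to value. The sketch below checks that arithmetic on a made-up data frame; the column names lower and upper and all the values are purely illustrative.

```r
# Hypothetical values, for illustration only
toy <- data.frame(country_name = c("A", "B", "C"),
                  value = c(30, 20, 10))

# A tile is centered on y and extends height/2 in each direction
toy$center <- toy$value / 2
toy$lower  <- toy$center - toy$value / 2
toy$upper  <- toy$center + toy$value / 2

print(toy[, c("country_name", "lower", "upper")])
```

Every bar starts at 0 and ends at its value, which is the same extent an ordinary bar geom would draw.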
Animation
The key function here is transition_states(), which stitches the separate static plots together by year. view_follow() makes the axis range follow the data, so the grid lines appear to move as the animation progresses.
anim = staticplot + transition_states(year, transition_length = 4, state_length = 1) +
  view_follow(fixed_x = TRUE) +
  labs(title = 'GDP per Year : {closest_state}',
       subtitle = "Top 10 Countries",
       caption = "GDP in Billions USD | Data Source: World Bank Data")
Rendering
Once the animation has been built and stored in the anim object, it is time to render it with the animate() function. The renderer passed to animate() depends on the type of output file required.
GIF
# For GIF
animate(anim, 200, fps = 20, width = 1200, height = 1000,
        renderer = gifski_renderer("gganim.gif"))
MP4
# For MP4
for_mp4 <- animate(anim, 200, fps = 20, width = 1200, height = 1000,
                   renderer = ffmpeg_renderer())
anim_save("animation.mp4", animation = for_mp4)
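The numbers passed to animate() and transition_states() determine how long the result plays and how much of that time the bars spend moving. The sketch below is plain arithmetic on those values. It assumes that transition_length and state_length act as relative weights, so the share it computes is approximate.

```r
# Values taken from the animate() and transition_states() calls above
nframes <- 200
fps <- 20
transition_length <- 4
state_length <- 1

# Total playback time in seconds
duration_sec <- nframes / fps

# Approximate share of the time spent moving between years,
# assuming the two lengths are treated as relative weights
moving_share <- transition_length / (transition_length + state_length)

cat("Duration:", duration_sec, "seconds\n")
cat("Share of time in transitions:", moving_share, "\n")
```

Raising the frame count at the same fps lengthens the animation, while raising fps at the same frame count shortens it.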
Result
As you can see, there is nothing complicated here. The whole project is available on my GitHub, and you can use it as you see fit.