Vacation: Where? When? R

Data collection
The data come from weather stations located all over the world, whose readings are also reported to the National Oceanic and Atmospheric Administration (NOAA). The NOAA website hosts an archive of station readings from 1901 onward, covering every station that existed during that time (more than 30,000); about 14,000 stations are currently active. The data are accessed via FTP. Many kinds of data are available (daily averages and more frequent readings, with files covering temperature, humidity, precipitation, wind, and so on); I will use only the average daily temperature.
The data for each station and year are stored as an archived, delimited text file. The resource holds about 700,000 files in total. They could all be downloaded, which would take roughly two days and about 200 GB of disk space, but I see no need for that: downloading one unit (one station for one year) takes less than 0.2 seconds, so for the vast majority of requests (5-10 stations over 5-10 years) the wait is no more than a minute. The data are therefore fetched online, on demand.
Each station has a synoptic index (a unique code), a name, and coordinates. Unfortunately, station names are not always informative; in the vast majority of cases the name is that of the nearest settlement, and in the rest the station is located near an airport. Many stations have closed or, conversely, opened only recently, so some years may have gaps.
After the time interval of interest is chosen, the data for that range are selected, diagrams are built, and averaged data are displayed on the globe and on the map.
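The selection and averaging step could look roughly like the sketch below. It is a minimal base R illustration, not the application's code: the data frame layout (columns `station`, `date`, `temp`), the helper name `select_interval`, and the toy values are all assumptions standing in for the downloaded station files.

```r
# Hypothetical helper: keep the same calendar window (e.g. 1-14 August)
# in every year, then take the median temperature per station.
# Note: this simple version does not handle windows that cross New Year.
select_interval <- function(obs, from_md, to_md) {
  md  <- format(obs$date, "%m-%d")            # month-day key, e.g. "08-01"
  sel <- obs[md >= from_md & md <= to_md, ]   # rows inside the window
  aggregate(temp ~ station, data = sel, FUN = median)
}

# Toy data: two stations, two years, three days each (values are made up).
obs <- data.frame(
  station = rep(c("A", "B"), each = 6),
  date = as.Date(rep(c("2015-08-01", "2015-08-02", "2015-09-01",
                       "2016-08-01", "2016-08-02", "2016-09-01"), 2)),
  temp = c(20, 22, 15, 24, 26, 14,
           10, 12,  5, 14, 16,  4)
)

res <- select_interval(obs, "08-01", "08-14")
print(res)
```

The September rows fall outside the window and are dropped, so each station's median is taken over its four August readings only.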
Display on globe and map
The threejs package is used to display data on a spinning globe. The globe surface can be rendered either from an image, a JPEG file in geographic projection on the WGS84 ellipsoid, or from a generated object. Here a ready-made JPEG file is used: the land-surface image from the Blue Marble Next Generation data set on the NASA website, for August 2004 (resolution 8 km per pixel). Since only one parameter can be displayed on the globe, I show the average (median) temperature, taken over both the years and the interval of days. Each point on the globe is drawn as a column with a given color, height, and thickness. In my case all three encode the average temperature of a station over all years and the whole interval of interest, relative to the other selected stations: the column sizes are scaled, and the color runs from blue to red. For example, for stations with average temperatures of -10°, -5°, +7°, +10°, and +30°, the station at -10° gets a blue column of minimum height and thickness, and the station at +30° gets a red column of maximum height and thickness.
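The scaling described here can be illustrated in base R. This is only a sketch under assumptions: the helper `rescale`, the height range 0.1 to 1, and the 100-step palette are illustrative choices, not values taken from the application.

```r
# Hypothetical helper: map values linearly onto the interval [lo, hi].
rescale <- function(x, lo, hi) {
  r <- range(x)
  if (r[1] == r[2]) return(rep((lo + hi) / 2, length(x)))  # all equal: midpoint
  lo + (x - r[1]) / (r[2] - r[1]) * (hi - lo)
}

temps <- c(-10, -5, 7, 10, 30)   # the example average temperatures

height  <- rescale(temps, 0.1, 1)                    # column height (assumed range)
idx     <- round(rescale(temps, 1, 100))             # position on the palette
palette <- colorRampPalette(c("blue", "red"))(100)   # blue-to-red gradient
cols    <- palette[idx]

print(data.frame(temps, height, cols))
```

The coldest station ends up with the shortest, pure blue column and the warmest with the tallest, pure red one; the vectors `height` and `cols` could then be handed to whatever call draws the columns.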
In addition to these columns, curves drawn from a single point (Moscow) show the trajectories to the selected stations; their thickness and color are also scaled by temperature. These curves make both the distance and the average temperature easy to see. Besides the globe itself (which can be rotated and zoomed), the same information (average temperature, distance, and the colors used) is shown in the adjacent table (Fig. 1). The same averaged data are also shown on a flat Google map (Map tab).
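The text does not say how the distance in the table is obtained. One common way to compute it is the haversine great-circle formula, sketched below. The spherical Earth radius of 6371 km, the helper name `haversine_km`, and the approximate coordinates of the example destinations are assumptions made for illustration.

```r
# Great-circle distance in km (haversine formula on a sphere).
haversine_km <- function(lat1, lon1, lat2, lon2, radius = 6371) {
  rad  <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (lon2 - lon1) * rad
  a <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))   # pmin guards against rounding above 1
}

moscow <- c(lat = 55.75, lon = 37.62)   # approximate coordinates

# Hypothetical destinations with approximate coordinates
dest <- data.frame(name = c("Sochi", "Rome"),
                   lat  = c(43.60, 41.90),
                   lon  = c(39.73, 12.50))
dest$dist_km <- haversine_km(moscow[["lat"]], moscow[["lon"]], dest$lat, dest$lon)
print(dest)
```

The function is vectorized over the destination coordinates, so one call fills the distance column for all selected stations.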
Charts, static and interactive
The graphical web interface is traditionally the

Fig. 1. The main window
The data on the first tab (the globe) are averaged and carry no further detail. It is more interesting to see the spread and dynamics of temperatures, both across years and between stations. The following diagrams, one per tab, serve this purpose; all of them are panel (faceted) diagrams:
- Time series (Fig. 2): ordinary time series, where each panel is a weather station, the archived years are distinguished by color, and the bold black curve is the average over those years. (The figure shows the interactive version: hovering displays a sorted legend of the historical temperatures for each day.)

Fig. 2. Time series diagram
- Box plots (Fig. 3): each calendar day is treated as a unit, and the diagram shows the spread of values for that date across all selected years.

Fig. 3. Box plot (box-and-whisker diagram)
- Density (Fig. 4): temperature density curves for the selected stations, with one panel per year; it shows both how the stations' temperatures compare and how they change over time (by year).

Fig. 4. Density diagram
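The numbers behind the first two diagrams can be sketched in base R: the per-day average drawn as the bold curve in the time series, and the per-day spread that a box plot summarizes. The column names and the toy values below are assumptions; the application itself draws these with ggplot2.

```r
# Toy data: temperature on two calendar days across five years (made-up values).
obs <- data.frame(
  day  = rep(c("08-01", "08-02"), each = 5),
  year = rep(2011:2015, 2),
  temp = c(18, 20, 22, 24, 30,
           15, 19, 21, 23, 25)
)

# Average over the years for each calendar day (the "bold black curve").
avg <- tapply(obs$temp, obs$day, mean)

# fivenum() returns min, lower hinge, median, upper hinge, max:
# the values a box plot is built from.
spread <- t(sapply(split(obs$temp, obs$day), fivenum))
colnames(spread) <- c("min", "q1", "median", "q3", "max")

print(avg)
print(spread)
```

Each row of `spread` describes one calendar day across all selected years, which is exactly the "every day is a unit" view of the box plot tab.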
Besides the static ggplot2 diagrams, checking the “interactive diagrams” checkbox converts the diagrams into interactive ones: they show a legend on hover, and can be zoomed in, zoomed out, or panned along the axes. This is done by translating the existing ggplot2 objects directly into interactive ones with the plotly package.
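A minimal sketch of that conversion is shown below, assuming the ggplot2 and plotly packages are installed. The toy data and the flag `use_interactive`, which stands in for the state of the checkbox, are illustrative assumptions; `ggplotly()` is the plotly function that translates a ggplot2 object.

```r
library(ggplot2)
library(plotly)

# Toy data: ten days of temperatures for two years (made-up values).
df <- data.frame(
  day  = rep(1:10, 2),
  year = factor(rep(c(2015, 2016), each = 10)),
  temp = c(seq(15, 24), seq(17, 26))
)

# The static diagram: one colored line per year.
p <- ggplot(df, aes(x = day, y = temp, colour = year)) +
  geom_line()

use_interactive <- TRUE   # hypothetical stand-in for the checkbox

# Reuse the same ggplot2 object; only the final display form changes.
out <- if (use_interactive) ggplotly(p) else p
```

Because the interactive version is derived from the existing object, the static and interactive diagrams stay in sync without duplicating the plotting code.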
Saving the results
All results, both diagrams and tables, can be exported in HTML and/or DOCX format (Report tab): select the required format and save the file. This relies on Markdown integration: an R Markdown template file contains both plain text and calls to R functions.
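The export step might be wired up along these lines. This is a sketch under assumptions: the helper names `report_format` and `save_report` and the template file name `report.Rmd` are hypothetical, while `rmarkdown::render()` and the `html_document` and `word_document` output formats are part of the rmarkdown package.

```r
# Hypothetical helper: translate the user's choice into an rmarkdown format name.
report_format <- function(choice = c("html", "docx")) {
  choice <- match.arg(choice)
  switch(choice, html = "html_document", docx = "word_document")
}

# Hypothetical helper: render the template in the chosen format.
# "report.Rmd" is an assumed file name; rmarkdown (and pandoc) must be installed.
save_report <- function(choice, template = "report.Rmd", out_dir = tempdir()) {
  rmarkdown::render(template,
                    output_format = report_format(choice),
                    output_dir    = out_dir,
                    quiet         = TRUE)
}

print(report_format("docx"))
```

A call such as `save_report("html")` would then produce the report in the selected format from the single template.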
Conclusion
In the end, my wish to gain a "statistical advantage" in choosing the best vacation days produced a tool with which you can look at historical temperatures for any interval of days almost anywhere in the world, and judge whether there is any global or local warming. As usual, thanks to R, all of this was implemented quickly and easily.