An analysis of hh.ru resumes: lots of charts and a bit of sexism and discrimination
Recently I came across an article analyzing a dataset of hh.ru resumes that had been used in a hackathon. It made me want to play with such resumes myself, especially since I have somewhat more of them. I chose the professional area that interests me most among those a resume can specify: “Information Technologies, Internet, Telecom”.
Under the cut you will find many charts showing how much people in various IT specializations want to earn, which universities' graduates ask for the most money, which employers people stay with the longest, whether Gmail users expect more than Yandex or Mail.ru users, and a lot of other information.
Everything below is just my own view. The charts do not claim to be fully objective or to reflect the real situation. I may have made mistakes anywhere.
I took Russian resumes from the professional area “Information Technologies, Internet, Telecom” that were updated during the past year. For all charts, keep in mind that the figures are not a true cross-section of the whole country, only of the part that is present on hh.ru. The sample may be biased.
In total, 566,178 resumes of IT specialists take part in the analysis. All graphs are clickable.
Number of people in specialization
In the professional area "Information Technologies, Internet, Telecom" you can choose up to three specializations. The chart shows how many people chose each specialization:
How many people hide the desired salary
When creating a resume, you can leave the desired salary blank. As you can see, about 40% of IT specialists do so.
Distribution of men and women by specialization
The predominance of men in IT will surprise no one. But why are there so few women among system administrators, network engineers, and technical managers?
Distribution of desired salary by specialization
Now let's see how much money men and women in different specializations want across Russia.
The next chart is a boxplot, also known as a box-and-whisker plot. It reads as follows. The line inside the box is the median: half of the resumes ask for more than this amount and the other half for less. I tried to label the median value on almost all charts. The box is the interquartile range (IQR) and covers 50% of all resumes, from the 1st quartile (25%) to the 3rd quartile (75%). In other words, 25% of the resumes ask for less than the left edge of the box, and 25% ask for more than the right edge. The whiskers, in turn, bound almost all of the remaining data: about 0.35% of all resumes ask for less than the end of the left whisker, and about 0.35% ask for more than the end of the right whisker. Everything outside these limits is an outlier and is drawn as a separate point.
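To make the reading of the boxplot concrete, here is a minimal sketch that computes the same numbers by hand. It assumes the whiskers are placed at the 0.35th and 99.65th percentiles, as described above; the function names `percentile` and `box_stats` are illustrative and not taken from the original notebook. Note that Seaborn's default whiskers use a 1.5 × IQR rule instead, so this is not necessarily how the charts were drawn.

```python
def percentile(sorted_values, pct):
    """Linearly interpolated percentile (pct in 0..100) of a sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * pct / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def box_stats(salaries, low_pct=0.35, high_pct=99.65):
    """Return the numbers a box-and-whisker plot displays for one group."""
    values = sorted(salaries)
    # Assumption: whiskers sit at fixed percentiles rather than at 1.5 * IQR.
    whisker_low = percentile(values, low_pct)
    whisker_high = percentile(values, high_pct)
    return {
        "q1": percentile(values, 25),       # left edge of the box
        "median": percentile(values, 50),   # line inside the box
        "q3": percentile(values, 75),       # right edge of the box
        "whisker_low": whisker_low,
        "whisker_high": whisker_high,
        # Points beyond the whiskers are drawn individually as outliers.
        "outliers": [v for v in values if v < whisker_low or v > whisker_high],
    }
```

Calling `box_stats` on the desired salaries of one specialization gives the box edges, the median, the whisker ends, and the list of outlier points for that row of the chart.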
It's no secret that Moscow and St. Petersburg have more IT specialists than other regions of the country, and that pay there is above the national average. So I made separate charts for these two cities showing the distribution of desired salaries by specialization.
Salary distribution by specialization in Moscow
Salary distribution by specialization in St. Petersburg
On all three charts, testers stand out: men and women there have the same salary expectations. Incidentally, women enter this area of IT more readily than many others.
It is worth noting that in Moscow and St. Petersburg the median for women among technical managers is higher than for men. But the chart of men and women by specialization shows that there are far fewer women in this specialization.
Salaries of IT specialists in Russia in comparison with other professional areas
We see that IT salaries rank below only raw materials extraction, consulting, and top management. In 24 of the 28 professional areas, women's median desired salary is lower than men's; in the remaining four, it is the same.
Distribution of IT specialists by region
Once again, I’ll clarify that the chart does not reflect the actual distribution of IT specialists across the country, only the resumes that are on hh.ru. The site is used to different degrees in different regions.
Number of resumes per vacancy
For this chart, I took all active vacancies and all resumes updated during the year, and for each region divided the number of resumes by the number of vacancies. I also excluded regions with fewer than 1,000 IT specialists. The Moscow Region comes out on top; it does not include Moscow itself. Most likely this is because many IT specialists, especially newcomers, live on the outskirts, while the jobs are mostly in the city itself.
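The calculation described above can be sketched roughly as follows. This is a simplified illustration, not the original code: it assumes each resume and each vacancy has already been reduced to a region name, and the function name `resumes_per_vacancy` is made up for the example.

```python
from collections import Counter


def resumes_per_vacancy(resume_regions, vacancy_regions, min_resumes=1000):
    """Return {region: resumes / vacancies}, highest ratio first.

    resume_regions and vacancy_regions are iterables with one region name
    per resume or per vacancy (an assumed input shape).
    """
    resumes = Counter(resume_regions)
    vacancies = Counter(vacancy_regions)
    ratios = {
        region: count / vacancies[region]
        for region, count in resumes.items()
        # Skip small regions and regions without vacancies (no division by zero).
        if count >= min_resumes and vacancies[region] > 0
    }
    return dict(sorted(ratios.items(), key=lambda item: item[1], reverse=True))
```

The `min_resumes` threshold mirrors the exclusion of regions with fewer than 1,000 IT specialists; regions with no vacancies at all are dropped here as an extra assumption.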
Percentage of IT resumes from the total number of resumes
For the next chart, I calculated what share of all resumes in each region belongs to IT specialists. There is an interesting difference between St. Petersburg and the Leningrad Region on the one hand and Moscow and the Moscow Region on the other. Most likely this is because the Moscow Region contains large towns where many IT specialists live, such as Mytishchi, Khimki, Lyubertsy, and others, which belong to the region but are close to the city.
Distribution of desired salary in IT by region
The most popular key skills in IT
In a resume, you can list key skills. The following chart shows the most frequently listed skills across all IT specialists.
Key skills for the specialization "Programming, development"
It’s interesting to look at the key skills of developers alone.
Key skills for the specialization "Career start"
Salary expectations by key skill
Age distribution by specialization in IT
Young people often go into web development and games. I think this is a great entry point into IT.
Dependence of salary on work experience
Only people with more than 20 years of experience break the pattern. Most likely this is because many in this category came to IT from another field: people often list in their resumes experience that is not relevant to this professional area.
Distribution of resumes by visibility status
Preferred travel time to work
Most people indicate that travel time to work does not matter. In St. Petersburg and Moscow, people have a somewhat better idea of what "does not matter" really means, and so they choose this option less often.
Nearest metro station
In a resume, you can indicate the nearest metro station. Let's see which stations in Moscow have the most people. I did not find a simple way to put a text label on a marker in the Python gmap, so the stations marked on the map are listed separately. Most of them are the main entry points into the city from densely populated areas.
Which mobile operators IT specialists use
I downloaded the database of DEF codes from the Rossvyaz website, cleaned it up a little, and matched it against the phone numbers from the resumes.
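One possible way to do such a match is sketched below. It assumes the numbering plan is a table of ranges (DEF code, first number, last number, operator); the rows in `SAMPLE_RANGES` and the operator names are made up for illustration, and the helper names are hypothetical rather than taken from the original notebook.

```python
import bisect
import re

# Made-up excerpt in the assumed shape of a numbering-plan table:
# (DEF code, first number, last number, operator).
SAMPLE_RANGES = [
    (900, 0, 4999999, "Operator A"),
    (900, 5000000, 9999999, "Operator B"),
    (901, 0, 9999999, "Operator C"),
]


def normalize_phone(raw):
    """Reduce a phone string to 10 national digits, or None if malformed."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits[0] in "78":
        digits = digits[1:]  # drop the leading country or trunk digit
    return digits if len(digits) == 10 else None


def build_index(ranges):
    """Sort the ranges and precompute their start keys for binary search."""
    ordered = sorted(ranges)
    starts = [(code, first) for code, first, _last, _operator in ordered]
    return ordered, starts


def find_operator(raw_phone, index):
    """Return the operator owning the number, or None if no range matches."""
    ordered, starts = index
    digits = normalize_phone(raw_phone)
    if digits is None:
        return None
    key = (int(digits[:3]), int(digits[3:]))
    # Last range that starts at or before the number.
    pos = bisect.bisect_right(starts, key) - 1
    if pos < 0:
        return None
    code, first, last, operator = ordered[pos]
    if code == key[0] and first <= key[1] <= last:
        return operator
    return None
```

With the index built once via `build_index(SAMPLE_RANGES)`, each lookup is a binary search, which matters when matching hundreds of thousands of resumes.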
Which email services they use
For this chart, I merged the various domains of each company into one group. A curious fact about Yandex.Mail that is not visible on the chart: the vast majority use the yandex.ru domain in their email address, not ya.ru. I always thought that the coolest dudes use Gmail, the average dudes use Yandex, and the rest use Mail.ru. Now let's see what the salary expectations are for these three groups. So guys, if you want more money, you know what to do.
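Merging domains into providers and comparing expectations might look like the sketch below. The domain-to-provider table is my own assumption about which domains belong together (the source only mentions yandex.ru and ya.ru explicitly), and the function names are illustrative.

```python
from collections import defaultdict
from statistics import median

# Assumed grouping of email domains by provider; extend as needed.
PROVIDER_BY_DOMAIN = {
    "gmail.com": "Google",
    "googlemail.com": "Google",
    "yandex.ru": "Yandex",
    "ya.ru": "Yandex",
    "mail.ru": "Mail.ru",
    "bk.ru": "Mail.ru",
    "inbox.ru": "Mail.ru",
    "list.ru": "Mail.ru",
}


def provider_of(email):
    """Map an email address to its provider group, or "Other"."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return PROVIDER_BY_DOMAIN.get(domain, "Other")


def median_salary_by_provider(resumes):
    """resumes: iterable of (email, desired_salary or None)."""
    salaries = defaultdict(list)
    for email, salary in resumes:
        if salary is not None:  # resumes without a desired salary are skipped
            salaries[provider_of(email)].append(salary)
    return {provider: median(values) for provider, values in salaries.items()}
```

The median is used here because it matches the statistic labeled on the boxplots; a correlation between provider and expectations of course says nothing about cause.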
Distribution of desired salary by university
Now let's see which universities' graduates want the most money. I excluded all institutions with fewer than 1,000 graduates in the sample.
In which companies do people work the longest
In the work experience section, each job has a duration. I took every company through which more than 500 IT specialists have passed. You can see that some organizations on the chart are not really IT companies. This is because people list not only relevant experience in their work history.
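A rough sketch of this kind of aggregation is shown below. It assumes each job record carries a resume id, a company name, and start and end dates, and it summarizes tenure with the median; the source does not say which statistic was used, so that choice and all names here are assumptions.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def months_between(start, end):
    """Whole calendar months between two dates (days are ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def median_tenure_by_company(jobs, min_people=500):
    """jobs: iterable of (resume_id, company, start_date, end_date).

    Returns {company: median tenure in months}, longest first, keeping only
    companies that more than min_people distinct people have worked for.
    """
    tenures = defaultdict(list)
    people = defaultdict(set)
    for resume_id, company, start, end in jobs:
        tenures[company].append(months_between(start, end))
        people[company].add(resume_id)
    result = {
        company: median(months)
        for company, months in tenures.items()
        if len(people[company]) > min_people
    }
    return dict(sorted(result.items(), key=lambda item: item[1], reverse=True))
```

In practice, company names would need to be normalized first, since the same organization is often spelled in several ways.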
Distribution of desired salary by companies listed in work experience
Let's see which companies' former employees want the most money. I’ll take Moscow and St. Petersburg separately.
Moscow:
St. Petersburg:
While drawing the charts, I kept getting new ideas for what else could be done, but I decided to stop at what I had. If this post is well received, I will continue.
The charts were made with the help of Python, Jupyter Notebook, Pandas, Seaborn, Apache Hive, and others.
Ask questions. Thanks to all.
UPD: I cleaned up the last three charts a little by merging different spellings of the same organization.