Analysis of freelancer supply and demand, using the oDesk exchange as an example



    Introduction


    oDesk is the world's largest international freelance exchange (or at least that is what oDesk itself claims). Habr has already covered oDesk quite a few times, for example here or here, and in my opinion those posts explain in detail why and for whom the resource was created and how it works. A description of the site and its principles can therefore be omitted. I will try to analyze data about the freelancers themselves as well as about jobs, clients, and their requirements for freelancers: roughly speaking, what you need to know and be able to do to stay more or less current with modern technologies. I will also analyze supply and demand based on freelancers' skills and clients' requirements. And of course there will be some statistics and a few nice pictures as examples (who works on oDesk, where the jobs come from, who earns more, who works better, and so on). All of this is based on data I collected myself, which oDesk openly and generously provides through its API. It is worth noting that an article with a small amount of oDesk statistics has already appeared on Habr, but there, unlike in this article, oDesk was praising itself and supplied the results itself. I should say right away that I do not claim this is a complete review; however, the data collection process is briefly described below, so you can conduct your own analysis if needed.


    How data was collected


    When creating a new job, the client usually adds a list of the skills needed to complete it. Skills can be very different (knowledge of programming languages, operating systems, or natural languages such as Russian or English). A complete alphabetical list can be found here; at the time of writing it contains about 2,500 skills. Freelancers, in turn, add their skills to their profiles. I therefore based my data collection on skills: first I used the API to get the list of all skills (there is a special function for this), and then, in a loop over all skills, I fetched the list of jobs and freelancers for each one. This method does miss jobs and freelancers that have no skills listed at all, but we will simply assume that such jobs are of no interest, since the client does not know what he wants, and that freelancers who have no skills, or are too lazy to add them to their profile, are unlikely to work well. I posted the source code of the console application that collects the data and stores it in a database on GitHub.
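
    The loop over skills described above can be sketched roughly as follows. This is only an illustration, not the code from the GitHub repository: the function collect_by_skill and the two fetcher callables are hypothetical names, the records are made up, and no real oDesk API calls are shown. The one point it demonstrates is that the same job or freelancer comes back under several skills, so results have to be de-duplicated by id.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def collect_by_skill(
    skills: Iterable[str],
    fetch_jobs: Callable[[str], List[Record]],
    fetch_freelancers: Callable[[str], List[Record]],
) -> tuple:
    """Loop over skills and gather jobs and freelancers, de-duplicated by id."""
    jobs: Dict[str, Record] = {}
    freelancers: Dict[str, Record] = {}
    for skill in skills:
        for job in fetch_jobs(skill):
            # The same job is returned once per skill it lists.
            jobs[str(job["id"])] = job
        for person in fetch_freelancers(skill):
            freelancers[str(person["id"])] = person
    return jobs, freelancers


# Stand-in fetchers with made-up records, used here instead of real API calls.
def fake_jobs(skill: str) -> List[Record]:
    data = {
        "python": [{"id": "j1", "skills": ["python", "django"]}],
        "django": [
            {"id": "j1", "skills": ["python", "django"]},
            {"id": "j2", "skills": ["django"]},
        ],
    }
    return data.get(skill, [])


def fake_freelancers(skill: str) -> List[Record]:
    data = {
        "python": [{"id": "f1"}, {"id": "f2"}],
        "django": [{"id": "f2"}],
    }
    return data.get(skill, [])


jobs, freelancers = collect_by_skill(["python", "django"], fake_jobs, fake_freelancers)
print(len(jobs), "jobs,", len(freelancers), "freelancers")
```

    Passing the fetchers in as arguments keeps the loop independent of any particular API client, so the same function works with real requests or with test data.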

    A few words about the size of the dataset I managed to collect. In total there are about 500,000 freelancers, of whom more than 200,000 have completed at least one job and about 150,000 have worked at least one hour. There were more than 50,000 open jobs at the time of writing, and each job description lists about 5 skills on average.

    Unfortunately, the API did not give me access to financial information, that is, how much clients actually spend and how much freelancers actually earn. Some estimates can still be made from indirect indicators, for example the number of hours worked and the freelancer's hourly rate (although this is not always accurate, since the hourly rate shown in the profile may differ from the rate at which the freelancer actually works).

    About countries


    First, briefly about countries: which countries create the most jobs, which have the most freelancers, and a little about how well those freelancers work and how much they charge.

    The following picture shows the countries with the greatest number of jobs (a more intense color means a larger value, and the labels show absolute values). The top 10 countries, in descending order of the number of jobs, are: USA, Australia, United Kingdom, Canada, India, Philippines, Germany, Israel, Singapore, Pakistan.



    The following picture shows the countries with the most freelancers. The top 10 countries, in descending order of the number of freelancers, are: Philippines, India, USA, Bangladesh, Pakistan, Ukraine, Russia, United Kingdom, Canada, Romania.



    The following picture shows the top 20 countries by total hours worked by freelancers from that country. The first column of the table shows the total number of hours, the second the number of freelancers, the third the hourly rate in US dollars, and the fourth the freelancers' rating, which ranges from 0 to 5 (a higher rating means that clients were more satisfied with the results). Note that for the hourly rate and the rating I computed the median, as a more representative indicator than the mean. I also considered a weighted arithmetic mean with total hours worked as the weights, but I like this idea a little less than the median, because it reflects recent changes poorly. For example, if many new freelancers have appeared over the past six months who work well and have good ratings but have not yet worked many hours, their ratings would barely be taken into account.
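
    The difference between the two aggregates is easy to see on a toy example. The sketch below uses made-up numbers (three long-time freelancers with many hours and average ratings, four newcomers with few hours and high ratings); it only illustrates the argument and does not use the collected data.

```python
from statistics import median

# (rating, hours_worked) per freelancer; the numbers are made up for illustration.
veterans = [(3.5, 4000), (3.6, 3500), (3.4, 5000)]
newcomers = [(4.9, 20), (4.8, 35), (5.0, 10), (4.9, 15)]
freelancers = veterans + newcomers


def hours_weighted_mean(rows):
    """Arithmetic mean of ratings, weighted by hours worked."""
    total_hours = sum(hours for _, hours in rows)
    return sum(rating * hours for rating, hours in rows) / total_hours


median_rating = median(rating for rating, _ in freelancers)
weighted_rating = hours_weighted_mean(freelancers)

print(f"median rating:         {median_rating:.2f}")
print(f"hours-weighted rating: {weighted_rating:.2f}")
```

    With these numbers the median is 4.8, while the hours-weighted mean stays close to 3.5: the newcomers make up more than half of the freelancers but less than 1% of the hours, so the weighted mean almost ignores them.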



    What is more in demand and what is not needed at all


    As described at the beginning, the data was collected based on the skills specified in job descriptions and the skills that freelancers list themselves. Based on this, we will now try a small analysis of supply and demand.

    We will exclude part of the data from further consideration: freelancer skills for which there is no demand. For example, I am good at dancing, but I do not understand how freelancers are going to sell this skill on oDesk. Funnily enough, 420 freelancers listed dancing in their profiles, but, quite predictably, the number of jobs requiring it is zero. Besides dancing, sambo, mambo, and baking are also listed fairly often.

    Skills with demand but no supply look a little more interesting; these are mainly languages: Norwegian (56 jobs), Bengali (16), Latvian (15), and others. For the rest of the clients' requests, in general, there are freelancers with the required skills.
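
    The filtering step described here amounts to dropping every skill whose job count or freelancer count is zero. A minimal sketch, assuming the counts are already aggregated per skill; the dancing and Norwegian rows follow the numbers in the text, while the php and python rows are made-up placeholders:

```python
# skill -> (number of jobs, number of freelancers).
# "dancing" (420 freelancers, 0 jobs) and "norwegian" (56 jobs, no freelancers)
# follow the text; the other two rows are made-up placeholders.
skills = {
    "dancing": (0, 420),
    "norwegian": (56, 0),
    "php": (5000, 60000),
    "python": (1500, 20000),
}

no_demand = sorted(s for s, (jobs, people) in skills.items() if jobs == 0)
no_supply = sorted(s for s, (jobs, people) in skills.items() if people == 0)

# Only skills with both demand and supply are kept for the later ranking.
kept = {s: counts for s, counts in skills.items() if counts[0] > 0 and counts[1] > 0}

print("no demand:", no_demand)
print("no supply:", no_supply)
print("kept:", sorted(kept))
```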

    The following picture visualizes the 100 most in-demand skills found in job descriptions (about 5% of the total number of skills after discarding those with no demand or no supply). The numbers are the absolute number of jobs in which the skill occurs.



    The following picture shows the 100 most common skills in the profiles of freelancers (designations are similar to those used above).



    It is logical that a skill is of interest if demand for it greatly exceeds supply and the demand itself is also significant. That is, the importance of a skill depends on two criteria:
    • 1st criterion: the ratio of demand to supply.
    • 2nd criterion: how often the skill occurs in client requirements. The reasoning is that if there are two skills A and B, where A has 100 jobs (demand) and 1000 freelancers (supply) and B has 200 jobs and 2000 freelancers, then B is still preferable to learn because of the larger variety of jobs to choose from, even though the demand-to-supply ratio is the same for both skills.

    The proposed idea can be expressed by the following formula for assessing skill:

    That is, more interesting skills are skills with a larger value of R.

    Weight coefficient α is introduced to allow the significance to be varied between the 1st and 2nd criteria. If you try to achieve the same average value of both criteria, then the coefficient α can be taken equal to the following value:
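
    As a rough illustration of how such a score could be computed, the sketch below assumes R = D/S + α·D, where D is the number of jobs and S the number of freelancers for a skill, and picks α as the ratio of the mean of D/S to the mean of D, so that both terms have the same average. Both the additive form and this choice of α are assumptions made for illustration; they are not taken from the original formulas. The input is the A and B example from the list of criteria.

```python
from statistics import mean

# skill -> (demand D = number of jobs, supply S = number of freelancers).
# A and B are the two skills from the example in the text.
skills = {"A": (100, 1000), "B": (200, 2000)}


def balancing_alpha(data):
    """Pick alpha so that mean(D / S) equals mean(alpha * D) (assumed rule)."""
    mean_ratio = mean(d / s for d, s in data.values())
    mean_demand = mean(d for d, _ in data.values())
    return mean_ratio / mean_demand


def score(demand, supply, alpha):
    """Assumed score: demand-to-supply ratio plus weighted absolute demand."""
    return demand / supply + alpha * demand


alpha = balancing_alpha(skills)
scores = {name: score(d, s, alpha) for name, (d, s) in skills.items()}

for name, r in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: R = {r:.4f}")
```

    Under these assumptions both skills have the same ratio term (0.1), and B ends up with the higher R only because of its larger absolute demand, which is the behavior the 2nd criterion asks for.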

    A more detailed analysis would need to look at each subcategory separately; for now we will focus on web programming. With the approach described above, the 30 skills with the highest R are as follows (R is plotted along the X axis):



    To be honest, this result confused me a little at first glance, since the listed skills relate to completely different things: some to frameworks, some to rather abstract concepts, 5 out of 30 to knowledge of natural languages (English, Chinese, German, French, and even English grammar), and 3 out of 30 to mobile development for iOS and Android. In addition, oDesk's “Writing” skill includes “Content Writing”, “Technical writing”, “Creative writing”, “Article Writing”, “Blog Writing”, “Business Writing”, and “Editorial Writing”. The “Research” skill is also quite broad and covers “Research” itself as well as “Internet research”, “Market research”, and “SEO Keyword Research”. The “English” skill predictably includes other skills too, including translation from English into other languages.

    To be more specific, I hand-picked the skills that relate to programming languages from the general list and applied the ranking described above to them. The result is as follows (R is again plotted along the X axis):



    It may seem that supply greatly exceeds demand. However, keep in mind that the collected data covers all freelancers (those who worked previously but never deleted their profile, those who currently have work, and those who are actively looking for new projects), whereas only currently open jobs are shown, and a job is usually closed once a freelancer is found. This makes it impossible to say exactly how many freelancers are really looking for work at the moment.

    The picture above again needs a clarification. C# takes only 6th place in this ranking, yet ASP made it into the top 30 (more precisely, this skill mostly means ASP.NET or ASP.NET MVC), and it is based on C# or other .NET Framework languages. At the same time, technologies related to Ruby, PHP, JavaScript, Python, and Java, all of which outrank C#, did not make it into the top 30. On the other hand, the other languages have a greater variety of frameworks, unlike C#, where the choice is small. All of this suggests conducting a separate analysis of the technologies used, for example, to build web applications, or comparing other skills within a specific area.



    And, as a last example, a comparison of CMSs:



    Of course, a general comparison of all skills is also of interest, but for meaningful results it is better to compare skills from the same area, rather than comparing knowledge of Chinese with knowledge of Bootstrap, as happened above (the yellow bar chart with the top 30 skills).

    Conclusion


    Some difficulties arose during the analysis. For example, when searching by skill the API does not perform an exact match but something like a substring search, so a search for C also returns C# or Objective-C, although these are different programming languages used for completely different tasks. I have no desire to check all 2,500 skills manually, so I cannot say for sure how many such inaccuracies there are in the API results.
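
    One possible way to work around the substring behavior without checking skills by hand is to post-filter each search result, keeping a record only if the requested skill appears verbatim in its skill list. The sketch below assumes each returned record carries its own list of skills under a "skills" key; that field name, the helper has_exact_skill, and the sample records are hypothetical.

```python
def has_exact_skill(record, skill):
    """True only if `skill` is one of the record's skills, ignoring case."""
    wanted = skill.strip().lower()
    return any(s.strip().lower() == wanted for s in record["skills"])


# Made-up results of a substring-style search for "C".
search_results = [
    {"id": "j1", "skills": ["C", "Linux"]},
    {"id": "j2", "skills": ["C#", "ASP.NET"]},
    {"id": "j3", "skills": ["Objective-C", "iOS"]},
    {"id": "j4", "skills": ["c", "Embedded"]},
]

exact = [r["id"] for r in search_results if has_exact_skill(r, "C")]
print(exact)
```

    Here only j1 and j4 survive the filter; the C# and Objective-C records are dropped.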

    In addition, the analysis covers only the data as of the time of writing and does not take trends into account, which would make it possible to say, for example, for which programming languages demand or supply on oDesk is growing and for which it is falling.

    If this topic is of interest, I will think about refining the data collection and about how to analyze trends.
