Amazon's crowdsourcing: how half a million people earn pennies training AI

Original authors: Hope Reese and Nick Heath
  • Translation

Internet platforms like Amazon Mechanical Turk allow companies to break work into small tasks and offer them to people from all over the world. Do they democratize work, or exploit the helpless?


Every morning after waking up, Kristy Milland starts up her computer in Toronto, logs in to Amazon Mechanical Turk, and waits for the alert to chime.

Amazon Mechanical Turk (AMT), which has existed for over 10 years, is an online platform where people can do small tasks for money. Milland watches for postings of tasks, which the system calls HITs (Human Intelligence Tasks), and notifications tell her when tasks meet her criteria. “Notifications come once a minute,” says Milland. “I look up from what I'm doing and check whether it's a good HIT before accepting it.”

Sometimes HITs are posted in batches. “If a batch is posted and it's lunchtime, or I have a doctor's appointment, or I have to walk my dog,” says Milland, “I drop everything and complete the task. I am tied to the computer. If this is the only way you can feed your children, you can't walk away.” She has been doing this for 11 years.

Milland is one of more than 500,000 “Turkers”, contract workers performing small tasks on Amazon's digital platform, which they call mTurk. The number of active workers worldwide ranges from 15,000 to 20,000 a month, according to Panos Ipeirotis, a computer science professor at New York University's business school. Turkers work anywhere from a few minutes to 24 hours a day.

Who are the Turkers? According to Ipeirotis, in October 2016 American Turkers were mostly women, while in India they were mostly men. Worldwide, their birth years typically fall between 1980 and 1990. About 75% are Americans, 15-20% are from India, and 10% are from other countries.

Requesters (the people, companies, and organizations that outsource the work) set the price of each task, and the tasks vary widely. Among them are:
• Categorizing data;
• Adding meta tags;
• Character recognition;
• Data entry;
• Collecting email addresses;
• Semantic analysis;
• Placing ads in video.

For example, one of Milland's recent tasks was transcribing the contents of a receipt. The company that ordered this job sells information to marketers and research departments at companies like Johnson & Johnson and P&G. The task paid 3 cents (about 2 rubles).

AMT's early years

Milland calls herself a digital native. “I went through puberty with the internet,” she says. She says that she “always made money on the Internet,” using platforms like eBay to generate additional income. So when she stumbled upon an article about earning money on Amazon Mechanical Turk after its launch in 2005, it seemed like an ideal proposition.

Infographic (in English) on the development of this type of work

In the early years, Milland treated it more as an experiment than a real job. But during the crisis of 2008-2009 everything changed. Milland, who ran a daycare, had to move, and she lost that income. At the same time, her husband lost his job. She began working on AMT full-time. For her, that meant 17 hours a day, every day. “We started to treat it as a job. And we started to demand from it what you would demand from a job.”

Rochelle LaPlante, from Los Angeles, has been working on AMT full-time since 2012. LaPlante agrees with Milland that the work is unpredictable. “You never know when a task will be posted. It could be 3 am. And at 9 am there may be nothing to do at all.”

“I'm not as stubborn as some,” says LaPlante, “because I value sleep.” Others, she said, set up alerts. “If a requester posts a task at 3 am, the computer dings, the phone dings, and they get up and do the work. They are bound to that schedule.”

Neither Milland nor LaPlante has an “ordinary” working day. Usually they set themselves a target for the amount of money earned. On a normal day, LaPlante may work 8 hours. “But it's 10 minutes here, 20 minutes there, and it all adds up,” she says.

And how much does the average Turker earn? It is hard to say. Adrien Jabbour, from India, says that “it can be considered an achievement if you earn $700 in two months of work, 4-5 hours a day.” Milland says that she recently earned $25 in 8 hours, and for her that was “not bad.” According to the Pew Research Center, slightly more than half of Turkers earn less than the US federal minimum wage of $7.25 per hour.
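To put the quoted figures on a common scale, here is a minimal sketch that converts them into hourly rates. The number of working days in "two months" is not given in the article; the 60 days used below (working every day) is an assumption made only for illustration.

```python
# Effective hourly rates implied by the figures quoted above.
US_FEDERAL_MINIMUM_WAGE = 7.25  # dollars per hour, as cited in the text


def hourly_rate(earnings: float, hours: float) -> float:
    """Return dollars earned per hour worked."""
    return earnings / hours


# Milland: $25 for 8 hours of work.
milland_rate = hourly_rate(25, 8)

# Jabbour: $700 over two months at 4-5 hours a day.
# ASSUMPTION: 60 days, working every day; the article does not say.
ASSUMED_DAYS = 60
jabbour_low = hourly_rate(700, ASSUMED_DAYS * 5)   # 5 hours a day
jabbour_high = hourly_rate(700, ASSUMED_DAYS * 4)  # 4 hours a day

print(f"Milland: ${milland_rate:.3f}/hour")
print(f"Jabbour: ${jabbour_low:.2f} to ${jabbour_high:.2f}/hour")
print(f"Milland's share of the minimum wage: "
      f"{milland_rate / US_FEDERAL_MINIMUM_WAGE:.0%}")
```

Under these assumptions, both workers come out well below the $7.25 minimum wage, at less than half of it.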

LaPlante talks about the difficult decisions she has had to make when choosing between work and life. “You have to decide whether to do the work or go to a family dinner. For people who live on this money, on the verge of eviction, such decisions are very difficult.”

Advanced Turkers

The sad reality for people working on AMT is that not all Turkers are created equal. Amazon's system designates some workers as “Masters” [Master's Level]. When a new requester posts a HIT, the system automatically looks for Turkers of this level. It costs the requester more and brings the workers more money. If you do not have this level, you get less work. On one weekday in March, says Milland, the system had 4,911 tasks. She was eligible for 393 of them, only 8%.

But how do you reach the Masters level? Nobody knows. Milland has seen unqualified people (with few completed tasks, low ratings, or fake or suspended accounts) receive it. “There is no system to it,” she says.

Amazon does not publish its criteria for reaching this level. Turker forums offer different theories about how to get it. Sometimes a set of tasks is published, and those who complete them successfully receive the level. “You have to be in the right place at the right time,” says Milland.

In addition to the Masters level, there are regional restrictions. If you are not in the USA, you are at a disadvantage: many requesters restrict their tasks to workers in that region.


“No two Turkers are alike,” says LaPlante. “Some live on this, others earn pocket money.”

William Little, from Ontario, is a moderator of TurkerNation, an online community of Turkers. He gets additional income from AMT. He aims to earn $15 a day for three hours of work. “In most cases, this is achievable,” he says, “and it is better than earnings at the start of a career.” But getting paid is the main difficulty for many Turkers.

Currently, only Turkers from the USA and India are paid in cash. Others, including Milland and Little, receive Amazon gift cards.

Little drives 45 minutes to the US border, where he can have Amazon items shipped for free, and collects his purchases. There are workarounds for those who want to be paid in cash, but they usually cut into income. Some websites can convert gift cards into, for example, bitcoins.

“You post your wish list on the site. I see it and decide to buy some item. I order it and have it shipped to you,” says Little. “The bitcoins are held in escrow. When you receive the purchase, I get the bitcoins.” Then Little can sell them, receive the money through PayPal, and transfer it to his bank. “I pay for transfers twice, and it's not worth it.”

Another problem is unpaid labor. Your work can be rejected without explanation. In addition, Turkers spend time vetting tasks and checking requesters' reputations: they download scripts, add tools, and check statistics.

Computer slaves

Milland and LaPlante are part of an invisible online workforce, one that is increasingly in demand for training smart machines. Smart systems are gradually entering everyday life, and society is using AI more and more. Today's narrow forms of AI run everything from voice assistants such as Amazon's Alexa and Microsoft's Cortana to the computer vision systems underlying Autopilot in Tesla cars.

These systems are taught to perform tasks that have historically been too difficult for computers, ranging from understanding spoken commands to recognizing pedestrians on the road.

Often, a large number of labeled examples is used to teach AI systems to solve these complex problems. The systems are fed huge amounts of data, annotated in advance with the features that matter for the problem. For example, these may be photographs marked to show whether they contain dogs, or sentences annotated to show whether the word “key” refers to a lock or has another meaning. Teaching machines from labeled examples is called supervised learning, and the labeling is usually done by Turkers and other online workers.
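As a toy illustration of learning from labeled examples, the sketch below trains a nearest-centroid classifier on four hand-labeled points. The two-number feature vectors and the labels are invented for the example; real systems work on raw images or text and use far larger models, so this only shows the role the human-supplied labels play.

```python
from collections import defaultdict
from math import dist

# Hypothetical labeled examples: (feature vector, label supplied by a human).
labeled_examples = [
    ((0.9, 0.8), "dog"),
    ((0.8, 0.9), "dog"),
    ((0.1, 0.2), "no dog"),
    ((0.2, 0.1), "no dog"),
]


def train(examples):
    """Compute the mean feature vector (centroid) for each label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid lies closest to the features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = train(labeled_examples)
print(predict(centroids, (0.85, 0.7)))  # an unlabeled example near the "dog" points
```

Without the labels in `labeled_examples`, `train` would have nothing to group by, which is why the labeling work matters.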

Such training requires huge amounts of data; some systems need millions of examples to work well. These data sets are large and constantly growing. Google recently announced the Open Images Dataset, with 9 million images, and the YouTube-8M repository contains 8 million tagged videos. ImageNet, one of the earliest data sets of this type, contains more than 14 million images divided into categories. It was built over two years by 50,000 people, most of whom were hired through AMT. They checked, sorted, and tagged nearly a billion candidate images.

Because of the scale of these data sets, even when the work is spread across many workers, each of them has to repeat the same action hundreds of times. It is menial work, and extremely tiring mentally.

In addition to labeling, Turkers and other workers often clean up the data sets used for machine learning: removing duplicates, filling in gaps, and so on.
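The two clean-up steps named above, removing duplicates and filling in gaps, can be sketched as follows. The record layout and the `"unlabeled"` placeholder are assumptions for illustration; on a crowdsourcing platform the judgment calls (is this really a duplicate, what should the missing value be) are what the human worker supplies.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"id": 1, "caption": "a dog on a beach", "label": "dog"},
    {"id": 1, "caption": "a dog on a beach", "label": "dog"},  # duplicate
    {"id": 2, "caption": "an empty street", "label": None},    # gap
]


def clean(rows, default_label="unlabeled"):
    """Drop rows whose id was already seen and fill missing labels."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["id"] in seen:
            continue  # skip the duplicate
        seen.add(row["id"])
        filled = dict(row)  # copy so the input is left untouched
        if filled["label"] is None:
            filled["label"] = default_label
        cleaned.append(filled)
    return cleaned


cleaned_records = clean(records)
print(cleaned_records)
```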

With the spread of AI, every technology firm involves people in such machine learning microtasks. Amazon, Apple, Facebook, Google, IBM, and Microsoft (all the largest technology companies) either have their own crowdsourcing platform or outsource these tasks to outside companies. Of these, the largest are Amazon Mechanical Turk and CrowdFlower.

Internal micro-work platforms, such as Microsoft's Universal Human Relevance System (UHRS) or Google's EWOK, are used quite extensively. About five years ago, after the launch of UHRS, the platform was reported to be in use in the Bing search engine and various other Microsoft projects, processing 7.5 million tasks per month.

According to Mary Gray, a senior researcher at Microsoft, UHRS is very similar to Amazon Mechanical Turk. Gray says that the company uses UHRS to recruit workers in regions where “Amazon Mechanical Turk does not have enough of a presence,” or for sensitive, confidential assignments.

“Every company interested in automating services turns to some analogue of AMT-type platforms. And in fact, many of them use AMT directly,” she says.

Chris Bishop, director of Microsoft's research laboratory in Cambridge, says that UHRS allows the company to be “slightly more flexible” than external platforms such as AMT. He says that the firm uses AI to automatically identify workers' strengths and weaknesses, such as their relative level of expertise, which helps the company weight the results of their work differently.
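One simple way to give workers' answers different weights is a weighted vote. The sketch below is only an illustration of that general idea, not a description of how UHRS works; the worker names, the weights, and the `weighted_vote` helper are all hypothetical.

```python
from collections import defaultdict


def weighted_vote(answers, weights, default_weight=1.0):
    """Pick the label with the largest total weight.

    answers: mapping of worker id -> label that worker chose.
    weights: mapping of worker id -> estimated reliability (hypothetical).
    Workers missing from `weights` count with `default_weight`.
    """
    totals = defaultdict(float)
    for worker, label in answers.items():
        totals[label] += weights.get(worker, default_weight)
    return max(totals, key=totals.get)


# Three workers label the same image; two of them disagree with the expert.
answers = {"worker_a": "dog", "worker_b": "cat", "worker_c": "cat"}
weights = {"worker_a": 0.9, "worker_b": 0.3, "worker_c": 0.4}

print(weighted_vote(answers, weights))  # weighted: 0.9 for "dog" vs 0.7 for "cat"
print(weighted_vote(answers, {}))       # unweighted majority vote
```

With the weights above, one trusted worker outweighs two less reliable ones; with equal weights, the same function reduces to a plain majority vote.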

In addition to assisting AI training, platforms such as AMT are used by well-known brands such as eBay and Autodesk to offload repetitive, routine work, which for many years has made up the majority of all assignments on AMT.

This monotonous, unskilled work includes many tasks: reviewing pictures and other user-generated content (which sometimes means unpleasant experiences), marketing and scientific research, deleting duplicate entries, and checking the product descriptions and images of online stores. Amazon created AMT for its own use: managing its catalog, categorizing images and products, writing descriptions, extracting names from email, translating text, transcribing text from audio and images, correcting spelling, verifying geographic locations, giving feedback on web design, reviewing products, selecting representative frames from videos, and telling advertisers which part of an ad caught your attention.

How did we get here?

There is nothing new in the idea of people helping machines with tasks that would otherwise be beyond them. Although the recent rise of AI has hugely increased the demand for data categorization, such microtasks, according to Gray, already existed about 20 years ago, when the work was tied to attempts to improve spell checking in word processors such as Microsoft Word. More broadly, click work and microtasks were already around during the rise of online stores in the dot-com bubble of the late 1990s and early 2000s.

In 2001, Amazon, searching for new ways to organize products efficiently in its fast-growing store and to solve warehouse problems that computers could not handle, patented a hybrid machine/human system. Four years later, Amazon achieved its goal of building a digital platform for accessing a large number of online workers by launching Amazon Mechanical Turk.

The approximate number of active participants in the AMT project, from July 2015 to October 2016

The idea of having access to “artificial artificial intelligence,” as Amazon described its project, appealed to a wide range of companies; all of them, from online stores to porn sites, were looking for ways to sort their products inexpensively.

In 2015, an average of 1,278 requesters posted tasks on AMT. And although the amount of work done by these tireless workers is growing, especially on sites such as CrowdFlower, its exact volume is unknown, since a large part of it goes unrecorded or is reissued to many workers.

And although, according to Amazon's site, 500,000 people have signed up to work on AMT, the numbers do not show how people use the crowdsourcing platform: as a full-time job or as a side job.

A World Bank report, The Global Opportunity in Online Outsourcing, estimates that the two largest microtask platforms, Amazon Mechanical Turk and CrowdFlower, had combined revenues of about $120 million in 2013. Vili Lehdonvirta, associate professor and principal scientist at the Oxford Internet Institute, suggests that this is 5 to 10% of the global online labor market, but points to the difficulty of obtaining real employment numbers for non-English platforms.

The other cost of click work

The monotony of such work can have bad consequences for the people performing it. It can seriously damage the physical and mental health of some of them.

“I wake up and ignore everything else,” says Milland. “My family cooks food for me and leaves it so that I can eat while I work. I eat at the computer; I do not see my family. If my daughter needs help with her homework, she has to ask her father. It got to the point that a ganglion cyst developed in my wrist. I constantly had strained ligaments in my hand. I was lucky that I worked like this at a time when my husband was at home, when he had no work. If the family heard the alert from my computer that meant high-paying work, they said to me, ‘Go, go, go!’”

Manish Bhatia, a Turker from South India, was a volunteer moderator on the MTurk Forum for almost two years, and now moderates two forums. The strangest thing he was asked to do was to take a picture of himself lying in a bath surrounded by rose petals. “It was very strange,” he says. He also complains that he sometimes has to look at unpleasant pictures. “Nothing is known in advance,” he says. “You can back out of the work later.” But in that case you will not be paid, and the time is lost.

Milland has complained about this experience too. “People say to me, ‘Wow, you work from home? You're lucky!’” she says. “You can't tell them that today I tagged pictures, and all of them were related to ISIS. There was, for example, a basket of severed heads. And I saw that quite recently. I had to tag a video of a burning man. They paid 10 cents per photo.”

Milland is not the only one who has to tag graphic or unpleasant images. “Yesterday, in a set of YouTube videos,” says LaPlante, “there were a lot of beheadings. At the bottom there is a checkbox labeled ‘inappropriate content,’ and you click ‘Submit.’” This work may be important for keeping unpleasant content off the web, but it can also harm the people doing it. The pay does not always match the value of the work to YouTube or its users.

Little says he often had to tag pornographic videos or photos. “I only made an exception when I came across child pornography,” says Little. “I reported it to the requester and to Amazon as well.” But of gore and bullying, Little says that this is “part of the job.”

Once a task is complete, there is no way to know what happens with the result. “I wonder if anyone will look at this. I hope that it gets reported and deleted,” says LaPlante. “Someone stumbled upon child pornography and ticked the box, but will anyone check and investigate it? We do not know.”

Requesters work under pseudonyms, and no one knows who ordered the work. LaPlante calls it the “Wild West.” And while requesters rate Turkers, Turkers have no way to rate requesters.

“You tag faces in a crowd, but maybe someone is preparing something with malicious intent,” she says. “You don't know what you are working on; there is no information.”

“This is called vicarious trauma,” says John Suler, a professor of psychology at Rider University who specializes in behavior in cyberspace. “The same thing happens to people who see terrible images firsthand. They get hurt.” According to him, we do not always understand the psychological consequences. “Our conscious mind becomes desensitized,” says Suler. “But the subconscious does not; it absorbs everything. We underestimate how everything we see online affects our subconscious.”

People doing this work find online forums where they can connect, share stories, express sympathy, and support each other. “There are a lot of questions about payment and content moderation,” says Milland. “And in such places you can find social support.”

Each of these communities has its own character. The MTurk Forum is like chatting at the office water cooler. By contrast, according to Milland, Mturkgrind “seems to be more focused on productivity and efficiency.” TurkerNation “concentrates on answering questions and helping newcomers understand the system.”

There is also a closed Facebook group called Mturk Members, which already has 4,436 members. They ask questions, celebrate their earnings, and support each other.

LaPlante and three other women created the MTurk Crowd forum to help Turkers find the right resources and do the best work. There are many other forums, subreddits, and other online platforms.

There is also a site for workers. It was there that the “Dear Jeff Bezos” campaign started. The campaign tried to humanize Turkers and to give a voice to the people who actively take part in the life of the platform. They shared their experiences and expressed concerns about the nature of their work.

But it changed little. Although workers in India gained the ability to receive bank transfers, neither Amazon nor Jeff Bezos directly responded to the campaign.

Communicating with Amazon at all is almost impossible. “The lack of support is annoying,” says Bhatia. “There is no chat, no phone. The only way to reach them is by email, and you get canned replies in response.”

“I'm quite taken aback by the decisions they make,” says Little, “and the first of these is the lack of communication. Why don't they want to do it? It can hardly be because of the risk of lawsuits, because their terms of service explicitly prohibit such things.”

Milland has talked to lawyers, but “none of them will ever back a worker in a fight against Amazon,” she says. Amazon refuses to communicate. “Not about rejections, not about improvements, not about our suggestions for increasing their profits; not at all.”

Lilly Irani, who teaches at the University of California, San Diego, studies the “cultural politics of high-tech work.” Irani took part in a 2013 study in which researchers examined Turker forums in detail. The study was an attempt to understand how collective action can work, for example in projects such as Dynamo, a collective platform for Turkers, and Turkopticon, which allows Turkers to write reviews of assignments and rate them. In the paper “Turkopticon: Interrupting Worker Invisibility in Amazon Mechanical Turk,” the authors noted: “We argue that AMT builds its infrastructure on top of its workers and hides their labor, turning them into a computing resource for technologists.”

Despite the poor working conditions, Milland and others rely on the income from AMT. Milland's health does not allow her to count on traditional work. “I tried to get a job at McDonald's, and they wouldn't hire me,” she says.

People working together with AI

Microsoft's Gray believes that on-demand work of this kind will gradually evolve into human-plus-AI systems, in which people and machines work in symbiosis.

Kristy Milland's table of tasks and payments. Submitted: assignments that are completed but not yet approved. Approved: paid. Rejected: the work was completed, but the requester did not accept the results or did not pay for them.

She points to the emergence of virtual assistants such as Facebook M and support chatbots such as IPsoft's Amelia, where people handle requests with the help of AI, or AI handles requests and a person takes over in cases the machine cannot handle. Over time, such systems learn from people's responses and gradually widen the range of requests they can process.

More and more services use specialized AI for simple tasks and people for more complex ones. One of the main crowdsourcing hubs, CrowdFlower, recently launched a machine learning platform designed to perform tasks that people would previously have done. People are meant to “concentrate on more complex cases and help the ML models learn.” This approach automates a mountain of manual work, but optimistic forecasts say that although the share of work done by people will fall, the total number of jobs will not, because demand for human-plus-AI systems will grow.

How long will machines still need people?

But how long will people still be needed to train smart systems? AI already copes with many tasks that were previously performed by people.

In 2006, a year after the launch of AMT, Amazon CEO Jeff Bezos said that it took a human to tell whether a photo contained a person. Today that problem can be solved by deep learning systems, the neural networks running at companies like Baidu, Facebook, Google, and Microsoft. Does this mean that the microtasks that give people work today will be done by machines tomorrow?

Lehdonvirta does not believe that the demand for AI-related microtasks will dry up. He predicts that the more tasks machine learning can solve, the more data people will need to process. “This is a moving target. There are so many kinds of tasks that I do not think this work will end soon,” he says.

Bishop believes that in the near future AI will be trained in a hybrid way: both through supervised learning guided by people and through unsupervised learning. Gray believes that people's participation will be needed for a very long time: “Moreover, the need for people will grow, because the number of tasks to be automated will grow,” she says. “If we take early natural language processing or pattern recognition as examples, it becomes clear that there is more than enough work in the system.”

Dr. Sarvapali Ramchurn, an associate professor in the Department of Electronics and Computer Science at the University of Southampton, uses image recognition to illustrate how much work people still need to do. “We are nowhere near the limits. Labeling images still requires human involvement in whatever domain the images were collected in.”

Photos can be taken in such a huge variety of conditions (in light, in shadow, partially obscured) that “even after classifying 50 million images, only a small fraction of the objects captured in them will be accurately classified in all possible contexts,” he says. He adds that if we widen the scope to speech recognition, natural language understanding, emotion recognition, and the many other areas in which AI is used, it becomes clear that the flow of work is not going to run out. In addition, society is constantly discovering new applications for AI. “Demand is likely only to grow, and we will see more systems combining the work of people and AI in new ways to solve real problems.”

Work as a service

Whether people will be needed to train AI in the future or not, the growing popularity of AMT-type platforms reflects the ongoing shift in the labor market.

Gray believes that just as faster global communications allowed companies to outsource more and more business tasks, crowdsourcing platforms, together with the abundance of people with broadband Internet access, will change the labor market. “We were able to break a full day's work into parts so that different people in different time zones and places could perform it around the clock,” she says. “It is not that we simplified the work or lowered the qualifications it requires; rather, it was divided into modules that different groups of people can take on.” Gray believes that in the future this way of working, switching between microtasks, will become the norm.

As online platforms get better at quickly connecting customers with workers who have the experience a task needs, micro-work practices will spread and grow, she says. “We are witnessing an industry of work that is assigned, scheduled, managed, paid, and delivered through an API,” says Gray. “All of this is growing explosively right under our noses.”

Lehdonvirta shares Gray's vision in the sense that computer systems will increasingly manage the distribution of labor. “Things like organizing work by computer and using dedicated platforms to regulate working relationships are gaining popularity,” he says.

With the growing number of people connected to the Internet and the popularity of crowdsourcing platforms, governments need to start paying attention to how this affects people's lives, Gray says. “We have yet to understand how this approach will change the way most people work,” she says. “This process has been going on for 30 years. We did not pay attention to it, because it did not affect people in power and their children.”
