Practical use of neural networks

Many people surely remember the fourth episode of the fourth season of Silicon Valley, released last year, in which Jian-Yang wrote the Not Hotdog application.

As it turned out, this was a real application that HBO made specifically for the series, and Habr wrote about it.

Here we will tell you how we made a bot that recognizes not only hot dogs but many other objects as well, and that can also estimate the gender and age of people from a photo.


We did not set out to work with neural networks. We just wanted to build a project that would increase the productivity of merchandisers in retail chains.

A merchandiser's duties include, in particular, checking that goods are available on the shelf. To do this, they must visit outlets almost every day and report the presence or absence of goods to the supervisor.

As a rule, each merchandiser is assigned several shops, and every day they go out into the field to visit the outlets assigned to them.

Merchandisers are usually required to photograph their shelf and send the pictures to the supervisor as proof that they were actually in the store.

In practice, merchandisers sit at the bottom of the sales staff hierarchy and are paid very little, so they do not always work in good faith. Sometimes they do not go out into the field at all and send old photos to their supervisors instead. They get fired, move on to other chains, and the process repeats. Staff turnover in this position is always high, and merchandisers are constantly being recruited.

Managers try everything they can to tighten control over merchandisers: they introduce clever applications that tag photos with geo-coordinates, block the sending of old photos, and so on.

Mystery shoppers are also hired to check on merchandisers. They have to take pictures of the shelf in a store, of display materials, and so on. There are even companies that recruit such mystery shoppers among university and school students and sell these services to retailers. But then the question arises of who will check the mystery shoppers; anything that depends on a person's conscientiousness needs constant monitoring. And merchandisers still find ways around the controls. In short, it is the old contest between the sword and the shield.

So the idea arose to remove the human factor entirely. Our solution provides visual control of the product display and of on-shelf availability without involving merchandisers at all, and it does so 24/7.

Our messenger has video surveillance functionality: you can place an inexpensive smartphone on site and give viewing access to all interested parties, such as the merchandiser, the supervisor, and the manager. At any time they can see in real time what is happening on the shelf, so the merchandiser always has up-to-date information on whether they need to visit the store or not.

The supervisor can also monitor the merchandiser's work at any time, and a senior manager, for example in a federal chain with many regional representatives, can see what is happening with the product in any city at any time.

A reasonable question arises: why not use inexpensive video cameras for this task?

The answer is that video surveillance with a smartphone is simple to install, and the messenger is easy to use.

In most cases, an inexpensive video camera has only a Wi-Fi connection, so you need to get Wi-Fi from somewhere, which most likely means a router with a 3G/4G modem. That is already two devices. In addition, a smartphone has its own battery, so a power outage does not immediately interrupt it.

A router has to be configured by more or less qualified personnel, whereas on a phone the video surveillance mode is very simple to turn on and almost any user can do it.

Also, viewing a large number of cameras requires special software, and you need to think about access and hand out logins and passwords. In the messenger, viewing access is organized very simply: the list of cameras a user is allowed to view appears for that user, and that's it.

A smartphone is also cheap, from $25-30 at retail. There are many types of mounts for smartphones, and there are small models that can be placed, for example, inside shelf lighting.

An $8 billion problem

As we dug into the topic, it turned out that the problem of goods being available on store shelves (OSA, On Shelf Availability) is a worldwide one: because the right products are missing from the shelves, the industry loses up to $8 billion a year globally.

Many startups solve this problem with neural networks: the merchandiser photographs the shelf while visiting the store and sends the photo to the cloud, where a neural network compares it with the planogram and returns the result as a set of prompts showing which positions are correct, which products are missing from the shelf, and so on.
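The comparison step in such services can be pictured roughly as follows. This is only a minimal sketch, not how any of these startups or IBM Watson actually implement it: the function name compare_with_planogram, the position keys, and the product names are all hypothetical, and it assumes the recognition step has already produced a product label for each shelf position.

```python
def compare_with_planogram(planogram, detected):
    """Compare detected shelf contents with the planogram.

    Both arguments map a shelf position (a hypothetical key such as "A1")
    to a product name. A position that is absent from `detected`, or
    mapped to None, is treated as an empty slot.
    """
    report = {"ok": [], "missing": [], "misplaced": []}
    for position, expected in planogram.items():
        found = detected.get(position)
        if found == expected:
            report["ok"].append(position)
        elif found is None:
            report["missing"].append((position, expected))
        else:
            report["misplaced"].append((position, expected, found))
    return report


# Hypothetical example data, for illustration only.
planogram = {"A1": "cola", "A2": "lemonade", "A3": "water"}
detected = {"A1": "cola", "A2": "water"}
report = compare_with_planogram(planogram, detected)
```

Here position A1 is correct, A2 holds the wrong product, and A3 is empty, which is the kind of prompt a merchandiser would receive.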

But the human factor is present here too: the employee came in the morning, took a photo, arranged the goods according to the rules, and left. Then, literally five minutes later, a busload of customers may arrive and undo everything, while the supervisor still thinks all is well.

Therefore, in our opinion, it is better to run the analysis several times a day. Moreover, such analytics can help identify patterns in the sales of particular goods.

To implement this idea, we decided to take several photos during the day and periodically send them to the cloud for recognition.
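One simple way to spread the shots over the day is to space them evenly between opening and closing time. The sketch below is an assumption for illustration, not the schedule our application actually uses: the function name capture_times, the opening hours, and the number of shots are all hypothetical.

```python
from datetime import datetime


def capture_times(open_at, close_at, shots):
    """Return `shots` capture times spread evenly from open_at to close_at."""
    if shots < 1:
        raise ValueError("shots must be at least 1")
    if shots == 1:
        return [open_at]
    # Dividing a timedelta by an int gives the interval between shots.
    step = (close_at - open_at) / (shots - 1)
    return [open_at + i * step for i in range(shots)]


# Hypothetical store hours (09:00 to 21:00) and five shots per day.
times = capture_times(
    datetime(2018, 1, 1, 9, 0),
    datetime(2018, 1, 1, 21, 0),
    5,
)
```

With these assumed values the photos are taken every three hours, including one at opening and one at closing.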

But we had no experience with neural networks, and building our own engine and then training it seemed rather difficult.

So we decided to take a ready-made solution. To some, this approach may seem wrong, since you have to pay for image processing in the cloud.

But there are counterarguments: building your own engine is expensive and slow, and the neural network has to be trained, which is also a laborious process.

In addition, a ready-made solution lets you roll out a finished product quickly instead of building the engine yourself, stepping on every rake along the way and learning from your own mistakes. And we did not want to become neural network specialists; for us they are just a tool for solving specific problems.

There are also already many platforms on the market that can be used, such as Amazon Rekognition and Google API. As these platforms evolve and competition between them grows, we expect prices to only fall.

In the end, we decided to use IBM Watson with its Visual Recognition engine.

Visual Recognition Bot

A by-product of the project to control the display of goods on the shelves was a bot, which we called Visual Recognition.

The bot can identify all sorts of objects in an uploaded or freshly taken photo, and it can also estimate the gender and age of people in a photo.

We also hosted the bot itself and its logic on IBM Watson, so it uses the same Visual Recognition engine with a more or less trained neural network.

On the Bluemix platform, the bot looks like this:

How to use the bot

Download M1 Messenger for Android or for iOS.

After registering, find Visual Recognition Bot in the search:

Add the bot:

The bot will create a chat in the Business tab:

Now you can upload photos to it:

Click Processing and get the result:

So it recognizes a hot dog. Let's try a hamburger:

There is a fairly popular application, Vivino, in which the user can photograph the label on a bottle of any wine and get all its characteristics, rating, prices, and so on:

With Visual Recognition, a bot that does much the same for beer, vodka, and so on can be built quite simply, since IBM Watson has a training module.

And here is age recognition from a photo:

In addition to gender and age, the neural network also tries to guess a person's profession from their clothing:

Determination of age, gender and coordinates in the photo:
