Take it on and do it: why it is sometimes useful to skip the analysis and just develop

    We have been developing Macroscop for almost ten years. Over that time, a very thorough and serious approach to creating new features has taken hold in the development of intelligent modules. On the one hand, this is very good: seriousness of intent goes hand in hand with high product quality. On the other hand, thoroughness can border on slowness and an inability to respond quickly.

    Just a couple of years ago, when users asked us to develop something new (not included in the master plan for product development), we spent a long time forecasting timelines and evaluating how universal and relevant the feature would be for a wide range of users. Often we either declined or estimated a very long implementation time. But one day we received a request from a large project. If we could implement the feature the user was missing successfully and quickly, the prospects and scale of the Macroscop deployment were very good. So we decided to try! We had a tight time frame, a responsive and helpful user, and complete freedom of action.



    And ... it all worked out!

    We built the new feature in a short time, and it turned out to be both accurate and fast. Everyone was satisfied: the user received the coveted intelligent module, the developers gained great experience, and the company gained sales.
    This experience marked the beginning of a new approach to developing intelligent functions in Macroscop: we have become more willing to meet our users halfway, and we do it more often. And it is paying off.

    The most important thing is to identify the user's real need and, together with him, formulate the task clearly. When it comes to rapid development of functions "to order" (from here on, we will call them quick functions), the requirements have to be agreed on up front: what the user wants to see in the end and where to place the emphasis. Roughly speaking, some users need the function to work first and foremost, while others need it to look impressive. When we take on a quick function, we do not promise that the result will work in any conditions with 100% accuracy. We undertake to test the idea itself and to create, as a start, something that works adequately and is acceptable for use at one particular site. Then, if that succeeds, we refine it and bring it up to a universal product with good performance.

    Once the task and the priorities are clear, we start development. In a short time we build a prototype that the user can already evaluate, and we hand it over for testing. If what we have built matches what the user needs, he likes it overall, and the method we used has not yet exhausted itself and still has room to improve the function, we go further. If the result is quite different from what was needed, we close the project. Since this happens at an early stage, we lose almost nothing.

    With this approach, developers and users should meet each other halfway as much as possible. The user also has to be involved in the process: he needs to test the prototype thoroughly on different cameras and in different conditions, try different settings and press different buttons, and give exhaustive feedback: what is convenient, what is not, what does not work, how he assesses the accuracy, how heavily the server is loaded, and so on.

    Initially we satisfy the need of one particular customer, but even before starting work we estimate how universal the function can become in the future and how many people it can help. Later we adapt the quick function so that it is useful and applicable in as many video systems as possible.

    Do you remember how it all began?.. (c)


    Our first quick function was the module for counting people in a queue. Strictly speaking, we already had such a module, but its conditions of use were limited: it worked only in one projection, when the camera looked straight down from above. Then a user approached us who needed to count people in a queue under fundamentally different conditions: with the camera looking at the queue at an angle (from the front and slightly above).


    From this angle, the Macroscop module could already count,


    and from this angle, it learned to.

    The user liked everything about Macroscop except that this one coveted function was missing. The project was very promising, and the user was ready to cooperate with us in every way, if only such a module appeared and the software could be installed at the site. We decided not to miss the opportunity and began development.

    In the previous version of the module, the task of counting people was solved by classical computer vision methods, which imposed serious restrictions on the conditions of use. For the new task, the module had to learn to count people in fundamentally different, much more difficult conditions.

    The intelligent functions development group split into 3 subgroups, and each began to try its own method. All of the methods were based on neural networks.
    The first group tried to move the queue-counting module onto the infrastructure of the no-helmet detector we had developed (see the article on how we tried to apply modern neural network technologies to find helmets on people's heads). This approach seemed very logical: at a certain stage of its work, helmet detection solves a similar problem.

    The second group tried to apply a regression neural network. It counts the number of people in the image but does not pick out specific objects, which makes it difficult to control. To train a regression neural network, you feed it a picture together with the number of people present in it, and the network returns a single number: how many people it has found. By adding new images to the sample, we tried to teach it to count correctly.

    Unfortunately, we had to discard both methods, because the counters built on them had low accuracy.

    The third group tested a fairly well-known general-purpose detector that can detect various objects in real time. It can search for a thousand different kinds of objects, but out of the box it could not solve our problem with all its specifics. We refined this detector, trained it on an extensive sample of our own, and got quite a good result: it counted people with acceptable accuracy. We improved it further with new samples and eventually had a prototype that we were no longer embarrassed to hand to the user for testing. And his rating was ... positive! He said that overall the solution was already competitive, but the accuracy was not yet high: only 60-70%.
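
    To illustrate how a count can be derived from a general-purpose detector's output, here is a minimal sketch in Python. It is not the Macroscop implementation: the article does not name the detector or describe its output format, so the `Detection` class, the `count_people_in_zone` function, the rectangular queue zone, and the 0.5 confidence threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Hypothetical box/zone format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Detection:
    """One object reported by a detector (hypothetical structure)."""
    label: str         # class name, e.g. "person"
    confidence: float  # detector score in [0, 1]
    box: Box           # bounding box of the object


def count_people_in_zone(
    detections: Iterable[Detection],
    zone: Box,
    min_confidence: float = 0.5,  # assumed threshold, tune per site
) -> int:
    """Count confident "person" detections whose box center lies in the zone."""
    zx0, zy0, zx1, zy1 = zone
    count = 0
    for det in detections:
        # Skip other object classes and low-confidence detections.
        if det.label != "person" or det.confidence < min_confidence:
            continue
        x0, y0, x1, y1 = det.box
        center_x = (x0 + x1) / 2
        center_y = (y0 + y1) / 2
        # Keep only people standing inside the queue zone.
        if zx0 <= center_x <= zx1 and zy0 <= center_y <= zy1:
            count += 1
    return count


if __name__ == "__main__":
    frame_detections = [
        Detection("person", 0.9, (10, 10, 30, 60)),
        Detection("person", 0.3, (40, 10, 60, 60)),  # too uncertain
        Detection("dog", 0.95, (20, 20, 40, 40)),    # not a person
    ]
    print(count_people_in_zone(frame_detections, zone=(0, 0, 100, 100)))
```

    Unlike a regression network that returns a single number, a detector-based counter like this exposes each counted object, so it is easier to see why a particular count came out the way it did.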

    The first version of the queue people counter was built mainly on video clips from this user. We were solving the task of making it work for him specifically, but we understood that if we trained a neural network and built a module for one specific project, no further scaling would be possible. So we continued training on a more universal sample, which raised accuracy even without major internal changes. Then we started on the packaging of the module: we improved the interface, added various settings, and worked on usability and logic. In parallel, we fixed a number of bugs in the prototype (incidentally, fixing one of them unexpectedly made the module 7 times faster), figured out how to reduce processor consumption, and enabled running on the video card. As a result, we got an objectively well-working, easy-to-manage module that analyzed quickly, gave accurate results, and could run on the video card without loading the processor.
    Our user was simply happy! He rolled out the new version in his stores and confirmed that everything works fine in practice. We managed to achieve 85-90% accuracy (for situations where people in the queue do not completely overlap each other and can be told apart).
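
    The article does not say how the accuracy figures were measured. As one possible way to put a number on a counter's quality, the sketch below averages a per-frame score of 1 minus the relative counting error. The function name `counting_accuracy` and the metric itself are assumptions for illustration, not the method used for the figures above.

```python
from typing import Sequence


def counting_accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Mean per-frame accuracy: 1 - |predicted - actual| / actual, floored at 0.

    This is an assumed metric for illustration only.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one frame is required")

    scores = []
    for p, a in zip(predicted, actual):
        if a == 0:
            # Empty queue: the frame is fully correct only if nothing was counted.
            scores.append(1.0 if p == 0 else 0.0)
        else:
            # The relative error can exceed 1 on heavy overcounting, so floor at 0.
            scores.append(max(0.0, 1.0 - abs(p - a) / a))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Counts reported by the module vs. counts labeled by a person, per frame.
    print(counting_accuracy(predicted=[4, 5, 0], actual=[5, 5, 0]))
```

    Whatever metric is chosen, it is worth agreeing on it with the user before testing, so that both sides mean the same thing by a figure such as "85-90%".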

    Of course, not everything went smoothly during development. For example, between the first prototype and the solution now installed at the site there was a failed version that worked worse than the previous one. But from that experience we understood what to look for when testing and learned a number of peculiarities of the frameworks we used. With that in mind, we built a solid final module, and then, based on it, another quick function.



    Happy end


    The new version of the queue-counting module is now being rolled out to this user's other stores. The final version has gone into production and is included in the Macroscop version being prepared for release. By the way, the user was so pleased with the result and with the way of working as a whole that we received another request from him: to build a detector of empty shelves. We took that on too, and again delivered (but that's another story).

    To sum up with a comparison: developing and refining the old version of the queue-counting module (4 years ago) took about 8 months. We built the new module in 2 months (and the first working prototype was handed to the user in 2-3 weeks).

    So far this is only a first attempt, and only within one area: the development of intelligent functions. In general, we still follow a more rigorous and thorough approach to product development, with planning, repeated validation of ideas, demand analysis, and deep testing. What stays the same is the practice of building Macroscop (whether the kernel or the video analysis modules) in close collaboration with users.
    We are not certain that the quick-function approach should be applied on an ongoing basis and across the whole department, but right now we are gaining real experience of rapid development, and the users we do this for get real benefit from the product.

    In any case, we have worked out several rules for ourselves; following them is half the success in developing quick functions:

    • Try to meet the user halfway, but do not forget your own goals: take on projects that can be scaled, and invest in things that will be useful in the long term.
    • Get to the user's true goals and needs, and identify the priorities.
    • Enlist the user's support. If he is ready to communicate actively, test, give feedback, and provide the necessary data (video from a real site, for example), there is every chance of developing well and quickly.
    • Do not be afraid of failure; treat it as one of the possible outcomes.
    • Do not strive to develop something unique from scratch; where possible, build on existing experience. In our case, that means trying to reuse parts of the algorithms from modules already implemented. If the resulting solution turns out to be viable, then devote time to research and customization.
