Smart Systems Developer Manifesto: 15 Principles

Published on December 17, 2018


    We present an article by Vladislav Zaitsev (vvzvlad), a guest author on our blog. Vladislav has worked on "smart homes" for a long time and, summarizing his experience, proposes the following basic principles for designing such systems.

    Today I want to talk to you about smart homes in particular and IoT devices in general. But this will not be an ordinary article: there will be no hardware, no links to manufacturers, no code snippets and no GitHub repositories. Today we will discuss something higher-level: the principles by which "smart" systems are organized. By reading on, you agree that you accept the following disclaimer.




    The disclaimer itself
    1. All of these points apply only to consumer IoT systems (read: "smart homes"): those that a person can buy in a store and install without specialized installers or integrators.
    2. Some of these principles do not apply to industrial systems (which have their own requirements and principles), nor to systems whose operators are separate from the user (for example, a smart home installed and maintained by specially trained people).

      Some of the principles also do not apply to "toy for geeks" systems, or to home-made and open-source systems that have no single product owner.
    3. And, of course, everything written below is solely my opinion, based on my many years of experience. You have the right to disagree with it.



    A smart home is a system that takes on a part of a person’s daily worries. From here follows the first and most basic principle:

    1. A smart home should make life easier and simpler.


    A smart home is a system for living, not a toy for geeks. Any system that is harder to use 1 than conventional switches is not a smart home.

    Any new product should be tested for compliance with this principle. If it does not make life easier, and you do not understand how it could, it has no place in a smart home. You can try to pass it off as an educational system instead.


    The second most important principle concerns how the user interacts with the system:

    2. Good user experience is more important than functionality.


    A cool tool that cannot be used comfortably is worth next to nothing. Convenient and reliable devices with limited functionality are more likely to succeed than complex products for “all occasions”.

    2.1. Convenient interfaces are better than customizable.


    Don't know how to keep both a rich set of functions and a simple interface? Are you cramming in every function in the hope that the user will figure out what he wants? Get out of the profession!
    Don't know how to combine convenience with a pile of settings? Sacrifice the settings. Any functionality will be cooler than a regular switch, but excessive complexity will easily drive the user back to that switch.

    The same applies to quality. A button that simply turns on the light is better than a brightness slider that is sometimes buggy:

    2.2. The quality of implemented functions is more important than their quantity.


    A small amount of reliable, proven functionality is better than a large amount that works haphazardly.
    One of the mechanisms that evolution has fixed in the human psyche is a stronger reaction to negative stimuli. Negative factors matter more than positive ones: missing the approach of a dangerous predator is far worse than failing to notice and eat a tasty fruit on a tree. If your system lacks some function, that is not a disaster; it is merely the absence of a positive stimulus. But a function that exists and works poorly is a negative stimulus: it is remembered more readily and for longer.

    2.3. The introduction of the system should not reduce the usual speed.


    Delays are not allowed, as they impair the user experience. This is also a negative stimulus. If a person does not notice the delay 2 between the click of a conventional switch and the light turning on, he should not notice it in your system either.

    Modern hardware is very fast. Microcontrollers easily run at tens of megahertz, and even a radio channel carries at least tens of kilobits per second. If under these conditions you cannot build a system in which the user feels no delay, get out of the profession!

    2.4. The system should not break the already formed user habits.


    Your system is only a small part of a person's life. A person's lifetime exceeds the system's lifetime a couple of times over at best, and by an order of magnitude at worst. Your system will go just as it came, and the person's life will continue. And in this life, the person has formed habits: the preferred brightness of the light, the location of the switches, the convenient way to turn on the light and control the climate at home.

    You must not try to force these habits to change. You may offer; you may not force.
    You cannot tell the user "from now on you will turn on the light from your phone: it is stylish, fashionable, youthful." That violates this principle and several others.

    So what should you do?

    2.5. The system should bring new experience, and not try to replace the old one.


    If you think that your way of controlling the systems of the home is better than the old one, offer it to the user. If it really is more convenient, he will choose the new one (and yes, different methods suit different people). All you can (and should) do is give him a choice.



    The operating logic holds an important place in a smart home: it determines the rules by which the home works. The next important principle is about exactly this:

    3. The user must not be limited in the available logic.


    If the user wants to turn on the kettle 3 when the temperature in the room rises, give him the opportunity to do so. There must be no situation in which no physical or software limitation prevents an action, yet the action is unavailable because the developer thought “nobody will ever need this”.
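
    One way to read this principle is that any reading can be tied to any action. The sketch below is only an illustration: the names Rule, run_rules and room_temperature_c, and the 28-degree threshold, are assumptions made for the example and do not come from any real product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical: the latest sensor readings, keyed by sensor name.
State = Dict[str, float]


@dataclass
class Rule:
    """A user-defined rule: when `condition` holds, run `action`."""
    name: str
    condition: Callable[[State], bool]
    action: Callable[[], None]


def run_rules(rules: List[Rule], state: State) -> List[str]:
    """Evaluate every rule against the state; return the names of fired rules."""
    fired = []
    for rule in rules:
        if rule.condition(state):
            rule.action()
            fired.append(rule.name)
    return fired


# The kettle example from the text: nothing in the engine forbids this pairing.
log: List[str] = []
rules = [
    Rule(
        name="hot room -> kettle",
        condition=lambda s: s.get("room_temperature_c", 0.0) > 28.0,
        action=lambda: log.append("kettle: on"),
    )
]
```

    The engine knows nothing about kettles or thermometers, so it has no opinion about which combinations "make sense"; that decision stays with the user.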

    3.1. As simple as possible: writing logic should not require special knowledge about the structure of the system.


    If you have devices of different versions with different protocols, make sure the user learns about it only when it is truly unavoidable. If you can spare the user from acquiring special knowledge, even at the cost of developer time, do it. The developer will spend two days; a thousand users would each spend an hour. Forty-eight hours against a thousand? The answer is obvious.

    How do you sleep at night, John the serial programmer?

    3.2. Devices with the same functions should be controlled in the same way.


    The user is not required to know that a water valve is controlled by one set of commands and a tap by another. If both control the water in a pipe, both should have the same user-level interface: “open water” and “close water”.
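
    A minimal sketch of this idea, assuming two hypothetical devices with different low-level protocols (the command strings "SET_POSITION" and "RELAY" are invented for the example):

```python
from abc import ABC, abstractmethod
from typing import List


class WaterControl(ABC):
    """User-level interface shared by every device that controls water in a pipe."""

    @abstractmethod
    def open_water(self) -> None: ...

    @abstractmethod
    def close_water(self) -> None: ...


class MotorizedValve(WaterControl):
    """Hypothetical valve that is driven by a numeric position command."""

    def __init__(self) -> None:
        self.sent: List[str] = []  # commands that would go out over the wire

    def open_water(self) -> None:
        self.sent.append("SET_POSITION 100")

    def close_water(self) -> None:
        self.sent.append("SET_POSITION 0")


class SmartTap(WaterControl):
    """Hypothetical tap that is driven by an on/off relay command."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def open_water(self) -> None:
        self.sent.append("RELAY ON")

    def close_water(self) -> None:
        self.sent.append("RELAY OFF")


def stop_all_water(devices: List[WaterControl]) -> None:
    """User logic sees only 'close water', never the protocol details."""
    for device in devices:
        device.close_water()
```

    The difference between the protocols is still there, but it lives inside the adapters, where the developer pays for it once.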



    We all live in the physical world. The human body and brain are formed to interact with physical objects. Hence the principle:

    4. Physical control devices are better than virtual ones.


    Any phone application with virtual control buttons, even the best one, loses to an ordinary physical switch in the right place.

    The catch is that the switch must be placed exactly in the right place. Hence another important rule:

    5. Radio is better than wires.


    Wired systems are reliable, but only radio lets you install a new switch or a relay for a lamp without renovating again, and move it if you get tired of its previous location. But this principle has its exceptions:

    5.1. Bad radio is worse than wires.


    Good radio is radio that lets you not worry about how far from the central controller you place devices. Good radio means protocols with mesh networking: ZigBee, Z-Wave, 6LoWPAN, and so on.

    All other options are bad radio. Wi-Fi is bad radio. The in-house protocols of individual companies (homeowners know them as "433 MHz", although they may use other frequencies and differ greatly from one another) are bad radio.

    Wi-Fi is bad because it is impossible to build full-fledged "sleeping" devices on it, automatic setup is difficult, and there are compatibility issues with other Wi-Fi devices in the house. Simple in-house protocols are bad because they often have no delivery control, no encryption, and no available specifications. Neither has mesh routing, which often leads to problems like "I can’t put the switch in that corner: it is too far from the transmitter."



    You cannot build a system that is one hundred percent reliable and needs no maintenance. Devices break down, power surges occur in the mains, water pours in from the neighbors' apartment upstairs, batteries run down, plastic cracks, a child pours soup onto a lamp, and so on. But:

    6. Repair, upgrade, maintenance and diagnostics should be simple.


    In B2B, everything is clear: there is a user, there is a developer, and there is an operator, a person or organization that knows how the system works and can work with it on a professional level. No one requires an accountant to know how to program in 1C; there is a dedicated person for that. And no one demands that a person who rents an office understand how its ventilation works; his job is to say: “It's stuffy in our office.”

    A competent decision to purchase a system is based on the calculation of the total cost of ownership, which consists of the price of the system and the cost of its operation.
    In systems that a person uses at home, everything is more complicated. There, the operator and the user are one and the same person, who moreover often lacks the necessary knowledge about the system. Unfortunately, these are the limitations of the consumer market. Hence the following principles:

    6.1. The device must either work or not work. There is no third option.


    Working with some probability, working partially, and working incorrectly are not allowed. You cannot allow situations in which your device works, or fails to work, one time out of ten. If you believe your device is malfunctioning, disable it altogether, both for safety and to preserve a positive user experience. Replacing a device is unpleasant, but it is better to force the user to replace it than to let him form the opinion that the system “works every other time”. The system must be in a strictly defined state: if it is broken, it does not work; if it is not broken, it works.

    If you are firmly convinced that degraded functionality is acceptable, you should still warn the user with a clear message: “A failure of the second dimmer channel has been detected. The dimmer needs to be replaced. Continue using the first dimmer channel? Yes / No”
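
    The rule can be sketched as a tiny state decision. This is an illustration under stated assumptions: the state names and the state_after_fault function are hypothetical, and the only point is that the degraded state is never entered silently.

```python
from enum import Enum
from typing import Optional


class DeviceState(Enum):
    WORKING = "working"
    DISABLED = "disabled"
    # Reachable only after the user has explicitly answered "Yes".
    DEGRADED_BY_USER_CHOICE = "degraded by user choice"


def state_after_fault(fault_detected: bool,
                      user_accepts_degraded: Optional[bool] = None) -> DeviceState:
    """Decide the device state; None means the user has not been asked or has not answered."""
    if not fault_detected:
        return DeviceState.WORKING
    if user_accepts_degraded is True:
        return DeviceState.DEGRADED_BY_USER_CHOICE
    # No answer, or the answer was "No": the device is simply off.
    return DeviceState.DISABLED
```

    The default on a fault is "disabled"; partial operation is an exception that the user has to approve.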

    6.2. Replacing a broken device should be easy.


    The system must be modular. If the user breaks a sensor, only the sensor should need replacing. You cannot demand that the relay be replaced together with the control panel because, you see, the two are paired at the factory.

    Nor can you say “only our specialist can install a new sensor”: as your system grows, there will obviously not be enough specialists for millions of potential users, which means problems will start at some point. Of course, the user will not repair a device himself, but when one breaks he should be able to replace it.

    6.3. Clear messages.


    If something went wrong, the user should know about it, and know what exactly went wrong.
    You cannot tell the user “Error # 45” when only technical support staff will understand that message. Tell him: “The device is not responding. Try to restart it, reconnect it, or contact the service. Error # 45.”

    It is unacceptable to detect that a device is not responding (if you are able to detect it) and not tell the user. Receiving a message about a problem is not very pleasant, but it is far more unpleasant to discover that the device has not been working for a week at the very moment it is urgently needed.
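
    A minimal sketch of such a message, assuming a hypothetical table of error codes (only code 45 and its wording come from the text; the fallback text is invented):

```python
# Hypothetical mapping from internal error codes to human-readable advice.
USER_MESSAGES = {
    45: "The device is not responding. "
        "Try to restart it, reconnect it, or contact the service.",
}

# Assumed fallback for codes that have no dedicated text yet.
FALLBACK = "Something went wrong. Please contact the service."


def user_message(code: int) -> str:
    """Explain the problem first; keep the code at the end for support staff."""
    text = USER_MESSAGES.get(code, FALLBACK)
    return f"{text} Error #{code}."
```

    The code is still present, so support can use it, but the user reads the explanation and the suggested steps first.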

    6.4. No extra information in the messages.


    At the same time, you do not need to throw debug dumps and multi-line logs at the user. Either the information is needed by the user, as in the previous paragraph, because it says what exactly went wrong, or it is not needed, because it is not addressed to him and he will not understand it.

    There is no need to show the user a hundred messages of the same type: “communication with the device is lost”, “communication with the device has been restored”. Make up your mind: either this is a critical problem, and you need to report it properly (“Unstable communication with the device”), or, since the connection has been restored, this is not important information and you do not need to show it 4 .
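
    One possible way to make that decision is to count connection losses over a time window and emit a single summary. The window length and the threshold below are arbitrary assumptions for the sketch, not recommendations.

```python
from typing import List, Optional


def summarize_link_events(loss_times_s: List[float],
                          now_s: float,
                          window_s: float = 3600.0,
                          threshold: int = 3) -> Optional[str]:
    """Return one user-facing message, or None if nothing is worth showing.

    loss_times_s holds the moments (in seconds) when the connection was lost.
    """
    recent = [t for t in loss_times_s if now_s - window_s <= t <= now_s]
    if len(recent) >= threshold:
        return "Unstable communication with the device"
    # A rare, already-recovered drop stays in the log, not in the notifications.
    return None
```

    Individual "lost" and "restored" events can still be written to a log that the user opens on request.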

    6.5. Maintenance should not require special knowledge or equipment.


    Give the user the ability to update the firmware, and let the update take just a couple of button presses. Either it is done through a standard interface (USB / BT / Wi-Fi), or the user documentation does not mention updating the firmware with an SPI programmer at all.

    You cannot require the user to calculate bit masks to configure a device if that configuration is needed in normal operation.

    6.6. The system should not require constant maintenance.


    If once a month the actuator loses its pairing, and the user has to climb up to the chandelier and pair it again, this is a bad system. If the batteries in a switch need changing every two months, this is a bad system. Even an average of one maintenance action per device every six months is bad: with twenty devices in a house, the user will have to do something on average every nine days.
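
    The nine-day figure is simple arithmetic, which can be checked with a one-line helper (the function name is made up for this sketch, and half a year is taken as 182.5 days):

```python
def days_between_interventions(devices: int, service_interval_days: float) -> float:
    """Average gap between maintenance actions when each device needs one per interval."""
    return service_interval_days / devices


# Twenty devices, each needing attention once per half year (about 182.5 days).
gap = days_between_interventions(devices=20, service_interval_days=182.5)
```

    Here gap comes out at a little over nine days, and it shrinks in proportion as devices are added.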

    6.7. The system must have the ability to upgrade or expand.


    The cost of expanding the system should grow linearly. A new unit should cost the price of the new unit and nothing more.

    There should not be a situation when you need to buy a new controller to add a new sensor, because the old one does not support more than six sensors 5 .

    There should not be a situation where the new firmware of the sensor can work only with the new version of the control unit.

    Such restrictions have a positive effect on profits, forcing users to buy new devices, but this is the road to hell - if you lose the trust of users because of such tricks, you will lose much more than you could earn.

    7. Self-sufficiency: external networks are an option, not a necessity.


    A system in which commands go only through an external server is a bad system. You can boast of convenient interfaces, trendy applications and cool neural networks for as long as you like, but none of it matters if the user loses the ability to turn on the light in the toilet whenever the Internet goes down. Developer, do you really think such a system is good?
    Seriously, I do not understand why this point does not go without saying. By tying the system to an external server, you create a single point of failure with the reliability of an ordinary server, and it is shared by all your users at once. External services are cool and let you extend the functionality, but they must not be the only means of day-to-day control. An additional control channel, backup storage, data analysis: as much as you like.

    8. Centralization: the lack of a central hub limits the available logic.


    However, you should not rush to the other extreme and try to make the system decentralized and therefore supposedly absolutely reliable.

    A decentralized system is one where the switch tells the light to “turn on”, and the temperature sensor switches the heater on when the temperature drops. A decentralized system loses to a centralized one simply because it works well only within the simplest paradigm of interaction between devices: “one device controls another device”. As complexity grows, such a system raises more questions than it answers. If there are several temperature sensors, should the heater receive commands from all of them? Or should it poll the sensors? And if the decision to switch on has to be based on a trend, where do you store the historical data? On each heater? Isn't that excessive? On each device? And generate traffic with every request? And where do you store the logic, and how do you change it? And what if the logic includes external elements? And how do you store logic on "dumb" devices? And how do you update it?

    All these issues disappear if you accept and acknowledge the need for a central hub as a point of interaction between data and a place to store user logic. Let all devices be “stupid”, capable of giving data and receiving commands, and the task of storing historical data, processing this data, making decisions and interacting with external services will fall on that very central controller.

    Decentralization, by the way, can be preserved, albeit partially: nothing prevents you from sending commands directly in the absence of a response from the hub, as a fail-safe mode. There will be no logic, but it will be possible to turn on the light.
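
    The fail-safe mode described above could look roughly like this. The sketch assumes two hypothetical transport functions: send_to_hub, which returns True when the hub acknowledges and may raise TimeoutError, and send_direct, which addresses the lamp without the hub.

```python
from typing import Callable


def send_switch_event(send_to_hub: Callable[[str], bool],
                      send_direct: Callable[[str], None],
                      event: str) -> str:
    """Prefer the hub (full logic); fall back to a direct command if it is silent."""
    try:
        acknowledged = send_to_hub(event)
    except TimeoutError:
        acknowledged = False
    if acknowledged:
        return "hub"
    # No logic is available in this mode, but the light still turns on.
    send_direct(event)
    return "direct"
```

    The return value tells the caller which path was used, which makes it easy to notify the user that the hub is down.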



    9. The system should not perform potentially dangerous actions without confirmation or notification of the user.


    As long as we do not live in a world where the programmer is responsible for the damage caused by his program (this topic is well covered here ), programs will contain many errors. Smart home software will contain them too. The only way to keep these errors from leading to catastrophic consequences is to restrict what the system may do on its own. Potentially dangerous actions should be performed at least with the user's knowledge, and preferably with his confirmation. The light can be turned on automatically: a bug in the program will wake the user at night or add an extra hundred rubles to the bill at the end of the month. Unpleasant, but hardly dangerous. Water, on the other hand, must not be turned on automatically unless there are mechanisms guaranteed to shut it off in case of a flood. Turning the water off is fine, since that is not a dangerous action.

    This principle does not say that automatic control of all heaters, boilers, stoves, kettles and the like must be prohibited. Rather, it says that if you do make a potentially dangerous device with software control, make sure its danger does not reach the user: a controlled kettle should have hardware circuits that switch it off when it overheats; a bath that fills itself should have flood sensors; a clothes iron may be switched off by software, but switched back on only by pressing a physical button, and so on.
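
    In software, the same idea can be sketched as a gate in front of the action executor. Which actions count as dangerous is an assumption of this example (the set below is invented), and a software gate like this complements the hardware protections rather than replacing them.

```python
from typing import Callable

# Hypothetical classification; a real system would define this per device type.
DANGEROUS_ACTIONS = {"open_water", "heater_on"}


def execute(action: str,
            perform: Callable[[str], None],
            confirm: Callable[[str], bool]) -> bool:
    """Run safe actions immediately; run dangerous ones only after confirmation."""
    if action in DANGEROUS_ACTIONS and not confirm(action):
        return False  # refused: the user did not confirm
    perform(action)
    return True
```

    Closing the water passes straight through, while opening it waits for the user's answer.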

    10. The system should be able to self-monitor and self-test.


    A true developer of smart systems should be a bit paranoid. You cannot trust the Internet: it may disappear. You cannot trust the code: it may contain bugs. Maybe at least the hardware can be trusted? No. You cannot trust your hardware either. Relays stick, triacs open spontaneously, fuses blow. And the user can plug a kilowatt kettle into a 100-watt outlet. You need sensors for voltage, current and temperature. Temperature out of bounds? Turn everything off and send a warning. Current out of bounds? Turn off. The relay was switched off, but there is still voltage at the output? Notification. Switched on, and there is no voltage? Notification!
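
    The checks listed above translate almost directly into code. In this sketch every limit (70 °C, 10 A, 50 V as "voltage present") is an arbitrary assumption chosen for illustration, not a value from any datasheet.

```python
from typing import List


def check_relay(commanded_on: bool,
                output_voltage_v: float,
                temperature_c: float,
                current_a: float,
                max_temperature_c: float = 70.0,
                max_current_a: float = 10.0,
                voltage_present_v: float = 50.0) -> List[str]:
    """Return the required reactions; an empty list means nothing was detected."""
    findings = []
    if temperature_c > max_temperature_c:
        findings.append("shutdown: temperature out of bounds")
    if current_a > max_current_a:
        findings.append("shutdown: current out of bounds")
    has_voltage = output_voltage_v >= voltage_present_v
    if not commanded_on and has_voltage:
        # Typical symptom of a stuck relay or a spontaneously open triac.
        findings.append("notify: relay is off but output still has voltage")
    if commanded_on and not has_voltage:
        # Typical symptom of a blown fuse or a failed relay.
        findings.append("notify: relay is on but output has no voltage")
    return findings
```

    Running such a check after every switching command, and periodically in between, gives the system a chance to notice a fault before the user does.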

    11. The system must allow manual control.


    And even with all this paranoia, there will still be a situation where the checks failed and something broke: the switch in the room, the router, the central hub. And the user wants to turn on the light in the toilet. What is the conclusion?

    There should always be a button with which you can, so to speak, make the sun set by hand: a button with which you can turn THE LIGHT on or off. Because the new hub will be delivered tomorrow and a new switch can be bought a little later, but you want to turn off the light in the room this evening and sleep peacefully.

    12. Developers and hackers are just as important as regular users.


    Regular users bring in revenue, and hackers 6 come up with new features. The manufacturer cannot and should not develop every scenario for using the system, since he obviously does not have knowledge in all areas and cannot evaluate the effect of developing them. It may turn out that your controlled outlet is incredibly convenient for running the thermostat of a moonshine still, because the library of ready-made solutions includes a PID controller, and now all the moonshiners are buying your systems in bulk. The example is somewhat contrived, but the main message is that it is worth some effort to create a comfortable environment for hackers, because they add to the driving force of your system.

    13. Openness: the system must have an API for integration with other systems.


    You cannot cover all customer needs with your own devices. There will always be devices that you do not make but other manufacturers do. Or you make them, but other manufacturers make better ones. Or you will be the manufacturer whose single device is the best. An open and well-documented API is required to connect your system to others. Without an API, you deny users the ability to build heterogeneous systems. Even the companies that are the most ardent supporters of proprietary hardware and software cannot afford that.

    14. Documentation: documentation is as important a part of a system as code and hardware.


    However cool your hardware and software are, none of it matters if the user does not understand how to start your system and how to work with it. Good documentation is documentation that never makes the user think of contacting support or of questioning the developer's mental abilities. Writing such documentation is almost impossible, but we must strive for it.

    And finally, the last principle (but not the last one in terms of importance):

    15. Self-sufficiency and self-sustainability: the system must continue to operate even if the company has stopped operating.


    A person lives 70-90 years, the system in his home lives 5-10 years (at best), and companies often live even less. You should not build a system that you drag into the grave with you. Have pity on the users. Design the system so that it remains usable even after the company and its developers have long sunk into oblivion.

    Is a new device paired by receiving a token from the developer's server? Add a “Skip token receipt” button. Does the system need to download a firmware update from the website at first power-on? Make sure that default firmware is flashed when the site is unavailable. And so on.
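
    The firmware case can be sketched in a few lines. The function names here are hypothetical; the only assumption about the downloader is that it raises OSError (which in Python includes ConnectionError and TimeoutError) when the site cannot be reached.

```python
from typing import Callable


def firmware_for_first_boot(download_latest: Callable[[], bytes],
                            bundled_firmware: bytes) -> bytes:
    """Prefer the latest firmware, but never depend on the vendor's site being alive."""
    try:
        return download_latest()
    except OSError:
        # The site is unreachable or gone for good: use the image shipped with the device.
        return bundled_firmware
```

    The device may start with older firmware, but it starts, which is the point of this principle.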



    In this article I tried to compress my entire experience of using, working with and designing such systems into 15 basic principles. Some of them may seem far-fetched, some controversial (and this is normal), some trivial (and this is also normal).

    But if you thought about at least one of them, then the article was not written in vain.

    Footnotes and comments


    1. Speaking of “use”, I do not mean the configuration process, but the process of normal interaction with the system.
    2. Please note that I am not saying “the delay should be the same,” but “the user should not notice.” Practice shows that a person does not notice a delay of approximately 300 ms.
    3. Perhaps he grew up in Central Asia and believes that hot green tea is the best cure for heat.
    4. Of course, when I say “no need to show information”, I mean that you should not send the user a message about it every time. It should be shown at the user's request, when he clicks "show logs" or "show debug information." Do not treat users as idiots, but respect their time.
    5. Of course, it is impossible to design a system that supports an infinite number of devices. There will always be limits, but the developer's task is to ensure that they play no role in 99% of cases. A limit of six sensors is unacceptable. One to two hundred devices on the same network is enough for most smart home users.
    6. I am talking about hackers in the original meaning of the word, as defined in RFC 1983, and not about people who break into systems.