stanislav_as March 26, 2015 at 11:05

Igor Ashmanov on the future of home robots. Home robots: on the eve of a tornado

Good day, Habr readers!

This post is an edited transcript of Igor Ashmanov's talk at the Skolkovo Robotics Conference 2015, held in the Skolkovo Hypercube on March 21. With it, the Lexi project ( VK , FB ) opens a series of posts about the project, the technologies it uses, its own developments and the team's experience. The post serves as an introduction: it gives a brief overview of the industry and puts a number of pressing questions to the reader.

We attach the full video of the speech to the post:

Hello, my name is Igor Ashmanov. I have been developing software for many years (almost 30). These days I mostly work on all sorts of Internet projects.

It so happened that I got involved with a couple of startups, a couple of robot projects (ed.: 1 , 2 ). I would rather not use the word "invest" for robotics startups because, in my opinion, there are no real investments in this area right now.

For example, Dmitry Grishin spoke here before me about how he invests in robotics startups. He studies the market, how these startups plan to earn money, their business models, audiences and so on. Formally, everything is correct and follows venture science. But it seems to me that none of this can be called investment. In fact, the investor is simply paying for his own education: for understanding the market, which teams are out there, what difficulties and problems they face, and so on. He is just buying knowledge about existing and future robotics. Because in the part of the robotics industry that I am involved in, home robots and remote presence robots, there is no market, and there will not be one for a long time. There are no business models and no buyers. For now the conversation is not about the market at all, but about a certain very important, special thing that I will talk about now.

What is artificial intelligence?

Let's start by clarifying the general concepts. Everyone here is talking about artificial intelligence, so I will say a few words about it to make sure we speak the same language. Artificial intelligence is something that robots are now expected to have.

I should note right away that there are two understandings of artificial intelligence.

The first is the everyday, mass, Hollywood understanding. It means an anthropomorphic robot that speaks and communicates naturally and is also kinetically capable, so to speak: it moves, it can pick things up and do work. It always has self-awareness, emotions, some kind of relationship with people, grievances against them, love and all that. And eventually, of course, this leads to rivalry with people, to taking over the world, to the Terminator with a gun in his hands, and so on.

This Hollywood view of robots largely shapes the industry: the startups and the attempts to build robots that are emerging now. In my opinion it is complete nonsense. It is an illusion that people in the robotics industry chase in the same way that the Pentagon often builds weapons, say, brand-new-generation aircraft, based on Hollywood films about those aircraft. The result is overcomplicated, too expensive and essentially does not work.

In the military industry, of course, everything is more cynical, because very big money is involved. They commission a blockbuster from a famous director about a super-hyper-mega aircraft, a stealth plane that hovers in the air and does various tricks, then they come to Congress and say: "Have you seen it? Cool? Now give us the money for it, and we will have one just like in the movie." The result is an ultra-expensive aircraft that does not solve the tasks the troops actually have and performs the promised tricks only after a fashion. It is the same story with artificial intelligence.

So what does artificial intelligence mean from the developer's point of view? I have been working in this industry since 1983, that is, for more than 30 years. In 1983 I came straight from Mekhmat (the Faculty of Mechanics and Mathematics) to work in the Department of Artificial Intelligence of the Computing Center of the Academy of Sciences. Those who work on AI know a more boring definition: it is a bunch of optimization methods that are supposed to imitate human functions. That's all. Moreover, of course, the imitation does not work the same way a person performs those functions. Clearly, the machines people build that move faster than a horse and fly faster than birds do not copy horses and birds. Cars do not run on legs, and planes do not flap their wings (by and large, most of them, at least).

There is, of course, a school of artificial intelligence that aims to understand how the human brain works and to reproduce it in metal and electrons. It is in fact a marginal direction, and from my point of view a dead end. There is no time to explain in detail, so I will just say that this is my personal opinion.

Thus, AI is a scattered collection of tasks that simulate the most diverse human functions. Creating a self-aware computer is not among them, at least not for real developers as opposed to charlatans.

These optimization methods (or machine learning, as it is now fashionable to say) gradually move some of the tasks of imitating human functions into the category of solved problems. Before an AI problem is solved, it seems to have some kind of magic about it. As soon as it is solved, it loses all the aura of romance that surrounded it. For example, 50 years ago only very cultured, educated people, teachers and the like, could check spelling. Now, of course, no one even thinks of this as artificial intelligence. The function is everywhere. The same goes for the T9 algorithm on your phone, for search engines and more. As soon as image search, tune recognition in Shazam, antivirus, antispam and so on become everyday things (and these are all artificial intelligence programs, all recognition programs), everyone forgets that this is artificial intelligence. For users it is just like plumbing: a useful thing that works, and that is all there is to it. No one thinks that T9 on the phone is a rather complicated artificial intelligence program. So what, it picks words. Yet it applies the same tricky methods that are used to attack AI problems that have not yet been solved.

Say, the task of speech recognition, meaning good recognition of any speaker in a noisy room, or at least in a large room from a distance, has not been completely solved. Unfortunately, the problem of decent speech synthesis has not been solved either: synthesized speech is still easy to distinguish from a human voice, and it still sounds rather dull and flat, without intonation.

Surprisingly, machine translation of acceptable quality has not been achieved yet, although all the promises of the artificial intelligence industry began with it. Generally speaking, it is the Everest of applied linguistics. In fact there are two peaks, Everest and K2, and the second peak is dialogue in natural language. That, too, is still largely an unsolved task.

You all know the online translators from Google or Yandex that translate pages. You know yourselves that they are not really translators, just rather clumsy tools that help you grasp the meaning of a page in an unknown language. Using them to translate into a language you do not know is categorically not recommended.

Recognizing the meaning of a picture (not searching for similar images, but thematic tagging) has not been done.

Machine vision with acceptable recognition of three-dimensional scenes is an unsolved problem.

Since all these problems have not been solved, they still retain some romance, and it seems that in the future artificial intelligence, in the future robot assistant such as Jarvis or Friday, they will all be solved, and that will be amazing.

In fact, it is clear that once all this is done, and sooner or later it will be, and the robot even imitates a person well, there will of course be no romance left in it. Everyone will look at it as something familiar, useful and uninteresting, like a spreadsheet or online ticket booking.

I think we will still be able to tell a robot from a person quite easily. In fact, as far as I know, a couple of home-grown Turing tests will be held right here in Skolkovo, and you will be able to see how a robot is recognized and how a person differs from a talking robot (ed.: Lexi also participates in it).

Thus, we are talking about imitating human functions (possibly by completely different, non-human methods), and not about a mind arising in the machine. I personally do not believe in that. It is really a topic for filmmakers and journalists. We can discuss why, but another time. (ed.: dear reader, we are waiting for your opinion on this issue in the comments)

And now I would like to say a few general words about robotics, and about home robots.

About home robots

There is a researcher, Geoffrey Moore, who wrote several excellent books about the laws by which technology industries in general, and technology startups in particular, develop. One of them is Crossing the Chasm. Maybe some of you have read it; I highly recommend it. His next book on the same subject is Inside the Tornado. In it he introduces the concept of a "technological tornado": the explosive development of a new technology industry. Such a "tornado" has appeared before our eyes several times, when a new breakthrough technology caused a huge surge in a certain industry. Lots of users, money, manufacturers and so on appeared. Careers were made and destroyed, fortunes were made, and the way of life of millions changed.

We have seen such tornadoes several times. One, in particular, was the creation of the personal computer, in which Jobs and Gates had a hand.

Next came the software tornado, that is, programming, which was fashionable in the late 1980s and early 1990s, when any girl considered it an honor to go out with a programmer. Then in the mid-1990s there was the Internet explosion. Again, giant companies appeared out of nothing, dizzying careers and fortunes were made, there were loud crashes and all that. Then we saw mobile phones, but those were somewhat off to the side from us IT people. And then, here you are: smartphones and tablets.

In all these cases, there were general processes and patterns. The most interesting thing for us is what the state of the future industry looks like “on the eve of the tornado”.

The state "on the eve"

What happens on the eve of such a tornado? First of all, there is no market at all, and most importantly, there is fog in people's heads. Everyone feels that something is brewing, but no one knows what. That was the story, in particular, when the personal computer came on stage. Dmitry Grishin said that VisiCalc supposedly pulled the whole thing along, and some game as well, but this, of course, is not true. Fortunately for him, he is young and did not get to take part in that tornado. I still caught the end of it, because I started in the mid-80s. Around 1987, when the first personal computers were only just appearing in the Soviet Union, I began programming the spell-checker project that we later built into MS Word. Of course, the explosion did not happen because one specific application was created. And here is why. When all this was still being built by hand in various garages, all these people knew each other and met at the same parties. Roughly speaking, there were 17 models of different, incompatible computers. Some of them had no screen, or had a small green screen of glowing filaments, and some were programmed by turning knobs and dials (there was no keyboard, and so on).

Method of use - the main condition of the tornado

What happened? Why did this tornado begin? Because someone (and these were quite charismatic, very energetic people, like Jobs) proposed a way to use this device.

From my point of view, everything is decided by the method of use. What is that? I will explain. A method of use is what Jobs created for smartphones. Smartphones were being made several years before the iPhone. Nokia made them in countless numbers. I bought them several times. They were quite painful and hard to use. Some of them already had a touchscreen. Jobs essentially did not invent anything: all the elements had already been invented and implemented.

This is another sign of an approaching tornado: everything already exists in terms of technology.
Maybe the technologies are not standardized. There are no building blocks, as was said here today. Nevertheless, there is no need to invent technologies.
Someone has to bring them together and say: this thing is called so-and-so, and this is how it is to be used.

To offer, in effect, a use case, a paradigm. I personally call it the "method of use."
Users have to be given this concept, and manufacturers a direction of development. Then a market and users appear. And then a technological tornado arrives and explodes over their heads. Clearly, in robotics, and in particular in the home robot industry that I personally deal with, this has not yet happened.

Here's another example - the automotive industry. It was in exactly the same "on the eve" state until Ford built his assembly line and began producing the car that we know. In fact, the car he produced is not much different from a modern car in its main components and functions.

At that time there were, roughly speaking, 300 garages around the world where people built anything they liked: two-wheeled and six-wheeled self-propelled carriages, with or without a windshield, with or without doors, with a steering wheel, a joystick or handles, with or without a roof, with one seat, two, five and so on. What did Ford do? He proposed the concept we know, that is, a car with most of the elements that cars have now.

And he managed to impose it. That is, to impose a method of use you need not only a pleasant, understandable, obvious way of using the thing, which acts as a fairly contagious mental virus. You also need the charisma of the founder and, probably, a fairly dense inoculation of this virus, so that the density of infection is high from the start.

For Ford, that was precisely the assembly line, which produced a lot of cars at a relatively low price. Jobs also had the means to release many iPhones at once and advertise them heavily.
And after that, this whole cloud of the most varied freaks... I apologize for not drawing all this; everyone here gives talks with pictures, but it seems to me that words are enough for smart people. This whole huge variety of the most incredible models of the most incredible devices is compressed into a trunk of very similar products and business models, and inside it real competition and real business begin.

Here's another example - the Internet before Jim Clark. Maybe not all of you know of Jim Clark, but he is the person who, in effect, made the Internet. He hired the author of the first non-commercial browser and made a real browser, namely Netscape Navigator, and then everything exploded. Not before that. Yes, the Internet itself was already growing quite fast, like a weed, and the Mosaic browser already existed, but nevertheless it was Jim Clark who made the Internet that we know (he is an extremely unusual guy; he also created Silicon Graphics, which is behind all the modern film special effects we know).
In the same way, Kalashnikov made assault rifles what they are today, and so on.

There are probably some real technical, financial and pragmatic limits on which method of use can be imposed. Technically and financially, it is probably less profitable to make a six-wheeled car than a four-wheeled one. Nevertheless, the generally accepted way of using things is very often strange (for example, Chinese chopsticks and characters, ties or watches). It simply turned out that way historically, and we all use it without asking why.

Home robots on the eve of a tornado

I want to talk about the home, personal robot, and here is why. It seems to me that things like remote presence robots, firefighting robots, military robots, quadcopters, or even various kinetic robots and toys are niche. The real technological tornado in robotics will develop where millions, or rather hundreds of millions, of users are involved.

What will it be? Most likely, some kind of personal robot. Science fiction, in Hollywood films and in books, has already shown us what it is. It is a robot that lives with people, like a butler, like Jeeves or Jarvis, and takes part in every aspect of family life and human life. Small explosions will occur in many places. An explosion is already under way in quadcopters, or more precisely in controlled and autonomous copters. But not everyone will have such a copter. A personal robot, however, just like a personal smartphone, ultimately tends toward everyone having one; the personal computer did too.

So the thing that takes part in the robot tornado will most likely be a personal robot. Everyone will have such a personal home robot or will strive to get one. This is where the next tornado will happen. It is what is called "the next big thing" in English.

What do we have now? We see exactly the classic "on the eve" state: fog in people's heads, no one understands what a home robot is, there are lots of them, they are all somehow unnecessary, and they are all afflicted by the curse of the dusty corner.

Curse of the Dusty Corner

However many robots I have seen bought for children, by my friends, by relatives and for my own home, the curse of the dusty corner lies on them all. Within a week or two at most, the robot is sitting discharged in a dusty corner and no one bothers with it anymore. Because it is boring, because no one can be bothered to charge it, because it is not needed.
For some reason, no one has yet overcome this barrier and made a robot that would last at least a year.

Yes, there are robotic vacuum cleaners. They are a very niche thing, used by the few people who love order very much and do not mind untangling the robot from the wires. Robot vacuum cleaners live relatively long, but for most of my friends they also ended up in the dusty corner, only after a month or two rather than two weeks. Only very meticulous and thorough people still have a working robot vacuum cleaner a year after purchase.

But there is an interesting hint that shows the general direction. One of my business partners said:
"My sister bought a robotic vacuum cleaner too. She says that it cleans, of course, but the most important thing is that grandmother now has someone to talk to."

You have to understand that a robotic vacuum cleaner does not talk at all. At most it sometimes says something like "Error 502" in a nasty voice when it gets stuck, or it says nothing and just beeps when it gets tangled somewhere or something breaks. So it is the grandmother who talks to it. When she is alone at home and the young ones have gone to work and to kindergarten, she wants to talk to someone. Here is a robot driving around, and she probably says to it: "Oh, you poor little thing, you've got tangled up, just a moment, just a moment." She talks to it, and she has someone to take care of.

And, of course, since no method of use has been proposed and accepted by the mass of buyers, there is no industry and no market. And there will not be one until that method of use is proposed and everyone says: "Yes, this is it, a real home robot." Until the question is settled of whether it should be anthropomorphic (I will say more about this later). Whether it should be kinetically capable, that is, move around the apartment, or whether it can stand on the table and not move. Until it is clear whether it talks, whether it must recognize speech, and what it has inside. But more about that later.

Unfortunately, this method of use has not been proposed. That is why I said at the start that the funds currently going into this industry are not really investments. They are tuition fees, research fees, the price of understanding what is inside and what the method of use of a home robot might be.

Interlocutor or assistant?

With a home robot, the road forks three ways.

The first is various niche applications. It is clear that eventually a talking robot will be built into every microwave and washing machine. Already some payment terminals in shopping centers have speech built in: when you pay for your phone there amid the general noise (admittedly I have not done this for a long time, but when I did, I heard it), it turns out the machine is commenting on something for you in a weak, barely audible voice. Psychology knows this as "commented action", a very important thing for training and education.

This "robot" comments on its actions, but of course it does not recognize your speech. That is technically impossible in such conditions; in a noisy shopping center you can hardly hear it yourself. Eventually, I think, instead of reading the instruction manual you will simply be able to talk to the washing machine. Or to a home robot that knows how to control the washing machine, which is one of the branches of this fork. But most likely all manufacturers will eventually build a speech unit into all their devices, especially the more complicated ones whose instructions take effort to understand.

As you know, people use only about 2-3 percent of the capabilities of even a simple video player. No one understands what it can do, and when they look at a remote control with fifty buttons, most people simply do not understand what is written on the buttons and what they are for. If this can be replaced, the replacement will be used. The latest smart TVs are already trying to replace it with communication: they already understand voice commands and can be controlled with gestures, and in time this will be in all consumer electronics. Toys, too, I mean roly-poly dolls of all kinds and plush toys, will certainly have a speech interface. Some talking toys already exist. Unfortunately, for now everything is so poorly made (in China, of course) that these toys also end up in the dusty corner.

The second option is a home companion, that is, an interlocutor: something that talks and keeps up a dialogue (possibly with secretarial functions).

The third option is a domestic worker, an anthropomorphic android. This is what appears in science fiction books and in Spielberg's films. You must have watched the movie A.I. Artificial Intelligence. It shows a boy living in the house as the family's child, very similar to a human, who of course has emotions, problems with people and so on. It seems to me that this is simply a technically complex and not really necessary thing.

Even if it is just a domestic worker, it will need to be physically capable: it must pick things up accurately, move among furniture and on stairs, and confidently recognize three-dimensional scenes. You could add security, calling the emergency services and controlling appliances. Personally, I have an unambiguous opinion: a home android assistant is simply a mistake, the result of an illusion induced by Hollywood films. It is monstrously difficult to develop, and it is not clear what for.

That is, it is using a cannon to shoot sparrows. As I usually say, if you need a robot servant, it is easier to take a remote presence robot, fit it with manipulators and hire a Filipina who will do all this work from 10 thousand kilometers away. Then you do not need to build monstrously complex artificial intelligence, expensive cameras and everything else into the robot, or be afraid that it will step on someone or break something because of a bug in the program. And this Filipina will not seduce your husband or steal your jewelry, because she is more than 10 thousand kilometers away.

There are quite a few such unjustified illusions from fiction. In all science fiction books and films there is a videophone: everyone talks to screens and shows themselves to the other person. As we can see, video communication technology already exists, but the videophone has not caught on. It was an illusion. Well, some people use Skype sometimes. But most likely there will be no videophones. Not because the connection is bad or the video is of poor quality; all of that already exists in good quality, albeit at a price. It is simply not needed.

Who wants people watching what you are doing at home all the time? A lot of people turn off the video in Skype, not only because it works disgustingly badly, but also because they do not want to show themselves. They just talk on Skype as they would on the phone. It turned out that the need for a videophone is not so universal after all, yet it is there in every film and every science fiction book.

It seems to me that anthropomorphic robots belong to the same category. In fact, there is no need for them.

Interlocutor or assistant?

As for the home companion, there is a further fork: communication or help.
That is, either an interlocutor, for communication and conversation, someone to talk to, as with that grandmother. Or an assistant, a very useful one, with secretarial functions, business functions and so on.

In fact, we can see this in existing startups. Some build an assistant on a smartphone and try to cram as many functions into it as possible; others work on maintaining a dialogue in natural language.
We, for example, work on dialogue. At the Lexi startup we are trying, first of all, to teach it to communicate.

Now I’ll say a few words about the pros and cons.

Assistant - Pros and Cons

There is a problem with the assistant. It is not only hard to program, because you need to build a lot of connectors to data sources. The problem is that it carries high responsibility. You take responsibility for the accuracy of the information, yet you cannot properly guarantee it. For example, when you connect to various services on the Internet, either you must have a very tight contract (SLA) with them and therefore pay a lot of money, or you get information that may be inaccurate or simply unavailable at that moment, or something else of the kind.
But the main problem is the method of use. After all, there is already experience of building assistants such as Siri on the iPhone and others like it; quite a lot has been done. They all failed because they are painful to use and break the usual ways of doing things. If you want to turn on a light bulb in a smart home from a smartphone, for example, as someone here explained to me, you need 10 seconds: unlock the smartphone, launch the application, tell it to "turn on the light", and if it is connected to that Wi-Fi bulb, it will turn it on.

But no one looks at a screen in order to talk to it. We have a "thumb generation." It has already formed. All the convenient buttons and all the convenient movements are designed for it. It is completely unclear why you would need a screen that you are already staring into and talk to it at the same time.

As a result, most of these applications end up in the same dusty corner. They are downloaded a lot, and Siri ships on every iPhone. No such figures are published for Siri, but according to rumors, as with everyone else who makes these assistants, its lifetime is about one and a half to two days. After that people stop using it.

Steady use happens only where you have nothing but an audio channel. A car, for example, where your hands and eyes are busy. There you hardly look at the screen, because you have to watch the road, and then these things work. Yes, it works there, but that is a completely different application. Generally speaking, the audio channel is apparently what the interlocutor should take over. That is, in the correct method of use it should not have a screen. This is the first more or less reasonable consideration about the method of use: a screen is not needed. If you have to look at a screen, then, I'm sorry, it is easier to tap the touchscreen or use the mouse.

Communication effect

As for the virtual interlocutor, here is why I work on it. I built some very simple virtual interlocutors in the early 2000s. Back then we still used AIML (Artificial Intelligence Markup Language), which is still used by a group of enthusiasts around Richard Wallace. That is the famous A.L.I.C.E. project.

Now we have our own language. I founded the company Nanosemantics, which builds custom-made robots. It is even profitable already. It makes all kinds of promoters, technical support agents and so on, for business applications.

So, back in 2002, something very interesting happened with our extremely simple interlocutors, and it struck me: people very strongly projected a personality onto them, that is, they perceived them as a person even when they knew it was a robot. Many thousands of people failed to tell them from humans, so the interlocutors passed the Turing test en masse. There were stories where people wrote to a company while talking to its robot: go and get Vasya for me, he is my son, he works at your company, and if you don't, I will complain to your bosses and they will fire you.
We saw this many times, since we read the dialogues.

We also made a public service, www.iii.ru, where you could create your own interlocutor. In total, just under 2 million of them were created there. And we read the dialogues. Quite a lot of people talk to these interlocutors completely seriously. And when we built such an interlocutor into ICQ, until ICQ banned us for gigantic traffic, there were sessions of incredible length...

The average session was several hundred lines long, and the maximum was 1600 lines. This means that a person talked for 10-12 hours without a break. Moreover, this interlocutor was not "useful". It was just a joker, an office drone character who talked about what he wanted to do on the weekend and so on. Even those who knew it was a robot still talked to it, because communication is addictive. It is contagious. A person's communication matrix gets projected onto many things, including communication with a robot. This is a very powerful phenomenon that can be put to use.

A previous speaker here talked about visual communication through movement. I do not know whether the same phenomenon exists there. Maybe it does. Maybe if a robot speaks body language, that will also be addictive.

But with speech, with text, it is definitely so. The responsibility is lower than for a very serious assistant, because the conversation can be informal, and if the interlocutor does not understand a line, it can change the subject and so on.

The experience of existing communication systems is that virtual interlocutors are addictive, but for now they also eventually get boring. This is because their completeness (recall) is too low. This needs a little more explanation. A virtual interlocutor is a recognition system, just like a search engine. But in a search engine precision is what matters, that is, the ranking within the top ten, and completeness does not matter at all, because the Internet always has 10 million pages about whatever you asked. With a virtual interlocutor it is exactly the opposite. Its precision is always 100%: if it understands the question, it always answers relevantly. But completeness is very important, because the variety of things a person can ask is so great that the interlocutor never manages to cover it. So you have to build concentric circles, from an exact answer out to a reasonable one.

Suppose an inf (a virtual interlocutor) is talking about bank loans (Nanosemantics really does have big banks among its clients). And then someone asks it about Putin: what should he do?

Clearly, the inf must recognize that this is politics. It should not have answers about Putin (that would be both expensive and wrong from the point of view of the business task). So the inf should say: "You know what, let's go back to your mortgage, I don't talk about politics." That is, if the answer cannot be exact, it must at least be reasonable. So you have to build a system of "reasonable answers" and topic recognition that covers all the large areas where you have no exact answer but do have a reasonable one.
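
The "concentric circles" can be sketched in a few lines of Python. This is only an illustration under assumptions, not a description of how real infs work: the keyword sets, the canned replies and the names `EXACT_ANSWERS`, `TOPIC_KEYWORDS`, `TOPIC_DEFLECTIONS` and `answer` are all hypothetical, and a real system would recognize topics with far richer templates than bare keywords.

```python
# Hypothetical toy data, for illustration only.
EXACT_ANSWERS = {
    "what is the mortgage rate": "Your current mortgage rate is shown in your account.",
}

TOPIC_KEYWORDS = {
    "politics": {"putin", "president", "election", "parliament"},
    "weather": {"weather", "rain", "snow"},
}

TOPIC_DEFLECTIONS = {
    "politics": "I don't talk about politics. Let's go back to your mortgage.",
    "weather": "I can't see the window from here. Shall we get back to your loan?",
}

GENERIC_FALLBACK = "I'm not sure I follow. Could we return to your loan?"


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def answer(question: str) -> tuple[str, str]:
    """Return (circle, reply), trying the innermost circle first."""
    q = normalize(question)

    # Circle 1: an exact answer exists for this question.
    if q in EXACT_ANSWERS:
        return "exact", EXACT_ANSWERS[q]

    # Circle 2: no exact answer, but the topic is recognized,
    # so a reasonable, topic-level reply is possible.
    words = set(q.split())
    for topic, keywords in TOPIC_KEYWORDS.items():
        if words & keywords:
            return "topic:" + topic, TOPIC_DEFLECTIONS[topic]

    # Circle 3: nothing recognized; steer back to the business task.
    return "fallback", GENERIC_FALLBACK


if __name__ == "__main__":
    for line in ["What is the mortgage rate?", "What should Putin do?", "Tell me a joke"]:
        print(line, "->", answer(line))
```

With this toy data, a question about Putin lands in the politics circle and is deflected back to the mortgage, while an unrecognized line falls through to the generic reply. Widening the outer circles is what raises completeness without touching the exact answers.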

Most likely, the virtual interlocutor should still have useful functions. I am talking about positioning: a pure "assistant" has to dig very deep into these useful functions, whereas a virtual interlocutor just needs some "useful things". It should be able to say what the weather will be tomorrow, what time it is, whether there are traffic jams, and so on. In general, my personal choice, when I began investing in robotics, is an interlocutor (we call them infs) with elements of an assistant.

Problems of developing a home interlocutor

What are the development problems? First, speech recognition on board. Most talking projects in fact rely on recognition from Nuance or Google, which requires network access. That usually means a 3-4 second delay, which is an impossible amount of time: no dialogue comes out of it. During that time the person gets nervous and asks the next question, which either goes completely unheard or confuses the interlocutor, which starts looking for an answer to it, or rather for its recognition...

In general, the situation with recognition in the world is very bad: almost everything has been captured by Nuance, a monstrous patent troll that devours everyone and then charges huge prices for all its services. Well, Yandex has now released its API, so let's see. There are a couple of companies in Russia that Nuance simply has not reached yet. I hope our state will protect us from it in the end.

Few people do surround (far-field) sound recognition. It is one thing when someone shouts "Google" straight into a smartphone, and quite another when you have to be recognized in a room from any direction and distance.
The essence of such an interlocutor is capturing the "air" channel, the voice channel, so that when you have just woken up, or have no screen at hand, you can simply ask aloud for what you need. What time is it? How is the weather? Do I have any email? Or something else.

But for you to be recognized, roughly speaking, either there has to be an interlocutor in each room, or there have to be "smart ears" of the interlocutor in each room while it itself stands in the kitchen. In any case, surround sound within a typical home range of 3-5 meters must be recognized, and this is a rather serious problem. It also needs to recognize who it is talking to, that is, identify a person by voice, however hoarse, address them by name and so on.
After all, a home robot deals with a family, as a rule, not just with a lonely geek who bought the thing to try it out.

Clearly, there should be variety in the communication: games, different topics. The interlocutor must be able to seize the initiative, so its linguistic intelligence has to be quite serious. We need a smart user model: the interlocutor must learn, it must find out more about its owners and remember things about them. Building a user model is a fairly serious piece of theory that few people have actually mastered. We think about it a lot ourselves, because how to remember the right things, forget the wrong ones and not overfit along the way is quite a serious problem.
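
One way to picture "remember the right, forget the wrong" is a memory in which repeated observations reinforce a fact and everything fades a little over time. The sketch below is an assumption made for illustration, not a description of any real user model; the class `UserModel` and its parameters `reinforce`, `decay` and `forget_below` are hypothetical.

```python
class UserModel:
    """Toy memory of facts about one user, with reinforcement and decay."""

    def __init__(self, reinforce=0.3, decay=0.9, forget_below=0.1):
        self.reinforce = reinforce        # how much one observation adds
        self.decay = decay                # multiplier applied on every tick
        self.forget_below = forget_below  # facts below this are dropped
        self.facts = {}                   # (attribute, value) -> confidence in [0, 1]

    def observe(self, attribute, value):
        """Move the confidence of a fact part of the way towards 1."""
        key = (attribute, value)
        old = self.facts.get(key, 0.0)
        self.facts[key] = old + self.reinforce * (1.0 - old)

    def tick(self):
        """One time step (say, a day): decay everything, drop weak facts."""
        decayed = {k: c * self.decay for k, c in self.facts.items()}
        self.facts = {k: c for k, c in decayed.items() if c >= self.forget_below}

    def best(self, attribute):
        """Most confident value for an attribute, or None if unknown."""
        candidates = [(c, v) for (a, v), c in self.facts.items() if a == attribute]
        return max(candidates)[1] if candidates else None


if __name__ == "__main__":
    model = UserModel()
    for _ in range(3):
        model.observe("name", "Vasya")   # heard several times
    model.observe("name", "Vasa")        # a one-off misrecognition
    for _ in range(12):
        model.tick()
    print(model.best("name"), model.facts)
```

A fact heard once fades away, while a fact heard repeatedly survives the same stretch of time. The partial update in `observe` is a crude guard against overfitting to a single noisy observation.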

Self-learning and self-updating I have already mentioned: of course, the interlocutor should download updates when it sees the network, but generally speaking it must work without the network, otherwise it makes no sense. Work over the Internet, or everything on board? From my point of view, everything should be on board, and the Internet is needed only to download updates.
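
The "everything on board, network only for updates" arrangement could look roughly like this. It is a minimal sketch under assumptions: the class `OnBoardInf` and the `fetch_update` callable are hypothetical, and real updates would carry models and templates rather than a dictionary of replies.

```python
class OnBoardInf:
    """Answers only from local data; the network is used for updates alone."""

    def __init__(self, answers, version=1):
        self.answers = dict(answers)
        self.version = version

    def reply(self, question):
        # No network round trip here, so no multi-second delay.
        key = question.strip().lower()
        return self.answers.get(key, "Let's talk about something else.")

    def try_update(self, fetch_update):
        """Apply an update if one can be fetched.

        fetch_update is a hypothetical callable that returns
        (version, answers) and raises OSError when offline.
        """
        try:
            version, answers = fetch_update()
        except OSError:
            return False              # offline: keep working as before
        if version <= self.version:
            return False              # nothing new
        self.answers.update(answers)
        self.version = version
        return True


if __name__ == "__main__":
    inf = OnBoardInf({"do i have email?": "No new email."})

    def offline():
        raise ConnectionError("no network")

    print(inf.try_update(offline), inf.reply("Do I have email?"))
```

The point of the design is that `reply` never waits on the network: a failed update changes nothing about how the interlocutor talks.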

Speech synthesis and intonation are, in fact, also not really solved problems. So far, robot voices are pretty lousy and monotonous. By the way, there is a very unpleasant effect here. If you have ever listened to news read by a synthesized voice, you may have noticed a very interesting effect on the human brain. I noticed it when we were doing synthesis for dictionaries and for news back in 1995-97. You listen to the news in a synthesized voice, and at some point you notice that you do not understand anything it is reading. You hear something like the patter of Russian speech, but you cannot understand it. The brain has suddenly said: "Ah, I see, you are deceiving me, this is not a person speaking, I refuse to understand it." This is a real effect. The same thing happens with letters. We synthesized the text of letters; we had a "Writer" that generated a letter. Same story: with some letters the eye kept slipping off and could not get into the text. The brain refuses: it senses something unnatural and refuses to understand. Therefore we need very good intonation, and effects that break the dialogue up into small pieces.

In short, the way of using all this, what a "home robot" actually is, has yet to take shape. Perhaps it will involve some kind of movement: driving around the house and so on. Perhaps it will need machine vision, at least to recognize the owners, or to notice that someone has entered the room and is silent, so as to start a conversation with them. Maybe it will need to recognize pets.

There should be butler functions: controlling appliances and all that. Governess functions: it is not at all difficult to build the simplest tutoring into an inf, so arithmetic or something like it will probably be there. Secretarial functions, for example: "Tell my wife that I will be home later, and to buy bread." An analogue of notes on the refrigerator should certainly be there. An answering machine and an alarm clock will probably have to be done anyway. But a physical housekeeper, it seems to me, is ridiculous. It will be so expensive that it is much easier to just hire a granny.

Entertaining, feeding and walking pets is likely to be a niche business. These will be special robots with no anthropomorphism and no real intelligence at all. They need intelligence at roughly an animal's level.

Sex is a very hot topic. I do not think it belongs to our niche or that it will spread widely. There will probably be some dolls with artificial intelligence in sex shops, as in a well-known joke that I will not risk telling here. Most likely it will be a niche of its own, and it certainly will not be inside this technological tornado.

As for anthropomorphism, it personally raises big questions for me. Maybe the same transfer of personality will occur, but, unlike text and even voices, robots that fake a human appearance so far cause repulsion and shock. It is just like a monkey: it is very similar to a human, yet not quite, and so it seems ugly. If we tried to take it for a person, it would be a shock. If we saw a person who looked very much like a monkey, it would produce a feeling of terrible ugliness. It seems to me that it will be the same with robots. In general, I am personally against anthropomorphism; it seems to me a dead-end branch.

Wait or propose

That is all I wanted to say. We are waiting for this way of use to appear: someone has to come up with it and impose it on everyone. It should be either a very charismatic person or a person with huge amounts of money for advertising. And then everyone will line up in a wedge behind them, produce roughly the same thing and compete within that paradigm.

The existing efforts in robotics will not be lost: they will be used either in niche applications or in the mainstream of development, the central "trunk". Qualifications and experience will not be lost either. As they say, "no work is ever wasted."