Alice will help developers find objects in user requests: NER in Dialogs

    In the spring, we launched the Dialogs platform, which lets developers create skills for Alice and handle voice requests from users. Initially, skill developers had to parse requests on their own, for example to find an address in the text. Now the platform takes over this part of the work.

    Today we will tell Habr readers about named entity recognition (NER) and the new opportunities it opens up for skill developers.

    We believe that the future belongs to voice interfaces. Already, in many cases, users prefer voice to an on-screen keyboard: when driving, when looking for quick answers to simple questions, or when playing the word game “Cities” while lying on the couch. But for such scenarios to become more common, simply converting speech to text is not enough.

    Voice interfaces are similar to search engine queries. We do not always know exactly how to phrase a query to find what we need. At the dawn of the Internet this was a big problem, because search engines looked only for exact occurrences of the query words. The same is true for voice: if we do not know which command is expected of us, we will spend a long time guessing.

    A good voice interface should not lead a person into a dead end. Skill developers usually solve this with two tricks. First, the expected answers are suggested with buttons on the screen. This is a useful practice that we recommend not ignoring.

    Second, skill creators try to break complex questions into a series of simpler ones whose answers are easier to predict. In some cases, users are even required to say words strictly in a certain form and case. The problem with this approach is that it can no longer be called natural communication. The more conventions and restrictions there are, the less voice control differs from using a keyboard and buttons. Ideally, the user should talk to the service as freely as to a person.

    It is good when a user can say “Thank you! Deliver the order to Lev Tolstoy 16 and tell Sergey Sergeyev” instead of answering a series of questions about the street, house number, first name and last name. But this requires the skill developer to parse the received responses further. Operators could do this manually, but with a large flow of requests you would need a lot of them. Operators are also unlikely to keep up in real time, so the skill loses the chance to ask for missing information right away. Alternatively, you can build a technology that automatically finds important information in the text, classifies it, normalizes it and saves it. But this is a fairly time-consuming task.

    To efficiently extract useful entities from text and correctly classify them by type, a service needs expertise in two areas. First, you need to be able to collect knowledge about which objects exist. If Lev Tolstoy Street is not in your “dictionary”, then when processing a request it is easy to mistake it for a person's name and miss the address. Second, it is equally important to be able to find these objects in the raw text from the user. At a minimum, you need to take the morphology of the Russian language into account, so that an inflected form of a name (as in “to Sergey”) is found and reduced to the base form “Sergey”.
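
    The two ingredients described above, a dictionary of known objects and a mapping from inflected forms to base forms, can be illustrated with a toy sketch. Everything in it is hypothetical: the hand-written `LEMMAS` and `STREETS` tables, the sample utterance, and the rule that a street name followed by a number is an address. It is not how the platform is implemented, only a minimal illustration of the idea.

```python
import re

# Hypothetical toy data: a real system would use large dictionaries and a
# proper morphological analyzer instead of these hand-written tables.
LEMMAS = {
    "сергею": "сергей",  # dative -> nominative
    "сергея": "сергей",  # genitive -> nominative
    "сергей": "сергей",
}
STREETS = {"льва толстого"}  # street names as they are said in addresses


def find_entities(text):
    """Return a list of (type, value) pairs found in the text."""
    tokens = re.findall(r"\w+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        pair = " ".join(tokens[i:i + 2])
        # A known street name followed by a number is treated as an address,
        # even though the same words could also be a person's name.
        if pair in STREETS and i + 2 < len(tokens) and tokens[i + 2].isdigit():
            found.append(("street", (pair, tokens[i + 2])))
            i += 3
            continue
        # A known inflected form is reduced to its base form.
        if tokens[i] in LEMMAS:
            found.append(("first_name", LEMMAS[tokens[i]]))
        i += 1
    return found


print(find_entities("Доставьте заказ на Льва Толстого 16 и передайте Сергею"))
# [('street', ('льва толстого', '16')), ('first_name', 'сергей')]
```

    Without the `STREETS` entry the address would simply be missed, which is the failure mode the paragraph above warns about.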

    It so happens that Yandex has a lot of experience in both areas. Search technologies are used both to discover new objects and to parse user queries. Now these technologies are available to developers of skills for Alice.

    Named entity recognition in Dialogs

    After a user says a command, our platform converts it to text and extracts the words and phrases that describe certain objects. At the moment, Dialogs recognizes:

    - names;
    - locations;
    - dates and times;
    - integer and fractional numbers.

    Information about the recognized objects is sent to the skill server along with the text of the user's reply. Consider an example:

    “Order pizza to Leo Tolstoy 16 for Sergey Sergeev at 10 pm”

    Our platform knows that Leo Tolstoy is not only a person but also a street. It also takes into account that house numbers often follow street names in addresses. Therefore, the request sent to the skill will contain the following block:

    {
      "type": "YANDEX.GEO",
      "value": {
        "house_number": "16",
        "street": "льва толстого"
      }
    }

    A location can include not only a street and a house number, but also a city, a country or even an airport.
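
    On the skill side, such a value can be turned into a single address string. The sketch below is only an illustration: `street` and `house_number` come from the example above, while the key names `country`, `city` and `airport` and the helper `format_address` are assumptions that should be checked against the documentation.

```python
def format_address(geo_value):
    """Join the parts of a YANDEX.GEO value into one display string."""
    # "street" and "house_number" appear in the example block; the other
    # keys are assumed names for the country, city and airport parts.
    order = ["country", "city", "airport", "street", "house_number"]
    parts = [str(geo_value[key]) for key in order if geo_value.get(key)]
    return ", ".join(parts)


value = {"house_number": "16", "street": "льва толстого"}
print(format_address(value))  # льва толстого, 16
```

    Missing parts are skipped, so the skill can also use the result to decide which part of the address it still has to ask about.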

    Names work in much the same way. The platform can find the first name, last name and patronymic and convert them to the nominative case.

    {
      "type": "YANDEX.FIO",
      "value": {
        "first_name": "сергей",
        "last_name": "сергеев"
      }
    }

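    Since the values arrive in lowercase, a skill will usually want to capitalize them before showing them to the user. A minimal sketch, assuming the key `patronymic_name` for the patronymic (only `first_name` and `last_name` appear in the example above) and a hypothetical helper `format_full_name`:

```python
def format_full_name(fio_value):
    """Build a display name from a YANDEX.FIO value."""
    # "first_name" and "last_name" come from the example block;
    # "patronymic_name" is an assumed key for the patronymic.
    order = ["first_name", "patronymic_name", "last_name"]
    parts = [fio_value[key].capitalize() for key in order if fio_value.get(key)]
    return " ".join(parts)


print(format_full_name({"first_name": "сергей", "last_name": "сергеев"}))
# Сергей Сергеев
```

    Note that `str.capitalize` only uppercases the first letter, so hyphenated names would need extra handling.
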
    Data normalization is an important part of named entity recognition. For addresses and names this property is not very conspicuous, but with dates and times it is much more obvious. “10 pm” automatically turns into 22. “Tomorrow” and “the day after tomorrow” turn into explicit date offsets.

    {
      "type": "YANDEX.DATETIME",
      "value": {
        "hour_is_relative": false,
        "hour": 22
      }
    }

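    To use such a value, the skill has to combine it with the current moment. The sketch below shows one possible way to do that. It assumes that every field follows the same pattern as `hour` and `hour_is_relative` in the example (for instance `day` and `day_is_relative`); these extra key names and the helper `resolve_datetime` are assumptions, and relative months and years are deliberately left out.

```python
from datetime import datetime, timedelta

FIELDS = ("year", "month", "day", "hour", "minute")
OFFSETS = {"day": "days", "hour": "hours", "minute": "minutes"}


def resolve_datetime(value, now):
    """Turn a YANDEX.DATETIME value into a concrete datetime."""
    absolute = {}
    offset = timedelta()
    for field in FIELDS:
        if field not in value:
            continue
        if value.get(field + "_is_relative", False):
            # Relative values are offsets from the current moment.
            # Relative years and months are not handled in this sketch.
            if field in OFFSETS:
                offset += timedelta(**{OFFSETS[field]: value[field]})
        else:
            # Absolute values overwrite the matching part of "now".
            absolute[field] = value[field]
    # An absolute hour without minutes is read as "on the hour".
    if "hour" in absolute and "minute" not in value:
        absolute["minute"] = 0
    return now.replace(second=0, microsecond=0, **absolute) + offset


now = datetime(2024, 1, 15, 15, 30)
print(resolve_datetime({"hour_is_relative": False, "hour": 22}, now))
# 2024-01-15 22:00:00
print(resolve_datetime({"day": 1, "day_is_relative": True}, now))
# 2024-01-16 15:30:00
```
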
    Number recognition should not be underestimated either. For example, “four point five” in the user's text will turn into:

    {
      "type": "YANDEX.NUMBER",
      "value": 4.5
    }

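    A skill server will typically want to collect all recognized entities from a request in one pass. The sketch below groups values by type. The article only shows individual blocks, so the path `request` -> `nlu` -> `entities` used here is an assumption about the request layout, and `entities_by_type` is a hypothetical helper; check the documentation for the actual format.

```python
from collections import defaultdict


def entities_by_type(payload):
    """Group entity values from an incoming request by their type."""
    # Assumed layout: entity blocks arrive as a list under
    # request -> nlu -> entities. Adjust the path to the real payload.
    entities = payload.get("request", {}).get("nlu", {}).get("entities", [])
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["type"]].append(entity["value"])
    return dict(grouped)


payload = {
    "request": {
        "nlu": {
            "entities": [
                {"type": "YANDEX.GEO",
                 "value": {"house_number": "16", "street": "льва толстого"}},
                {"type": "YANDEX.FIO",
                 "value": {"first_name": "сергей", "last_name": "сергеев"}},
                {"type": "YANDEX.NUMBER", "value": 4.5},
            ]
        }
    }
}

grouped = entities_by_type(payload)
print(grouped["YANDEX.NUMBER"])  # [4.5]
print(sorted(grouped))  # ['YANDEX.FIO', 'YANDEX.GEO', 'YANDEX.NUMBER']
```
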
    Interested? Take a look at the documentation. If you still have questions, you are welcome in our Telegram chat. To keep up with other platform news, follow the blog.
