Automating answers to frequently asked questions in an Alice skill with the DeepPavlov library
For over a year now, the Laboratory of Neural Systems and Deep Learning at MIPT has been developing DeepPavlov, an open-source library for building dialogue systems. It contains a set of pretrained components for language analysis that can be used to solve business problems effectively.
One example is answering customers' frequently asked questions. Doing this through a call center, a website widget or a social network by hiring staff is straightforward. The real challenge is to automate the process so that it runs with minimal errors and through a convenient user interface, for example the voice assistant "Alice" from Yandex.
In this article we show how to answer FAQs effectively using natural language processing and how to integrate the solution into Alice.
Classification of texts and how to do it
Create a question-answer skill based on the DeepPavlov library
Install the DeepPavlov library
Launch the skill on "Alice"
Conclusion
Classification of texts and how to do it
The task of finding, in a ready-made set of question-answer pairs, the question closest to a given one is solved by algorithms for semantic similarity or text classification.
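To make the idea of semantic similarity concrete, here is a minimal sketch of matching a query against FAQ questions with TF-IDF vectors and cosine similarity. It uses only the standard library; the FAQ entries and the function names (`tokenize`, `build_idf`, `vectorize`, `cosine`, `answer`) are made up for illustration and are not part of DeepPavlov, whose components are considerably more capable.

```python
import math
import re
from collections import Counter

# Hypothetical FAQ entries, used only for illustration.
FAQ = [
    ("Where will the school take place?", "The school will take place in Moscow."),
    ("How much does participation cost?", "Participation is free of charge."),
    ("When is the application deadline?", "Applications close on the first of June."),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_idf(documents):
    """Compute a smoothed inverse document frequency for every known token."""
    n_docs = len(documents)
    doc_freq = Counter(token for doc in documents for token in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1 for tok, df in doc_freq.items()}


def vectorize(tokens, idf):
    """Turn a token list into a sparse TF-IDF vector (unknown tokens are dropped)."""
    counts = Counter(tok for tok in tokens if tok in idf)
    return {tok: count * idf[tok] for tok, count in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def answer(query, faq=FAQ):
    """Return the answer of the closest FAQ question and its similarity score."""
    questions = [tokenize(q) for q, _ in faq]
    idf = build_idf(questions)
    query_vec = vectorize(tokenize(query), idf)
    scores = [cosine(query_vec, vectorize(q, idf)) for q in questions]
    best = max(range(len(faq)), key=scores.__getitem__)
    return faq[best][1], scores[best]
```

Calling `answer("where is the school?")` picks the first entry, because it shares the most informative words with the query. The score can serve as a rough confidence value: a query with no known words gets a score of 0.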
There are two ways to solve this problem in production: you can hire an in-house NLP specialist, or you can outsource the solution.
Both options share the same drawbacks: 1) the need to collect data, 2) endless iterations of model training and quality measurement, 3) high demands on developer expertise. Even integrating a ready-made language processing solution is not easy, let alone building one from scratch. Foreign cloud platforms (Google Assistant or Microsoft Cortana) offer comprehensive text classification solutions (DialogFlow, Azure Bot Service), but they still have problems with scaling, dependence on paid API services and support for the Russian language.
Fortunately, there is an alternative: an open-source library that greatly simplifies building a solution for answering FAQs in Russian and integrating it into a voice assistant.
Creating a question-and-answer skill based on the DeepPavlov library
DeepPavlov is just such a library. It contains a set of pretrained components for language analysis, including text classification components. You can read more about the various components of DeepPavlov in the documentation.
Working with DeepPavlov does not require special skills from the developer. The library is free and offers ample opportunities for fine-tuning.
All instructions for creating a skill based on a knowledge base can be found in the tutorial. We recommend copying the code from the tutorial into a separate script and running the skill from that script.
Install the DeepPavlov library
To get started, install Python 3.6 and activate the development environment. Then install DeepPavlov.
source activate py36
pip install -q deeppavlov
Skill development
A skill in DeepPavlov is an entity that has a unified input and output format regardless of its functionality (text classification, open-domain question answering, etc.). Skills are designed to be combined into a single stack of a simple dialogue system which, on receiving a request, takes the answer from the skill with the highest confidence.
Create an object of the SimilarityMatchingSkill class that responds to a user request based on a list of frequently asked questions.
from deeppavlov.contrib.skills.similarity_matching_skill import SimilarityMatchingSkill
faq = SimilarityMatchingSkill(data_path='http://files.deeppavlov.ai/faq/dataset_ru.csv',
                              x_col_name='Question',
                              y_col_name='Answer',
                              save_load_path='./model',
                              config_type='tfidf_autofaq',
                              edit_dict={},
                              train=True)
The object of the SimilarityMatchingSkill class has the following parameters:
- data_path - path to the csv data file (comma delimiter)
- x_col_name - name of the column with questions in the csv file (default: Question)
- y_col_name - name of the column with answers in the csv file (default: Answer)
- config_type - name of the configuration you want to use for classification (the documentation lists all available configurations)
- edit_dict - `dict` with parameters to be rewritten in the configuration of a specific config_type
- save_load_path - the path where to save the trained model
- train - whether to train the model
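To illustrate the data layout these parameters describe, the sketch below parses a small comma-delimited FAQ file with the standard `csv` module. The sample rows and the helper `load_faq` are hypothetical; only the column names `Question` and `Answer` come from the parameter list above. It is not how the library reads the file internally.

```python
import csv
import io

# Hypothetical FAQ data in the expected layout: comma-delimited,
# with a header row naming the question and answer columns.
SAMPLE_CSV = """Question,Answer
Where will the school take place?,The school will take place in Moscow.
"How much does it cost, and who pays?",Participation is free of charge.
"""


def load_faq(csv_text, x_col_name="Question", y_col_name="Answer"):
    """Parse CSV text and return a list of (question, answer) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row[x_col_name], row[y_col_name]) for row in reader]


pairs = load_faq(SAMPLE_CSV)
```

Note that a question or answer containing a comma has to be wrapped in double quotes, as in the second row, otherwise it would be split into extra columns.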
Once the model is trained, you can load it with the following command:
faq = SimilarityMatchingSkill(save_load_path='./model')
The SimilarityMatchingSkill class simplifies access to text classification components. If you want to change part of the configuration, you can do so through the edit_dict parameter. An object of the SimilarityMatchingSkill class (like any skill) takes three arguments: a list of sentences to classify, a list of query histories and a list of states (for SimilarityMatchingSkill, the last two can be empty lists).
faq(['где будет школа?'], [], [])  # "Where will the school be?"
A dialogue system typically contains several skills. To demonstrate how multiple skills work together, we will create several skills of the PatternMatchingSkill class.
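from deeppavlov.skills.pattern_matching_skill import PatternMatchingSkill
# Greeting skill: replies "Hi" / "Greetings" to "Hi" / "Hello"
hello = PatternMatchingSkill(responses=['Привет', 'Приветствую'], patterns=['Привет', 'Здравствуйте'])
# Farewell skill: replies "Bye" / "All the best" to "Bye" / "Goodbye"
bye = PatternMatchingSkill(responses=['Пока', 'Всего доброго'], patterns=['Пока', 'До свидания'])
# Fallback skill: no patterns, always replies "Please rephrase" with low confidence
fallback = PatternMatchingSkill(responses=['Пожалуйста перефразируйте'], default_confidence=0.3)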
PatternMatchingSkill is a simple skill class that is triggered when a user request matches one of the elements of the patterns list. It replies with a random element of the responses list and reports the confidence given by default_confidence. You can set the default_confidence parameter manually to prioritize the responses of different skills.
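The behaviour described above can be sketched in a few lines of plain Python. The class `SimplePatternSkill` below is a hypothetical stand-in written for this article, not the library's implementation; it assumes case-insensitive substring matching and that a skill without patterns answers every request.

```python
import random


class SimplePatternSkill:
    """Illustrative stand-in for a pattern-matching skill (not the library class)."""

    def __init__(self, responses, patterns=None, default_confidence=1.0):
        self.responses = responses
        # Patterns are compared in lowercase; None means "answer everything".
        self.patterns = [p.lower() for p in patterns] if patterns else None
        self.default_confidence = default_confidence

    def __call__(self, utterances, history=None, states=None):
        """Return one response and one confidence per input utterance."""
        responses, confidences = [], []
        for utterance in utterances:
            matched = self.patterns is None or any(
                pattern in utterance.lower() for pattern in self.patterns
            )
            responses.append(random.choice(self.responses))
            confidences.append(self.default_confidence if matched else 0.0)
        return responses, confidences


hello = SimplePatternSkill(responses=["Hi", "Greetings"], patterns=["hi", "hello"])
fallback = SimplePatternSkill(responses=["Please rephrase"], default_confidence=0.3)
```

In this sketch a non-matching request still produces a response, but with confidence 0, so it can never win against the fallback skill with its confidence of 0.3. Substring matching is deliberately naive here: the pattern "hi" would also match the word "this".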
The last step is to combine the skills into an agent and configure how the agent chooses between them. The `HighestConfidenceSelector` selector returns the response of the skill that reports the highest confidence.
from deeppavlov.agents.default_agent.default_agent import DefaultAgent
from deeppavlov.agents.processors.highest_confidence_selector import HighestConfidenceSelector
agent = DefaultAgent([hello, bye, faq, fallback], skills_selector=HighestConfidenceSelector())
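As a rough illustration of what highest-confidence selection means, the sketch below picks, for each utterance, the response of the most confident skill. The function `select_best` and the two toy skills are hypothetical; we assume, as in the examples above, that every skill returns a list of responses and a list of confidences of the same length as the input.

```python
def select_best(skills, utterances, history=None, states=None):
    """For each utterance, return the response of the most confident skill."""
    # Every skill returns (responses, confidences), one entry per utterance.
    outputs = [skill(utterances, history or [], states or []) for skill in skills]
    best_responses = []
    for i in range(len(utterances)):
        responses, _ = max(outputs, key=lambda out: out[1][i])
        best_responses.append(responses[i])
    return best_responses


# Hypothetical skills written as plain functions for the sake of the example.
def greeting_skill(utterances, history, states):
    hits = ["hello" in u.lower() for u in utterances]
    return ["Hi!"] * len(utterances), [1.0 if hit else 0.0 for hit in hits]


def fallback_skill(utterances, history, states):
    return ["Please rephrase"] * len(utterances), [0.3] * len(utterances)


replies = select_best([greeting_skill, fallback_skill], ["Hello", "What?"])
```

Here `replies` contains the greeting for the first utterance and the fallback text for the second. When two skills report the same confidence, this sketch keeps the one listed first, so the order of the skill list acts as a tie-breaker.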
Next, start the server, specifying the request path `endpoint='/faq'` and the connection port `port=5000`.
from deeppavlov.utils.alice import start_agent_server
start_agent_server(agent, host='0.0.0.0', port=5000, endpoint='/faq')
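The server started above handles the exchange with Alice for you, but it can help to see roughly what travels over the webhook. The sketch below assumes the basic field names of the Yandex.Dialogs webhook protocol as we understand it (`request.command`, `session`, `version`, `response.text`, `response.end_session`); check them against the current platform documentation before relying on them. The helper `build_alice_response` and all sample values are hypothetical.

```python
def build_alice_response(request_body, reply_text):
    """Wrap a reply in the JSON structure that Yandex.Dialogs expects back."""
    session = request_body["session"]
    return {
        "version": request_body["version"],
        # Echo the session identifiers so the platform can match the reply.
        "session": {
            "session_id": session["session_id"],
            "message_id": session["message_id"],
            "user_id": session["user_id"],
        },
        "response": {"text": reply_text, "end_session": False},
    }


# Trimmed-down example of an incoming webhook request (values are made up).
incoming = {
    "version": "1.0",
    "session": {"session_id": "abc", "message_id": 0, "user_id": "user-1", "new": True},
    "request": {
        "command": "where will the school be",
        "original_utterance": "Where will the school be?",
    },
}

user_text = incoming["request"]["command"]
outgoing = build_alice_response(incoming, "The school will take place in Moscow.")
```

The text in `user_text` is what would be passed to the agent, and the agent's reply goes into the `text` field of the response.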
Please note that Yandex.Dialogs requires the Webhook URL to point to a server with an external IP address that is accessible over https. For quick prototyping you can use Ngrok, which creates a tunnel to a DeepPavlov server running on your local network. To do this, run
ngrok http 5000
on the server running DeepPavlov. This creates two tunnels, one for http and one for https. Copy the https tunnel address and append the /faq endpoint to it; the resulting link is the Webhook URL for our Yandex.Dialogs skill.
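Assembling the link is simple, but it is easy to paste the http address by mistake or end up with a double slash. A small helper along these lines can guard against both; the function `make_webhook_url` and the example tunnel address are hypothetical.

```python
def make_webhook_url(tunnel_address, endpoint="/faq"):
    """Join an https tunnel address and an endpoint into a Webhook URL."""
    if not tunnel_address.startswith("https://"):
        raise ValueError("Yandex.Dialogs requires an https address")
    # Normalize slashes so that exactly one separates the two parts.
    return tunnel_address.rstrip("/") + "/" + endpoint.lstrip("/")


# Placeholder address; use the https address printed by ngrok instead.
webhook_url = make_webhook_url("https://example.ngrok.io/")
```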
Running a skill on Alice
To test the interaction with the Yandex.Dialogs platform, go to dialogs.yandex.ru/developer and create a new dialogue. Set a unique name and an activation name. For the Webhook URL, specify the link obtained earlier. Save the changes. To interact with the skill, go to the Test tab.
Conclusion
Now you know how to use text classification models from the DeepPavlov library to create a question-answer bot, and how to quickly prototype skills with DeepPavlov and connect them to Alice.
By the way, our library also implements interfaces for connecting to Amazon Alexa and the Microsoft Bot Framework.
We welcome feedback in the comments, and you can ask any questions about DeepPavlov on our forum.