CalltouchForever September 4, 2017 at 12:04

The rakes we stepped on launching Calltouch Predict: 365 days of speech recognition and machine learning

The call-tracking market switched long ago from a “pay per call” model to a “pay per call that leads to a sale” model. In the automotive business, these are calls to the sales department; in real estate, calls from prospective new customers; in medicine, first-time patient appointments; and so on.

Identifying targeted calls matters because in these industries up to 70% of incoming calls are of no use to a marketer tuning advertising: they come from current or repeat customers, employees, spammers, and so on. Judging by the total number of calls, an advertiser may consider a source effective when in fact it brings no new customers. To optimize advertising costs, calls have to be sorted and tagged into those that lead to sales and those that do not. The company therefore has a choice: put the manual work of tagging calls on the shoulders of operators or salespeople, or use neural networks and other machine learning methods.

In mid-2016 we were the first on the market to launch a technology that automatically determines the quality and outcome of a phone call. Calltouch Predict is built on the Yandex SpeechKit speech recognition system and the company's own algorithms. We heard a good deal of skepticism that it would not work.

A year on, we are ready to name the things without which call tagging really will not work effectively:

Recognition accuracy not lower than 90%

It may seem obvious, but automatic determination of call quality and outcome exists to remove the human factor from the operator's work with the CRM. If the accuracy is, say, 80%, there is no point in deploying such a system; you may as well settle for the work of contact center employees, since 2 errors per 10 calls is a typical manual error rate. The technology has to classify calls more accurately than that. What does this take? It is simple: the more training data the model has, the more accurate the speech recognition and call classification. Transcription specialists have to spend time manually transcribing at least a million calls. This builds up a large corpus and improves the language model, driving the word error rate (the share of speech recognition errors) down to a minimum. Our customers accept 90% as the lower limit, so the system now does not even start tagging automatically if accuracy is below that threshold.
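To make the word error rate concrete, here is a minimal sketch of how it can be computed from a manual transcript and a recognized one, using word-level edit distance. The function names and the idea of treating accuracy as 1 minus WER are assumptions for illustration; the post does not say how Calltouch Predict measures its 90% threshold.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


MIN_ACCURACY = 0.90  # the 90% lower limit mentioned in the text


def can_auto_tag(reference: str, hypothesis: str) -> bool:
    """Assumption: accuracy is taken as 1 - WER on a manually transcribed sample."""
    return 1.0 - word_error_rate(reference, hypothesis) >= MIN_ACCURACY
```

In practice such a check would be run over a whole validation set of manually transcribed calls rather than a single pair of strings.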

Normalizer

The product needs a component that reduces all word forms (declensions, conjugations, and so on) to a single canonical form, so that it can search for uniform dictionary constructions. It is a kind of language normalizer, and it greatly improves call classification accuracy.
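The idea can be sketched as follows, assuming a tiny hand-written lemma table. The table, the function names, and the English examples are hypothetical; a production system would rely on a morphological analyzer for the target language rather than a fixed dictionary.

```python
import re

# Hypothetical lemma table for illustration only.
LEMMAS = {
    "booked": "book",
    "booking": "book",
    "books": "book",
    "apartments": "apartment",
    "drives": "drive",
    "driving": "drive",
}


def normalize(text: str) -> list[str]:
    """Lowercase, tokenize, and map every known word form to its lemma."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [LEMMAS.get(token, token) for token in tokens]


def contains_phrase(text: str, phrase: str) -> bool:
    """Check whether the normalized phrase occurs as a contiguous run in the text."""
    words, target = normalize(text), normalize(phrase)
    return any(
        words[i:i + len(target)] == target
        for i in range(len(words) - len(target) + 1)
    )
```

With normalization in place, one dictionary entry such as "book a test drive" matches "booking a test drive" and "booked a test drive" alike, which is what makes the phrase dictionaries manageable.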

Optimizer

On its own, automatic recognition and tagging only yields a data set about calls, which still has to be applied when optimizing advertising campaigns. It makes sense that once the system has tagged the calls automatically, the contextual advertising optimizer pulls in the calls by their tags, and that all of this happens inside one system rather than being stitched together by an employee across different services. Why this matters: for automatic bid management, targeted-call data is worth more than sales data. A large set of targeted calls that could have led to a sale is more useful than a small sample of deals actually closed, because a quality lead is often handled poorly by the sales department.

Example of a 30-day test of Predict working in tandem with a contextual advertising optimizer from developer X:
Number of unique calls (calls from new numbers): 510
Number of unique-targeted calls (unique calls that meet a set duration threshold): 410
Number of calls with the automatic tag “target”: 360

So the share of truly targeted calls among unique calls is 71%, and among unique-targeted calls it is 87%. In other words, 29% of unique calls and 13% of unique-targeted calls are not actually targeted. With Predict we determined the share of truly targeted calls, and then with the optimizer we reduced CPA by 48% and increased CR by 115%.
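The percentages follow directly from the three counts in the test. A short check, with variable names chosen here for illustration:

```python
unique_calls = 510           # calls from new numbers
unique_targeted_calls = 410  # unique calls counted as targeted before Predict
auto_tagged_target = 360     # calls that Predict tagged "target"

share_of_unique = auto_tagged_target / unique_calls
share_of_unique_targeted = auto_tagged_target / unique_targeted_calls

print(f"{share_of_unique:.1%}")           # 70.6%
print(f"{share_of_unique_targeted:.1%}")  # 87.8%
```

The first ratio rounds to the 71% quoted in the text; the second evaluates to about 87.8%, which the text reports as 87%.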

Antifraud

We cannot ignore the fact that some industries have a high level of fraud: a call that appears to lead to a sale (a house viewing, a test drive booking) does not always lead to a deal, because it may be fraud from a dishonest advertising platform. At the moment of the call, even if we automatically tag it “deal”, the client cannot be sure it is not a scripted fraud call. Market segments with a high share of fraud include, for example, real estate sales (9%). For such companies tagging works best in the chain “incoming call - call quality determination system - antifraud - tagging - optimizer”.

Identification of negativity

Customers of the product found that they needed to monitor the quality of work of employees who talk to clients. To eliminate manual labor, the automatic tagging system should put such calls into a separate category. Commonly used negatively colored expressions are detected with machine learning techniques such as latent semantic analysis, support vector machines, and semantic orientation. Beyond those, you can extend the vocabulary with lexical units that signal risk for customers, for example: court, FAS, Rospotrebnadzor, and so on.
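The vocabulary extension is the simplest part to illustrate. Below is a minimal sketch that flags a transcript when it contains any of the risk terms listed above. The tag names and function names are hypothetical, and this plain lookup stands in for only the dictionary part, not for the machine learning methods mentioned in the text.

```python
import re

# Risk vocabulary taken from the examples in the text; a real deployment
# would extend it and match normalized word forms rather than exact tokens.
RISK_TERMS = {"court", "fas", "rospotrebnadzor"}


def find_risk_terms(transcript: str) -> set[str]:
    """Return the risk terms that occur as whole words in the transcript."""
    tokens = set(re.findall(r"\w+", transcript.lower()))
    return tokens & RISK_TERMS


def tag_call(transcript: str) -> str:
    """Hypothetical tag names: 'negative' if any risk term is present."""
    return "negative" if find_risk_terms(transcript) else "neutral"
```

Calls tagged this way can then be routed to the separate category for review by a supervisor.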

Being the first to introduce new solutions is quite hard: however many rakes there are to step on, they are all yours. It took us a year to collect the first five. I think there are still many discoveries ahead :-)
