Voice “fingerprints” now officially work (and what the implementation process at Priorbank looks like)

Published on August 11, 2016

    “But don’t you get bored there at the bank, answering anonymous questions?”
    “No, Vladimir Petrovich, not one bit.”

    One of the largest commercial banks in Belarus, Priorbank, a member of the Austrian Raiffeisen group, uses customers’ voice standards (or, as they are often called, voice “fingerprints”) to confirm their identity when they call the bank. This is only the second case in Russia and the CIS in which a bank has officially announced the use of this technology.

    We have already written about voice biometrics: the ability to recognize a caller and establish his identity, for example when a subscriber calls a contact center, even if he is using a different phone or pretending to be someone else (which matters for antifraud). Here I’ll describe what implementing voice biometrics involves, using Priorbank as an example.

    Why it is needed

    The number of calls to the bank’s contact center that require personal identification is growing. In November-December 2007, 46% of calls involved a request for personal information; in the same period of 2015 the share had risen to 68% of all calls. Voice authentication technology significantly reduces customer service time.


    In voice biometrics, there are a couple of concepts that are often confused. Let's sort them out. Voice identification lets you determine that a person calling the contact center from an unfamiliar number may be Vasya Petrov. During identification, one voice sample is compared against the many samples in the contact center’s voice database.

    Voice verification (aka authentication) lets you confirm a person’s identity over the phone, that is, conclude with a certain degree of probability that the caller claiming to be Vasya Petrov really is Vasya Petrov. During verification, two voice samples are compared: the voice of the person whose identity is being checked and the voice stored in the system’s database, whose owner’s identity has already been confirmed.
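
    The difference between the two modes can be sketched in a few lines of Python. This is only an illustration: the article does not describe how the vendor’s platform represents or compares voices, so the fixed-length feature vectors, the cosine similarity measure, the 0.8 threshold and the names `verify` and `identify` are all assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed voice representation)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def verify(sample, claimed_standard, threshold=0.8):
    """Verification (1:1): compare the sample with one stored standard."""
    return cosine(sample, claimed_standard) >= threshold

def identify(sample, database, threshold=0.8):
    """Identification (1:N): compare the sample with every stored standard.

    Returns the best-matching client id, or None if nothing reaches the threshold.
    """
    best_id, best_score = None, threshold
    for client_id, standard in database.items():
        score = cosine(sample, standard)
        if score >= best_score:
            best_id, best_score = client_id, score
    return best_id

# Hypothetical toy data: real systems use far richer voice models.
database = {
    "vasya_petrov": [0.9, 0.1, 0.3],
    "other_client": [0.1, 0.8, 0.5],
}
sample = [0.85, 0.15, 0.35]

print(identify(sample, database))                # who might this be?
print(verify(sample, database["vasya_petrov"]))  # is this really Vasya Petrov?
```

    Verification costs one comparison per call, while identification grows with the size of the database, which is one reason the two are kept apart.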

    All biometric systems are probabilistic, and none of the existing systems can guarantee the absence of errors.


    Errors come in two kinds: falsely accepting an “alien” (False Acceptance Rate, FAR) and falsely rejecting one of your “own” (False Rejection Rate, FRR). These errors are interrelated, and tuning a biometric system means finding a compromise between the two error levels that best suits the task.
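
    The trade-off is easy to see on a toy example. The sketch below assumes the system produces a similarity score per comparison and accepts the caller when the score reaches a threshold; the scores and the `error_rates` helper are made up for illustration and are not data from Priorbank.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a given decision threshold.

    FAR: share of impostor attempts that were accepted.
    FRR: share of genuine attempts that were rejected.
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr

# Hypothetical similarity scores for "own" and "alien" callers.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.30, 0.55, 0.72, 0.41, 0.20]

for threshold in (0.5, 0.7, 0.9):
    far, frr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold:.1f}  FAR={far:.0%}  FRR={frr:.0%}")
```

    Raising the threshold pushes FAR down and FRR up; lowering it does the opposite.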

    Voice biometrics can be used in a variety of scenarios: verifying users during a conversation with a contact center operator, automatic voice verification in an IVR, granting access to a mobile application (together with other types of biometrics, such as fingerprint or face), identifying fraudsters by voice (antifraud), and so on.

    The project to introduce voice biometrics at Priorbank was carried out during 2015. The key task was to confirm the client’s identity over the phone during a conversation with the operator and to reduce verification time, and as a result to improve the quality of remote customer service and strengthen the protection of banking information.

    Each month, Priorbank's contact center operators handle tens of thousands of calls, and more than half of them are requests for information on accounts and transactions. Typical questions: “why doesn’t my card work”, “how much do I owe on my loan”, “why won’t the transaction go through” and so on. These questions cannot be answered without confirming the client’s identity: by law, account information is provided only to the account owner.

    Third parties, rather than account holders, have repeatedly tried to obtain such information. That is why anyone who calls the bank is always asked for passport details, card number, mother’s maiden name and so on. Going through these questions takes an average of 30-40 seconds per client.

    For the voice biometrics project, the bank chose a solution from the Center for Speech Technologies, since the Russian developer’s biometric platform had already proved effective at verifying the identity of bank customers. Last summer, a solution based on it was introduced in the mobile application of Wells Fargo, the world's largest bank by market capitalization.

    How the system works at Priorbank

    • With each call (incoming or outgoing), as soon as the conversation with the operator begins, a check of the caller starts in the background and data on his voice is collected.
    • In real time, the biometric parameters of the voice are measured and compared with the previously saved standard. The whole process takes a few seconds.
    • The result of the voice identity check appears on the CC operator’s screen.

    The process is reliable enough to tell apart, for example, the voices of twins, or to catch a call from an impersonator: within a few seconds the system checks the voice against the standard and reports that verification failed. The system is language independent, so a bank client can speak any language he knows.

    Implementation steps

    • Development of technical specifications, design and implementation.
    • Integration with CRM.
    • System calibration.
    • Testing / trial operation.
    • Handover of the completed work.

    Working with voice standards involves several processes:

    • initial filling of the database with customers’ voice standards based on recordings of their calls to the contact center (the client must consent to the standard being recorded before it is created);
    • updating voice standards;
    • verification of customers by voice.

    The database of customers’ voice standards was built up as customers called the bank’s contact center. If, during a conversation with the operator, the client was able to confirm his identity in the standard way (passport data, secret words, contract numbers and similar details), then once enough speech material had accumulated the system created a digital standard based on the unique features of his voice: accent, speed of speech, pronunciation and so on. Physical features are also taken into account: speech rhythm, the shape and size of the mouth, the structure of the nose. As a result, on the very next call to the CC the client verification procedure took much less time thanks to background verification by voice.

    More details

    How a voice standard is registered

    • When the client calls, he is identified by phone number (if he is calling from a mobile phone whose number is registered with the bank).
    • Then he goes through the standard authentication procedure based on what he knows: name, date of birth, passport and contract number, passphrase. In short, everything right down to the name of his cat.
    • During the conversation, the system accumulates the amount of speech needed to create a voice standard (usually about 40 seconds of pure speech), and once enough has been collected it reports that it is ready to create a standard. The operator presses a button, and if the information the client gave about himself matches the data in the bank's systems, the operator saves the voice standard. Otherwise, nothing is saved. The bank obtains the client’s consent to create a voice standard in advance, when the client signs the service contract.
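
    The registration steps above can be sketched as a small state object. This is a hypothetical model, not the bank’s implementation: the class `EnrollmentSession`, its fields and the exact 40-second constant are assumptions based on the “about 40 seconds of pure speech” figure.

```python
from dataclasses import dataclass

# Assumed cutoff; the article says "usually about 40 seconds of pure speech".
REQUIRED_SPEECH_SECONDS = 40.0

@dataclass
class EnrollmentSession:
    client_id: str
    consent_given: bool          # collected when the service contract was signed
    speech_seconds: float = 0.0  # pure speech accumulated so far

    def add_speech(self, seconds: float) -> None:
        """Accumulate pure speech detected during the conversation."""
        self.speech_seconds += seconds

    @property
    def ready(self) -> bool:
        """True once enough speech has been collected to build a standard."""
        return self.speech_seconds >= REQUIRED_SPEECH_SECONDS

    def save_standard(self, identity_confirmed: bool) -> bool:
        """Operator presses the button: save only if every condition holds."""
        return self.ready and self.consent_given and identity_confirmed

session = EnrollmentSession("client-42", consent_given=True)
for chunk in (12.5, 15.0, 14.0):  # hypothetical stretches of client speech
    session.add_speech(chunk)

print(session.ready)
print(session.save_standard(identity_confirmed=True))
```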

    How does the verification process go?

    The client calls and is identified by mobile phone number. If a voice standard has already been created for him, the system starts the verification procedure; if not, it follows the registration of a new standard described above.

    • The operator asks the client a few simple questions (to introduce himself and give his date of birth).
    • Within a few seconds of conversation, once enough of the client’s clear speech has accumulated (7–9 seconds), the system compares his voice with the standard and shows the result to the operator (“own”, “alien”, “not sure”).
    • The operator either finishes the questions, or continues them because the system is “not sure” the caller is its “own”, or politely refuses the client because he is an “alien”.
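
    From the operator’s side, the steps above boil down to a simple mapping from the system’s verdict to the next action. The sketch below is a hypothetical illustration: the `Result` enum mirrors the three labels shown on the operator’s screen, while the function name, the action strings and the 7-second minimum (the lower bound of the 7–9 seconds mentioned above) are assumptions.

```python
from enum import Enum
from typing import Optional

class Result(Enum):
    OWN = "own"
    ALIEN = "alien"
    NOT_SURE = "not sure"

# Assumed minimum; the article mentions 7-9 seconds of clear speech.
MIN_CLEAR_SPEECH_SECONDS = 7.0

def operator_action(has_standard: bool,
                    clear_speech_seconds: float,
                    result: Optional[Result]) -> str:
    """Map the state of the call to what the operator does next."""
    if not has_standard:
        # No stored standard yet: fall back to the registration flow.
        return "register a new standard"
    if clear_speech_seconds < MIN_CLEAR_SPEECH_SECONDS or result is None:
        # Not enough speech for a comparison yet.
        return "keep talking"
    if result is Result.OWN:
        return "finish the questions"
    if result is Result.NOT_SURE:
        return "continue the questions"
    return "politely refuse"

print(operator_action(True, 8.0, Result.NOT_SURE))
```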

    If necessary, the operator can manually restart the biometric verification of the client, for example, if a third party interfered with the conversation during the verification process.

    After the verification procedure is completed, the system no longer “listens” to the conversation, and its resources are not used.

    Implementation Features

    To reach a decision, three thresholds are configured in the system (“first”, “second” and “enrichment”). If the comparison result is above the first threshold, the system considers the client its “own”; if it is below the second, the client is an “alien”; if it falls between the two, the system cannot be sure of its decision.

    If the comparison result is above both the first threshold and the “enrichment” threshold, the system automatically updates the voice standard. This keeps the standards up to date.
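
    The threshold logic can be written down directly. The numeric values below are placeholders chosen for illustration (the article does not give the real settings), and the score is assumed to be a similarity value where higher means a closer match.

```python
# Placeholder values; the real thresholds are set during system calibration.
FIRST_THRESHOLD = 0.80       # above this: "own"
SECOND_THRESHOLD = 0.50      # below this: "alien"
ENRICHMENT_THRESHOLD = 0.90  # above this (and the first): refresh the standard

def decide(score: float) -> str:
    """Turn a comparison score into one of the three verdicts."""
    if score > FIRST_THRESHOLD:
        return "own"
    if score < SECOND_THRESHOLD:
        return "alien"
    return "not sure"

def should_update_standard(score: float) -> bool:
    """The standard is updated only for very confident matches."""
    return score > FIRST_THRESHOLD and score > ENRICHMENT_THRESHOLD

for score in (0.95, 0.85, 0.65, 0.30):
    print(score, decide(score), should_update_standard(score))
```

    Keeping the enrichment threshold at or above the first one means a standard is refreshed only with speech the system is already very confident about.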

    Fighting scammers

    It would be a big omission not to say that, among other things, voice biometrics helps fight fraudsters. These are the so-called antifraud systems. This subsystem has not been implemented in Priorbank, but I’ll tell you about it anyway.

    According to Aite Group, scammers can correctly answer from 20 to 50% of all security questions. With voice biometrics, within a few seconds of conversation the client is verified automatically and the system confirms that the caller is not a fraudster.

    It works like this. In addition to the database of ordinary clients’ voice standards, a blacklist of fraudsters is created, holding the voice standards of known villains. When someone calls the contact center, the caller’s voice is compared with the client’s own standard in the client base (verification) and with all the standards in the fraudster database (identification). Naturally, if the voice is found in the fraudster database, the scenario for serving that caller changes.
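
    A rough sketch of that combined check is shown below. Everything in it is an assumption made for illustration: the feature vectors, the cosine similarity measure, the 0.8 threshold, the `route_call` function and the returned scenario names are not taken from any real antifraud product.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed voice representation)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def route_call(sample, client_standard, fraud_db, threshold=0.8):
    """Pick a service scenario for the caller."""
    # Identification (1:N): compare with every standard on the blacklist.
    for fraud_id, standard in fraud_db.items():
        if cosine(sample, standard) >= threshold:
            return f"fraud scenario ({fraud_id})"
    # Verification (1:1): compare with the claimed client's own standard.
    if cosine(sample, client_standard) >= threshold:
        return "verified client"
    return "manual verification"

# Hypothetical toy data.
fraud_db = {"fraudster-1": [0.1, 0.8, 0.5]}
client_standard = [0.9, 0.1, 0.3]

print(route_call([0.85, 0.15, 0.35], client_standard, fraud_db))
print(route_call([0.12, 0.79, 0.52], client_standard, fraud_db))
```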

    Together, the authentication solution and the antifraud system detect up to 90% of fraud with a false positive rate of 0.1%.

    In biometrics, as I said earlier, there are two key probability indicators, expressed as percentages:

    • FAR (False Acceptance Rate, the system’s “strictness level”), that is, the probability that the system will let an “alien” through;
    • FRR (False Rejection Rate), that is, the probability that the system will not let one of its “own” through.

    The indicators are very closely related. FAR determines the system’s specificity (specificity = 1 - FAR), and FRR determines its sensitivity (sensitivity = 1 - FRR). By increasing the sensitivity of the system we lower its specificity, and vice versa.

    As for False Rejection, in the classical scheme with questions the failure rate is about 10-15%; with the additional use of real-time authentication it does not exceed 4%. As for False Acceptance, in the standard scheme fraudsters pass authentication in 15–20% of cases; additional online authentication reduces this to 1-3%.