Speech recognition in Asterisk using Yandex SpeechKit HTTP API
This article is based on "Synthesis and speech recognition from Google for Asterisk", with
The dialplan is unchanged (the example is for extensions.ael; I find AEL more convenient than extensions.conf):
s => {
    Answer();
    Wait(1);
    Record(/tmp/${UNIQUEID}.wav,3,20);
    AGI(yandex_voice.php,/tmp/${UNIQUEID});
    NoOp(${TEXT});
    Hangup();
};
The example is deliberately simple: we answer the call, wait 1 second, record the speech, recognize what was said, and print the recognized text to the Asterisk console. Still, it shows the principle of operation.
Now for the script itself.
First, a little about the variables used:
$key = 'my_secret_key' - your API key; you can get one by emailing speechkit@yandex-team.ru;
$topic = 'maps' - the recognition topic; the following options are available:
• freeform - free text, notes, etc. Example use: convert a voicemail message to text and send it by email or SMS;
• general - web search queries; I can't think of a use for it in this context;
• maps - addresses and geographic points (names of bars, gas stations, hotels, etc.);
• music - names of songs, bands, etc.
$lang = 'ru-RU' - the recognition language; currently Russian ('ru-RU') and Turkish ('tr-TR') are supported, and Turkish is supported only for the "general" and "maps" topics;
$uuid = '12345678123456781234567812345678' - a 32-character string that must be unique for each request.
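The constraints on these variables can be checked before a request is sent. The following is a minimal sketch, not part of the original script: the names `SPEECHKIT_TOPICS`, `SPEECHKIT_TURKISH_TOPICS`, `is_supported_combination` and `new_request_uuid` are hypothetical, the lists only restate the options given above, and it assumes PHP 7 or later and that hexadecimal characters are accepted in the uuid.

```php
<?php
// Topics and the Turkish restriction as listed in the article; adjust if the API changes.
const SPEECHKIT_TOPICS = ['freeform', 'general', 'maps', 'music'];
const SPEECHKIT_TURKISH_TOPICS = ['general', 'maps'];

// Hypothetical helper: true when the topic/language pair is supported.
function is_supported_combination(string $topic, string $lang): bool
{
    if (!in_array($topic, SPEECHKIT_TOPICS, true)) {
        return false;
    }
    if ($lang === 'ru-RU') {
        return true;
    }
    if ($lang === 'tr-TR') {
        return in_array($topic, SPEECHKIT_TURKISH_TOPICS, true);
    }
    return false;
}

// Hypothetical helper: a fresh 32-character hexadecimal string for every request.
function new_request_uuid(): string
{
    return bin2hex(random_bytes(16));
}
```

Calling `is_supported_combination($topic, $lang)` at the top of the script lets it fail early, for example on 'music' with 'tr-TR', instead of sending a request that cannot succeed.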
The API is described in more detail in the Yandex_SpeechKit_HTTP_API_May[5].pdf file that is sent to you along with the key. I have never read a shorter API manual, but that is for the better.
In my version of Asterisk, the script file is located in the folder:
And here is the yandex_voice.php code itself:
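#!/usr/bin/php -q
<?php
// Assumption: the endpoint URL and the <variant> response tag are not given in the text; check them against the API PDF.
$key = 'my_secret_key'; $topic = 'maps'; $lang = 'ru-RU';
$uuid = md5(uniqid('', true)); // 32 characters, unique for each request
$curl = curl_init('https://asr.yandex.net/asr_xml?' . http_build_query(compact('key', 'uuid', 'topic', 'lang')));
curl_setopt_array($curl, array(CURLOPT_POST => true, CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER => array('Content-Type: audio/x-wav'),
    CURLOPT_POSTFIELDS => file_get_contents($argv[1] . '.wav'))); // recording path passed by AGI()
$res_xml = (string) curl_exec($curl);
if (preg_match('!<variant[^>]*>(.*?)</variant>!si', $res_xml, $arr)) $voice_text = $arr[1];
else $voice_text='';
echo 'SET VARIABLE TEXT "'.$voice_text.'"'."\n";
fgets(STDIN);
echo 'VERBOSE ("'.$voice_text.'")'."\n";
fgets(STDIN);
exit(0);
?>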
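The two AGI commands at the end of the script put the recognized text between double quotes as is, so a quote or line break inside the text could break the command. Below is a minimal sketch of one way to guard against that. The helper names `agi_quote` and `agi_set_variable` are hypothetical, and the sketch assumes that the AGI parser in Asterisk treats a backslash as an escape character.

```php
<?php
// Hypothetical helper: wrap a value in double quotes, escaping backslashes
// and double quotes so it stays a single AGI argument.
function agi_quote(string $value): string
{
    // A line break would end the AGI command early, so flatten it to a space.
    $value = str_replace(["\r", "\n"], ' ', $value);
    // strtr with an array replaces in one pass, so escapes are not escaped again.
    return '"' . strtr($value, ['\\' => '\\\\', '"' => '\\"']) . '"';
}

// Hypothetical helper: build the SET VARIABLE command line for a channel variable.
function agi_set_variable(string $name, string $value): string
{
    return 'SET VARIABLE ' . $name . ' ' . agi_quote($value) . "\n";
}
```

In the script, `echo agi_set_variable('TEXT', $voice_text);` would then take the place of the hand-built string.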
Yes, the code is not perfect; it can and should be improved. One option is to make it more universal by passing most of the variables as arguments, or to use it as a function in another AGI or ARI script. I currently use it as is, to recognize the city the caller is in.
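Passing the variables as arguments could look roughly like this. It is only a sketch: the function name `settings_from_args` and the argument order (file, topic, lang) are assumptions, and the defaults are the values used in this article.

```php
<?php
// Hypothetical helper: merge AGI arguments over the defaults used in the article.
// Assumed argument order: file, topic, lang.
function settings_from_args(array $argv): array
{
    $settings = ['file' => null, 'topic' => 'maps', 'lang' => 'ru-RU'];
    $names = array_keys($settings);
    // $argv[0] is the script name, so the real arguments start at index 1.
    foreach (array_slice($argv, 1, count($names)) as $i => $value) {
        // An empty argument keeps the default value.
        if ($value !== '') {
            $settings[$names[$i]] = $value;
        }
    }
    return $settings;
}
```

With this in place the dialplan could call, for example, `AGI(yandex_voice.php,/tmp/${UNIQUEID},freeform)` to transcribe voicemail, while the two-argument call from the example above would still use the 'maps' topic.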