Pavel 2.0: a reptiloid consultant built on JS and Node.js, with sockets and telephony

    Our INTERCOM'18 has come and gone, complete with preferans and business cases. As usual, admission to the conference was paid: anyone could buy a ticket on TimePad at full price or ... get a discount from a reptiloid consultant right on the site. Last year it worked like an ordinary callback: you left your phone number in a special form, Pavel called you back within a minute and asked questions, and the more correct answers you gave, the bigger the discount. This time we decided to change the mechanics and make them harder, both technically and in terms of the questions. Under the cut: the guts of Pavlik 2.0, with a current Node.js and web sockets. Don't forget to put on protective clothing before opening.

    Competition mechanics

    You open the site in a desktop browser, and Pavlik wakes up in the lower right corner as a chat window and offers to play a game. You enter your phone number and click "Here is my number", after which Pavel sets up a session between your browser and our backend.

    If the session comes up successfully and your number has not yet taken part in the contest, Pavel asks you to call an 8-800 number. This is where the Voximplant cloud comes in and the quiz begins:

    Answer: deadline. Based on the "This is fine" meme.

    Yes, that is what the puzzles looked like. Each question allowed three attempts: first a "hard" picture, then a simpler one, and finally the simplest. Earlier attempts earned more points; after 5 puzzles Pavel added up the points and gave out either a free ticket or a 10-30% discount.

    Our reptiloid is also fairly smart: he showed error messages (for example, if you entered the phone number incorrectly), detected that a number had already taken part in the contest ("I see a familiar number on the screen of my non-existent mobile phone. One attempt per person is the rule.") and, most importantly, tied the browser and the cloud together. So how did this bold IVR work?

    In the mouth of reptiloid madness

    Answer: call center. Nuff said.

    Put dryly, Pavel 2.0 is an IVR running in our cloud. So all of the reptiloid's logic should be written in a JS script, right? Yes, but no.

    The second version of Pavel is synchronized with the visitor's browser: Pavel shows rebuses on the site and hears your answers over the phone, and depending on those answers the pictures change and the score is displayed. At first glance, this interaction could be implemented with our HTTP API alone:

    • first, the browser would launch the script with the StartScenarios method. Its response contains the parameters media_session_access_url and media_session_access_secure_url, which hold the URLs for HTTP and HTTPS respectively;
    • the browser could then communicate with the running script through these URLs;
    • the script would tell the browser which pictures to show and update the score using the httpRequestAsync method.

    But how would the script "catch" a particular visitor's browser? After all, httpRequestAsync needs an unambiguous URL. And yes, the pictures also have to be stored somewhere.
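
    To make the HTTP-API-only variant more concrete, here is a minimal sketch of its first two steps. The endpoint layout and the account_id, api_key, rule_id and script_custom_data parameters are assumptions based on the public Voximplant HTTP API, the helper names are hypothetical, and no request is actually sent.

```javascript
// Sketch only: builds a StartScenarios request URL and reads the session URLs
// from a response object. Everything except media_session_access_url and
// media_session_access_secure_url is an assumption, not taken from the article.
function buildStartScenariosUrl({ accountId, apiKey, ruleId, customData }) {
  const url = new URL('https://api.voximplant.com/platform_api/StartScenarios/');
  url.searchParams.set('account_id', String(accountId));
  url.searchParams.set('api_key', apiKey);
  url.searchParams.set('rule_id', String(ruleId));
  if (customData !== undefined) {
    // e.g. the visitor's phone number, passed on to the script
    url.searchParams.set('script_custom_data', customData);
  }
  return url.toString();
}

// Hypothetical helper: pick the two session URLs out of the parsed JSON response.
function extractSessionUrls(response) {
  return {
    http: response.media_session_access_url,
    https: response.media_session_access_secure_url,
  };
}
```

    Even with these URLs in hand, the flow only covers browser-to-script messages; the reverse direction is exactly the problem described above.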

    Therefore, in addition to the cloud JS script, we used our own backend on express.js paired with web sockets: when the visitor entered a number, the browser sent it to the backend over HTTP, after which the HTTP session carried over into a web socket session. As a result, the script talked to the backend over HTTP the whole time, while the backend used web sockets to push pictures and points to the browser quickly.

    The web socket part of the backend looked like this:
    'use strict';
    var express = require('express');
    var app = express();
    var http = require('http');
    var server = http.createServer(app);
    var io = require('')(http).listen(server); // module name is missing in the source
    var session = require('express-session')({
        // ... (session options are omitted in the source)
    });
    var sharedsession = require(''); // module name is missing in the source

    // browser sockets, keyed by caller_id
    var sockets = {};

    io.on('connection', function (socket) {
        if (socket[socket.handshake.session.caller_id] === undefined &&
            socket.handshake.session.caller_id !== undefined) {
            sockets[socket.handshake.session.caller_id] = socket;
        }
    });

    // CORS headers for the allowed hosts
    app.use((req, res, next) => {
        let allowedOrigins = [
            // allowed hosts
        ];
        let origin = req.headers.origin;
        if (allowedOrigins.indexOf(origin) > -1) {
            res.setHeader('Access-Control-Allow-Origin', origin);
        }
        res.header('Access-Control-Allow-Headers', 'Origin, X-Requested-With, Content-Type, Accept');
        res.header('Access-Control-Allow-Methods', 'GET,PUT,POST,DELETE,PATCH,OPTIONS');
        res.header('Access-Control-Allow-Credentials', true);
        if (req.method === 'OPTIONS') {
            // ... (omitted in the source)
        } else {
            // ... (omitted in the source)
        }
    });
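
    The relay step, where an HTTP request from the script is forwarded to the right browser, is not part of the excerpt. A minimal sketch of how it could work, assuming a sockets registry keyed by caller_id as above and a JSON-encoded data parameter; the relayToBrowser helper and the 'vox-result' event name are hypothetical:

```javascript
// Registry of browser sockets keyed by caller_id, as in the backend excerpt.
const sockets = {};

// Hypothetical helper: forward data from the cloud script to the caller's browser.
// Returns a small result object that the HTTP handler could send back as JSON.
function relayToBrowser(query, eventName) {
  const socket = sockets[query.caller_id];
  if (socket === undefined) {
    return { result: false, error: 'no browser session for this caller_id' };
  }
  let payload;
  try {
    payload = JSON.parse(query.data); // assumption: data arrives as a JSON string
  } catch (err) {
    return { result: false, error: 'data is not valid JSON' };
  }
  socket.emit(eventName, payload);
  return { result: true };
}

// In an express app this could be wired up roughly like so (assumed route shape):
// app.get('/voxResult', (req, res) => res.json(relayToBrowser(req.query, 'vox-result')));
```

    Keeping the lookup in a plain function makes it easy to check without a running server: any object with an emit method can stand in for a socket.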

    Still, most of the logic lived in the script. Let's look at the reptiloid from that side...

    Walking through the script

    Answer: machine learning. Taken from Arnie's own Instagram.

    The obvious part: the ASR speech recognition module has to be connected.


    The interesting parts:

    • the script had a questions object with all the answers and the names of the .jpg files;
      each time the script ran, the questions were shuffled with the shuffle helper function:

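      function shuffle(a) {
          var j, x, i;
          // Fisher-Yates: walk from the end, swapping each item with a random item at or before it
          for (i = a.length - 1; i > 0; i--) {
              j = Math.floor(Math.random() * (i + 1));
              x = a[i];
              a[i] = a[j];
              a[j] = x;
          }
          return a;
      }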
    • the "top-level" handler for an incoming call (CallAlerting) checks that the phone number is unique and also registers the handlers for call connection and disconnection. Inside onCallConnected is where the script calls the backend (read: socketio):

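      VoxEngine.addEventListener(AppEvents.CallAlerting, async (e) => {
          call.addEventListener(CallEvents.Connected, onCallConnected);
          call.addEventListener(CallEvents.Disconnected, onCallDisconnected);
          // ...
      });
      function onCallConnected(e) {
          // Greeting (in Russian): "Peace to you, earthling! The contest is simple: I show pictures in the
          // browser that encode words or phrases related to the conference and IT, and you try to guess them
          // while talking to me on the phone. Five questions, three attempts each. Ready??? Let's go!"
          call.say("Мир тебе, землянин! Конкурс прост: я показываю в браузере изображения, в них зашифрованы слова или словосочетания, имеющие отношение к конференции и <say-as stress='2'>АйТи</say-as>."+
              "А ты пытаешься их отгадать, разговаривая со мной по телефону. Всего пять вопросов и по три попытки отгадать правильный ответ. Готов??? Понеслась!",
              /* remaining arguments are truncated in the source */);
          call.addEventListener(CallEvents.PlaybackFinished, startGame);
          call.addEventListener(CallEvents.RecordStarted, async (rec) => {
              let res = await Net.httpRequestAsync(ws + '/urlResult?caller_id=' + encodeURIComponent(caller_id) + '&url=' +
              // ... (the listing is cut off here in the source)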
    • startGame, visible just above, is where the questions are shuffled, cut down to five, and sent to the backend together with the picture indices:

      async function startGame() {
          questions = questions.slice(0, 5);
          let res = await Net.httpRequestAsync(ws + '/voxResult?caller_id=' + encodeURIComponent(caller_id) + '&data=' +
              // ... (part of the request is truncated in the source)
              // qIndex and attempts are 0 at the start
              data: questions[qIndex].pics[attempts],
              points: points
              // ...
          try {
              res = JSON.parse(res.text);
          } catch (err) {
              // ...
          }
          if (res.result === true) {
              Logger.write("===--- The Game has started! ---===");
              startASR(); // start recognition
              wireCall(); // send the media stream to the ASR
          }
      }
    • startASR creates an ASR instance and specifies the preferred recognition dictionary. Once the player has said an answer, the function stops the ASR and passes what it heard on to onRecognitionResult;
    • onRecognitionResult strips filler words from the answer:

      // strip Russian filler words: "it's", "probably", "maybe", "maybe it's"
      let rr = e[0].replace("это ", "").replace("вероятно ", "").replace("может быть ", "").replace("может это ", "");
      // then drop all spaces
      rr = rr.replace(/ /g, '');

      It then tallies attempts and points, voicing comments along the way:

      let found = questions[qIndex].answers.some(r => rr.indexOf(r) >= 0);
      Logger.write("FOUND: " + found);
      if (found) {
          if (attempts == 0) {
              points += 5;
              // "Awesome! Here's a five for you!"
              call.say("<say-as stress='3'>Крутотенюшка</say-as>! Держи пятюню!", Language.RU_RUSSIAN_MALE);
          } else if (attempts == 1) {
              points += 3;
              // "Not bad! That counts for three points."
              call.say("Нормально! Засчитываю три балла.", Language.RU_RUSSIAN_MALE);
          } else if (attempts == 2) {
              points += 1;
              // "Meh... just one point. Let's move on."
              call.say("Ну такое… всего лишь один балл. Погнали дальше.", Language.RU_RUSSIAN_MALE);
          }
          // ...
      }

      The function also increments the attempt counter and the question number in order to move on to the next question or end the game;
    • the final gameFinished function sends the total score to the backend. If the player has won a promo code, it is shown in the browser and read out over the phone, since Pavlik announces the winnings; after that the script hangs up.
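
    Put together, the start of a game amounts to: shuffle, keep five questions, and tell the backend which picture to show first. A sketch under assumptions: the shape of a question (answers, pics) follows the fragments above, while the pickQuestions and buildVoxResultUrl helpers and the JSON encoding of the data parameter are hypothetical.

```javascript
// Fisher-Yates shuffle on a copy, so the source list stays untouched.
function shuffled(list, random = Math.random) {
  const a = list.slice();
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

// Hypothetical helper: choose the questions for one game.
function pickQuestions(all, count = 5) {
  return shuffled(all).slice(0, count);
}

// Hypothetical helper: the URL the script would request to show a picture
// and the current score in the caller's browser.
function buildVoxResultUrl(base, callerId, questions, qIndex, attempts, points) {
  const payload = { data: questions[qIndex].pics[attempts], points: points };
  return base + '/voxResult?caller_id=' + encodeURIComponent(callerId) +
    '&data=' + encodeURIComponent(JSON.stringify(payload));
}
```

    With qIndex and attempts both at 0, the first request carries the hardest picture of the first question and a score of zero.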

    The full script listing comes to almost 300 lines; the largest piece is onRecognitionResult, which processes the recognition result.
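
    Stripped of the telephony calls, the core of that handler can be sketched as a pure function. The 5/3/1 scoring and the filler-word stripping follow the fragments above; the state object, the return shape, and the convention that answers are stored without spaces are assumptions made for illustration.

```javascript
const POINTS_BY_ATTEMPT = [5, 3, 1]; // first, second, third attempt
// Russian filler words: "it's", "probably", "maybe", "maybe it's"
const FILLERS = ['это ', 'вероятно ', 'может быть ', 'может это '];

// Strip filler words (first occurrence of each, in order) and then all spaces.
function normalize(text) {
  let rr = text;
  for (const filler of FILLERS) {
    rr = rr.replace(filler, '');
  }
  return rr.replace(/ /g, '');
}

// Hypothetical pure version of the handler: returns the next game state.
function onAnswer(state, questions, recognizedText) {
  const rr = normalize(recognizedText);
  const found = questions[state.qIndex].answers.some(r => rr.indexOf(r) >= 0);
  let { qIndex, attempts, points } = state;
  if (found) {
    points += POINTS_BY_ATTEMPT[attempts];
    qIndex += 1; // next question
    attempts = 0;
  } else if (attempts < 2) {
    attempts += 1; // show the next, simpler picture
  } else {
    qIndex += 1; // out of attempts, move on without points
    attempts = 0;
  }
  return { qIndex, attempts, points, found, finished: qIndex >= questions.length };
}
```

    Separating the state transition from call.say and the HTTP requests keeps the branching logic checkable on its own, which matters for the largest piece of the script.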

    Talking fossil

    Answer: Firefox. That's all from us.

    Pavel may be a dinosaur, but he keeps up with the times: he evolves from year to year and still loves a joke. We hope you enjoyed the second version of our reptiloid both live and from the implementation side. Share your thoughts in the comments, stay healthy, and remember: Pavel loves you!
