Microsoft Edge and RTC: a story of pain

    The VoxImplant cloud telephony platform can receive and place calls across different endpoints: cell phones, SIP, mobile applications, and web pages. You can even call from a mobile phone to a web page, which looks like magic. Cellular calls need no explanation, but talking to a browser requires something beyond HTML and JavaScript. That “something” used to be Flash, and we still know how to use it as a fallback. Over the past few years, though, popular browsers have been making calls not through Flash but through WebRTC, an HTML5 technology that until recently was available only in Chrome and Firefox. But everything flows, everything changes, and WebRTC support has appeared in the Microsoft Edge beta. Almost. Microsoft has traditionally gone its own way and built an “alternative” implementation called ORTC.


    What kind of beast is WebRTC?


    WebRTC is a JavaScript API that lets you do four things:

    1. Capture the video stream from the camera and the audio stream from the microphone.
    2. Play video and audio (via HTML5 video and HTML5 audio).
    3. Establish a UDP (or TCP, if things are bad) connection between two browsers, either through an intermediate server or directly, including NAT traversal.
    4. Stream video, audio and user data over the established connection.

    In effect, it replaces Flash for working with video and sound and lets you build Hangouts, Skype for Web, and other peer-to-peer video and voice conferencing. No Flash is needed, only the browser prompt asking the user to “give access to your camera and microphone”.

    The devil is in the details


    The biggest challenge when using WebRTC is setting up a connection. The API is tailored to the NAT traversal scenario, where both users have IP addresses like “192.168 ...” and you need to juggle UDP packets to trick the intermediate NAT devices into letting data through. There is no “connect” method, even when we want to connect to a server that has a guaranteed public IP address. Everything has to be done manually.
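
    To make “manually” a little more concrete: during connection setup each side exchanges ICE candidate lines, and a common first debugging step is to check whether a peer is advertising only private addresses. The sketch below parses one candidate line and tests for the RFC 1918 private ranges. The names parseCandidate and isPrivateIPv4 are invented for this illustration and are not part of any browser API.

```typescript
// Candidate types used by ICE: host, server reflexive, peer reflexive, relay.
type CandidateType = "host" | "srflx" | "prflx" | "relay";

interface ParsedCandidate {
  foundation: string;
  component: number;
  transport: string; // lowercased, typically "udp" or "tcp"
  priority: number;
  address: string;
  port: number;
  type: CandidateType;
}

// Layout: candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
const CANDIDATE_RE =
  /^(?:a=)?candidate:(\S+) (\d+) (\S+) (\d+) (\S+) (\d+) typ (host|srflx|prflx|relay)(?: |$)/;

// Hypothetical helper: parse one ICE candidate line, or return null if it does not match.
function parseCandidate(line: string): ParsedCandidate | null {
  const m = CANDIDATE_RE.exec(line.trim());
  if (m === null) return null;
  return {
    foundation: m[1],
    component: Number(m[2]),
    transport: m[3].toLowerCase(),
    priority: Number(m[4]),
    address: m[5],
    port: Number(m[6]),
    type: m[7] as CandidateType,
  };
}

// Hypothetical helper: true for the RFC 1918 private IPv4 ranges (10/8, 172.16/12, 192.168/16).
function isPrivateIPv4(address: string): boolean {
  const m = /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/.exec(address);
  if (m === null) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
}
```

    A “host” candidate with a private address is only reachable inside the same network; “srflx” and “relay” candidates are what make a call through NAT possible.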

    The second difficulty is codecs. Capturing and compressing video, sending it over the network, and playing it back are interconnected processes with many nuances. When a call runs between two browsers, especially different ones, they need to agree on a codec, estimate the network bandwidth, and adjust the bitrate and video resolution. Video and sound can also be switched on and off during the call, and you can step into the process and force a particular bitrate.

    On top of that, WebRTC is tightly coupled to SDP, an ancient text protocol used in VoIP telephony and compatible with SIP. If you need to intervene in the negotiation, for example to set a fixed bitrate, you have to parse and modify this text.
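
    As a rough illustration of what “parse and modify this text” means, here is a minimal sketch that caps the bitrate of one media section by inserting a standard SDP bandwidth line (b=AS:, in kilobits per second) after that section's c= line. The function name setMediaBitrate is made up for this example, and the sketch assumes every media section carries its own c= line, as browser-generated SDP normally does.

```typescript
// Hypothetical helper: set "b=AS:<kbps>" in every "m=<kind>" section of an SDP blob.
// Assumes each media section contains a "c=" line; sections without one are left unchanged.
function setMediaBitrate(sdp: string, kind: "audio" | "video", kbps: number): string {
  const lines = sdp.split(/\r\n|\n/);
  const out: string[] = [];
  let inTarget = false;

  for (const line of lines) {
    // A new "m=" line starts a new media section.
    if (line.indexOf("m=") === 0) {
      inTarget = line.indexOf("m=" + kind + " ") === 0;
    }
    // Drop any previous AS limit in the target section so it is not duplicated.
    if (inTarget && line.indexOf("b=AS:") === 0) {
      continue;
    }
    out.push(line);
    // SDP orders bandwidth lines right after the connection line.
    if (inTarget && line.indexOf("c=") === 0) {
      out.push("b=AS:" + kbps);
    }
  }
  return out.join("\r\n");
}
```

    In a browser the result would typically be applied to the description text before it is handed to the peer connection; how strictly the limit is honored depends on the browser.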

    There is no WebRTC in Edge!


    Microsoft found the WebRTC API too complex for JavaScript developers and implemented an alternative, Object Real-Time Communications (ORTC). At the protocol level, ORTC works in much the same way as WebRTC, but the JavaScript API exposed by the browser was written from scratch in an object-oriented style. SDP no longer sticks out and there is no text to parse: everything is controlled through objects and their fields.

    ORTC was added to the WebRTC standard and other browsers began to implement it; there is already a partial implementation in Firefox. All this sounds interesting and promising until we find out that ...

    ORTC is not yet implemented


    Edge contains an incomplete implementation of a year-old version of ORTC, and at the moment there is not a single “complete” implementation of ORTC anywhere. WebRTC, by contrast, has been available for many years in Chrome and Firefox.

    By the way, there are no working polyfills (emulators of the WebRTC API on top of the ORTC API in the browser). That is, they exist, but they are not ready for production use and do not work beyond the demos. So a polyfill is exactly what we developed, because writing one is much easier than rewriting a working, debugged SDK to support two fundamentally different APIs.

    Edge ORTC is not fully implemented


    This was the most painful part. The ORTC implementation currently available in the beta seems to have been created for Skype for Web. Good documentation lets you quickly put together voice or video calls from Edge to Edge. But once you call Firefox or your own server, the nuances begin to surface.

    The ORTC standard includes Trickle ICE support, which speeds up connection setup. Edge even has the corresponding methods, but nowhere is it written that they cannot be used for this scenario. Many things are incompatible with Chrome and Firefox, for example ICE authentication, or codecs that share a name but use a different payload type.
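
    The payload type mismatch can be illustrated with a small sketch: match the two sides' codecs by name and clock rate, then build a table that translates the remote payload type numbers into the local ones. The Codec shape, the function names, and all payload type numbers below are assumptions made for the illustration; they are not taken from the Edge or ORTC APIs.

```typescript
// Illustrative codec description; not an actual ORTC or WebRTC type.
interface Codec {
  name: string;        // e.g. "opus", "PCMU"
  clockRate: number;   // in Hz
  payloadType: number; // RTP payload type number
}

// Codec names are compared case-insensitively, together with the clock rate.
function codecKey(c: Codec): string {
  return c.name.toLowerCase() + "/" + c.clockRate;
}

// Hypothetical helper: map remote payload type -> local payload type
// for every codec that both sides support.
function mapPayloadTypes(local: Codec[], remote: Codec[]): Record<number, number> {
  const localByKey: Record<string, number> = {};
  for (const c of local) {
    localByKey[codecKey(c)] = c.payloadType;
  }
  const mapping: Record<number, number> = {};
  for (const c of remote) {
    const localPt = localByKey[codecKey(c)];
    if (localPt !== undefined) {
      mapping[c.payloadType] = localPt;
    }
  }
  return mapping;
}

// Made-up example data: same Opus codec, different payload type numbers.
const localCodecs: Codec[] = [
  { name: "opus", clockRate: 48000, payloadType: 111 },
  { name: "PCMU", clockRate: 8000, payloadType: 0 },
];
const remoteCodecs: Codec[] = [
  { name: "OPUS", clockRate: 48000, payloadType: 102 },
  { name: "PCMU", clockRate: 8000, payloadType: 0 },
  { name: "VP8", clockRate: 90000, payloadType: 100 },
];
const payloadTypeMap = mapPayloadTypes(localCodecs, remoteCodecs);
```

    With the sample data, remote payload type 102 maps to local 111, PCMU maps 0 to 0, and VP8 has no entry because the local side does not offer it.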

    There are no fallbacks. If we take one step to the right or left, for example by creating a receiver without data and trying to connect it, all we get is an error code and nothing more. Until recently these codes did not even have descriptions; the only way to find out what they meant was to ask Microsoft. A brief description of the return codes has since been published and life has become a little easier, but the API still assumes a single “correct” use case and severely punishes any attempt to step away from it.

    And there are codecs!


    Codecs for video and sound are a separate pain. Traditionally, WebRTC uses H.264 and VP8 for video, and Opus and G.711 for sound. Edge offers only the minimum set of codecs needed for Skype: H.264UC for video (inherited from Microsoft Lync), G.711 and, until recently, its own implementation of Opus for sound. The good news: the “regular” Opus was added recently, and support for VP9 has been promised. The bad news: VP9 has not been added yet. So sound can already be transmitted between different browsers, but video will have to wait a bit.
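
    The situation can be checked mechanically by intersecting the two codec lists per media kind. The sketch below uses the codec names exactly as they appear in the paragraph above as plain labels (they are not the identifiers used on the wire), and the CodecInfo type and commonCodecs function are invented for the example.

```typescript
type MediaKind = "audio" | "video";

// Illustrative label-only codec entry; not a browser API type.
interface CodecInfo {
  kind: MediaKind;
  name: string;
}

// Codec sets as described in the text above.
const edgeCodecs: CodecInfo[] = [
  { kind: "video", name: "H.264UC" },
  { kind: "audio", name: "G.711" },
  { kind: "audio", name: "Opus" },
];
const webrtcCodecs: CodecInfo[] = [
  { kind: "video", name: "H.264" },
  { kind: "video", name: "VP8" },
  { kind: "audio", name: "Opus" },
  { kind: "audio", name: "G.711" },
];

// Hypothetical helper: names of codecs of the given kind present in both lists,
// in the order of the first list.
function commonCodecs(a: CodecInfo[], b: CodecInfo[], kind: MediaKind): string[] {
  return a
    .filter(function (x) {
      return x.kind === kind && b.some(function (y) {
        return y.kind === kind && y.name === x.name;
      });
    })
    .map(function (x) {
      return x.name;
    });
}

const sharedAudio = commonCodecs(edgeCodecs, webrtcCodecs, "audio");
const sharedVideo = commonCodecs(edgeCodecs, webrtcCodecs, "video");
```

    Here sharedAudio contains G.711 and Opus, while sharedVideo is empty, because H.264UC and H.264 are treated as different codecs.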

    Light at the end of the tunnel


    In fact, things are not so bad: our developers built the SDK for Edge fairly quickly, and we plan to offer it to you when the corresponding version of the browser is released. The good news is that WebRTC (or is it already ORTC?) is already being developed and supported by almost all browsers, with the exception of Safari. Rumor has it that Apple is hiring developers to work on WebRTC, and a first implementation has appeared in WebKit nightly. It is time to abandon Flash, not only for playing video but also for calls.
