WebRTC video chat development between iOS, Android and browser
In a previous article, we described the principles of developing video chat between a browser and an Android device. Now let's try to complicate the task and implement a three-way video chat on the following platforms: Google Chrome on the desktop, Android application on the tablet, and iOS application for the Apple iPhone.
Recall two basic principles of constructing a video chat:
- Each connected user can send (publish) his video stream to the server.
- Users know the names of the video streams of each other and have the ability to play them.
Thus, in a video chat of three participants, each of the participants will have to play two video streams.
A sequence of nine actions appears as shown below. First, participants publish their video streams, then play each other's streams.
Publish streams
1. Alice: session.createStream({name:"stream_alice"}).publish();
2. Boris: session.createStream({name:"stream_boris"}).publish();
3. Anna: session.createStream({name:"stream_anna"}).publish();
Play streams
4. Alice: session.createStream({name:"stream_anna"}).play();
5. Alice: session.createStream({name:"stream_boris"}).play();
6. Boris: session.createStream({name:"stream_alice"}).play();
7. Boris: session.createStream({name:"stream_anna"}).play();
8. Anna: session.createStream({name:"stream_alice"}).play();
9. Anna: session.createStream({name:"stream_boris"}).play();
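The same sequence can be derived for any number of participants. The sketch below is only an illustration: `streamName` and `planActions` are hypothetical helpers, not part of any SDK, and the `stream_<name>` convention is taken from the list above.

```javascript
// Hypothetical helper: builds a stream name using the convention from the list above.
function streamName(user) {
  return "stream_" + user.toLowerCase();
}

// Hypothetical helper: lists every publish and play action needed for a chat.
function planActions(users) {
  const actions = [];
  // Every participant publishes exactly one stream.
  for (const user of users) {
    actions.push({ user: user, op: "publish", stream: streamName(user) });
  }
  // Every participant plays the streams of all the other participants.
  for (const user of users) {
    for (const other of users) {
      if (other !== user) {
        actions.push({ user: user, op: "play", stream: streamName(other) });
      }
    }
  }
  return actions;
}

const actions = planActions(["Alice", "Boris", "Anna"]);
console.log(actions.length); // 3 publishes + 3 * 2 plays = 9
```

For N participants this gives N publish actions and N * (N - 1) play actions, which is why the number of steps grows quickly as the chat gets larger.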
In this case, the developer has to organize the transfer of the names and statuses of the video streams to the participants of the video chat. This can be done with any suitable technology, for example PHP, WebSockets or Node.js, since transmitting the name of a video stream is no different from transmitting a regular text message from one user to another.
A stream can have three main statuses: PUBLISHING, PLAYING, STOPPED.
For successful playback, the requested stream must be in the active PUBLISHING status.
The diagram shows a simplified version of how the exchange of stream names and statuses can be implemented when Alice shows her video stream to Boris and Anna. The procedure takes 8 steps and can be called signaling, because it coordinates the participants:
- Alice sends the video stream to the WCS server.
- Alice receives confirmation from the server in the form of the PUBLISHING status.
- Alice sends Boris a message that her stream is ready to play.
- Alice sends Anna a message that her stream is ready to play.
- Anna plays the video stream from the WCS server.
- Anna receives confirmation of PLAYING status.
- Boris plays the video stream from the WCS server.
- Boris receives confirmation of PLAYING status.
As a result, an arbitrary interaction scenario can be realized: with connecting two or more users to the chat, with connecting just viewers, etc.
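The signaling steps above can be sketched with an in-memory stand-in for the messaging channel. Everything here is an assumption made for illustration: `SignalingHub` and its methods are hypothetical, and in a real application the notifications would travel over WebSockets, PHP, Node.js or whatever transport is chosen.

```javascript
// Hypothetical in-memory signaling channel; not part of any SDK.
class SignalingHub {
  constructor() {
    this.statuses = new Map(); // stream name -> last announced status
    this.inboxes = new Map();  // user name -> notifications received so far
  }

  register(user) {
    this.inboxes.set(user, []);
  }

  // Called by the publisher after the server has confirmed a status change.
  announce(from, stream, status) {
    this.statuses.set(stream, status);
    for (const [user, inbox] of this.inboxes) {
      if (user !== from) {
        inbox.push({ from, stream, status });
      }
    }
  }

  // A viewer should only request playback of an actively published stream.
  canPlay(stream) {
    return this.statuses.get(stream) === "PUBLISHING";
  }
}

const hub = new SignalingHub();
["Alice", "Boris", "Anna"].forEach(user => hub.register(user));

// Steps 1-4: Alice publishes, gets PUBLISHING, and notifies Boris and Anna.
hub.announce("Alice", "stream_alice", "PUBLISHING");

// Steps 5-8: Anna and Boris check the status before they start playback.
console.log(hub.canPlay("stream_alice")); // true
```

Keeping the last known status next to the stream name lets a participant who joins late decide whether playback can start without asking the publisher again.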
The rooms
Organizing the exchange of video stream names and their statuses is not technically difficult, but it takes some effort and coding.
A universal solution for this task suggests itself, one that helps users quickly coordinate streams and get into the chat. This solution is called the Room, or the Room API.
Indeed, if two or more users interact in the same context, then it looks like a room. Inside the room, users see each other's video streams, know who is in the room and can exchange messages, including private ones.
Thus, we have four objects that completely cover work with rooms:
- Room - room
- Stream - video stream
- Participant - participant
- Message - message
The Room API makes the above abstractions (Room, Stream, Participant and Message) available across platforms to implement the following functions:
Connection
- The user can connect to the room.
- The user sets the name of the room when connected.
- If a room with the same name exists, then the user enters this room.
- If there is no room with this name, a new room is created and the user enters the newly created room.
- The user receives notifications about connections / disconnections of other participants.
- The user can get a list of participants.
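The connection rules above (enter the room if it exists, create it otherwise, notify the others) can be modeled in a few lines. This is a minimal sketch, not the server's implementation: `RoomRegistry` is a hypothetical class, and removing an empty room is an assumption the article does not state.

```javascript
// Hypothetical in-memory model of the join-or-create behaviour.
class RoomRegistry {
  constructor() {
    this.rooms = new Map(); // room name -> Map(user name -> notification callback)
  }

  join(roomName, userName, onEvent) {
    let room = this.rooms.get(roomName);
    if (!room) {
      // No room with this name yet: create it.
      room = new Map();
      this.rooms.set(roomName, room);
    }
    if (room.has(userName)) {
      throw new Error("User name already taken in this room: " + userName);
    }
    // Tell the participants already in the room about the newcomer.
    for (const notify of room.values()) {
      notify("JOINED", userName);
    }
    room.set(userName, onEvent);
    // The newcomer gets the current list of participants.
    return Array.from(room.keys());
  }

  leave(roomName, userName) {
    const room = this.rooms.get(roomName);
    if (!room || !room.delete(userName)) {
      return;
    }
    for (const notify of room.values()) {
      notify("LEFT", userName);
    }
    // Assumption: a room without participants is removed.
    if (room.size === 0) {
      this.rooms.delete(roomName);
    }
  }
}

const registry = new RoomRegistry();
const log = [];
registry.join("room1", "Alice", (event, name) => log.push(event + ":" + name));
registry.join("room1", "Boris", () => {});
registry.leave("room1", "Boris");
console.log(log); // Alice was notified that Boris joined and then left
```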
Streaming
- The user can publish the video stream inside the room.
- The user receives notifications about the status of the video streams of other participants.
- The user can play the video stream inside the room.
Messages
- Users can exchange messages inside the room.
- Users can share images or other content if it is packed in text format.
- A message can be sent to one or more participants.
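One common way to pack an image or other binary content into text is Base64 inside a small JSON envelope. The sketch below assumes that approach; the article does not prescribe a format. `packBinary` and `unpackBinary` are hypothetical helpers, and `btoa` / `atob` are assumed to be available as globals (browsers and recent Node.js versions).

```javascript
// Hypothetical helper: packs raw bytes into a text message.
function packBinary(bytes, mimeType) {
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one character per byte (0..255)
  }
  return JSON.stringify({ type: mimeType, data: btoa(binary) });
}

// Hypothetical helper: restores the bytes on the receiving side.
function unpackBinary(text) {
  const envelope = JSON.parse(text);
  const binary = atob(envelope.data);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { type: envelope.type, bytes: bytes };
}

// The first four bytes of a PNG file, used here as sample content.
const packed = packBinary(new Uint8Array([137, 80, 78, 71]), "image/png");
const restored = unpackBinary(packed);
console.log(typeof packed, restored.bytes.length);
```

The resulting string can be sent like any other text message; Base64 makes the payload roughly a third larger than the original bytes.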
Room API
For the Web platform, the rooms were implemented as a JavaScript module with the following basic functions:
1. We get a connection to the server.
var connection = Flashphoner.roomApi.connect({urlServer: "wss://host:8443", username: "Alice"});
2. We enter the room.
var room = connection.join({name: "room1"});
3. Get the list of room participants.
var participants = room.getParticipants();
4. Send your video stream to the room.
room.publish({
display: document.getElementById("localDisplay"),
constraints: constraints,
record: false,
receiveVideo: false,
receiveAudio: false
});
5. Play the stream of the participant.
participant.getStreams()[0].play(document.getElementById(pDisplay));
6. Track the participants of the room:
connection.on(ROOM_EVENT.JOINED, function(participant){...});
connection.on(ROOM_EVENT.LEFT, function(participant){...});
connection.on(ROOM_EVENT.PUBLISHED, function(participant){...});
JOINED - a new participant has joined the room
LEFT - the participant has left the room
PUBLISHED - the participant has published his video stream
7. We receive messages from other participants.
connection.on(ROOM_EVENT.MESSAGE, function(message){...});
8. We send a private message to a specific participant.
participants[i].sendMessage(message);
9. Send a message to all participants at once.
var participants = room.getParticipants();
for (var i = 0; i < participants.length; i++) {
    participants[i].sendMessage(message);
}
Thus, the implementation of the rooms provides a simple exchange of messages and statuses of all participants connected to the room.
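Steps 8 and 9 above can be wrapped in two small helpers. This is only a sketch: it relies solely on the `sendMessage` call shown above, while the `name()` accessor and the `fakeParticipant` stub are assumptions introduced so that the example runs without a server.

```javascript
// Sends the same message to every participant (step 9).
function sendToAll(participants, message) {
  participants.forEach(function (participant) {
    participant.sendMessage(message);
  });
  return participants.length;
}

// Sends a private message to one participant found by name (step 8).
// Assumption: a participant exposes its name through name().
function sendPrivate(participants, name, message) {
  const target = participants.find(function (participant) {
    return participant.name() === name;
  });
  if (!target) {
    return false;
  }
  target.sendMessage(message);
  return true;
}

// Hypothetical stub that mimics a participant and records what it receives.
function fakeParticipant(name) {
  const received = [];
  return {
    name: function () { return name; },
    sendMessage: function (message) { received.push(message); },
    received: received
  };
}

const roomParticipants = [fakeParticipant("Boris"), fakeParticipant("Anna")];
sendToAll(roomParticipants, "Hello, room");
sendPrivate(roomParticipants, "Anna", "Hello, Anna");
console.log(roomParticipants[1].received); // Anna received both messages
```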
Room API Limitations
All the logic when working with rooms rests with the client. The server only controls the basic functionality of the room:
- notifications about connecting / disconnecting users in the room
- notifications about users publishing / unpublishing video streams inside the room
- routing messages to specified participants
Thus, authentication, access rights, roles of participants (moderator, viewer, presenter), and other logic should be implemented on the client and / or third-party back-end. The rooms only help to make a quick exchange of information about video streams and statuses.
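As an illustration of the kind of logic that has to live outside the rooms, here is a minimal role check that a client or a third-party back-end might apply before allowing an action. The role names come from the paragraph above; the action names and the `isAllowed` helper are hypothetical.

```javascript
// Hypothetical permission table; the actions are assumptions for illustration.
const ROLE_PERMISSIONS = Object.freeze({
  moderator: ["publish", "play", "message", "kick"],
  presenter: ["publish", "play", "message"],
  viewer: ["play", "message"]
});

// Returns true only if the role is known and the action is listed for it.
function isAllowed(role, action) {
  const allowed = ROLE_PERMISSIONS[role];
  return Array.isArray(allowed) && allowed.includes(action);
}

console.log(isAllowed("viewer", "publish"));    // false
console.log(isAllowed("presenter", "publish")); // true
```

A check like this on the client only affects the interface; enforcing it reliably requires the same decision to be made on the back-end that hands out room names or credentials.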
Rooms for Web, Android, iOS
In each of the SDKs (Web, Android, iOS) for working with the server, there is an API for working with rooms.
Room Entry Examples:
Web SDK
connection.join({name: "room1"});
Android SDK
Room room = roomManager.join(roomOptions);
iOS SDK
room = [roomManager join:options];
Thus, three participants from three different platforms can enter the same room:
- Web browser
- Android mobile app
- iOS mobile app
Test applications for working with rooms
Below we give three test applications for working with rooms. Each of them is open source and can be compiled from source.
Each of the following applications allows three participants to exchange video streams and messages:
- Conference for web
- Conference for Android
- Conference for iOS
Conference for web
The code for this application is available for download here.
Two div elements can be found in the HTML:
<div id="participant1Display" class="display"></div>
<div id="participant2Display" class="display"></div>
These elements are responsible for displaying the video of participants.
The localDisplay div element is responsible for displaying the video captured from the camera.
<div id="localDisplay" class="fp-localVideo"></div>
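With only two participant displays on the page, the application has to decide which div shows which participant. The sketch below shows one possible way to do this, assuming a simple "first free slot" rule; `DisplaySlots` is a hypothetical class, and only the element ids are taken from the HTML above.

```javascript
// Hypothetical helper: maps remote participants to the available display divs.
class DisplaySlots {
  constructor(slotIds) {
    // slot id -> participant name, or null when the slot is free
    this.slots = new Map(slotIds.map(id => [id, null]));
  }

  // Call when a participant joins or publishes; returns the div id or null.
  assign(name) {
    for (const [id, owner] of this.slots) {
      if (owner === null) {
        this.slots.set(id, name);
        return id;
      }
    }
    return null; // both displays are taken
  }

  // Call when a participant leaves; returns the freed div id or null.
  release(name) {
    for (const [id, owner] of this.slots) {
      if (owner === name) {
        this.slots.set(id, null);
        return id;
      }
    }
    return null;
  }
}

const displays = new DisplaySlots(["participant1Display", "participant2Display"]);
console.log(displays.assign("Boris")); // participant1Display
console.log(displays.assign("Anna"));  // participant2Display
```

The returned id can be passed to `document.getElementById` to obtain the element in which the participant's stream is played.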
Using the Join / Leave buttons, you can enter and leave the room, respectively. Using Stop / Start, you can stop or start sending video to the room. The Login field must be unique, because it identifies the participant.
The next interface element is text chat. This chat displays messages sent and received by users, and also displays information about events occurring in the room.
The last element is the invitation link. If the user is the first to join, a new room is created, in this case with the name roomName = room-fb41b7. If a second user specifies the same roomName, he is taken to the same room. In the Conference for Web application, invitations are implemented by generating an entry link that contains roomName as a parameter. In the Android and iOS versions of the application, the name of the room is entered directly in the interface.
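An invitation link of this kind can be built and parsed with the standard `URL` API. The sketch below is not the application's actual code: the helper names, the page address and the way the random suffix is generated are assumptions; only the `roomName` parameter and the `room-` prefix come from the text above.

```javascript
// Hypothetical helper: generates a name such as "room-fb41b7" (six hex digits).
function generateRoomName() {
  let suffix = "";
  for (let i = 0; i < 6; i++) {
    suffix += Math.floor(Math.random() * 16).toString(16);
  }
  return "room-" + suffix;
}

// Adds the room name to the page address as the roomName parameter.
function buildInviteLink(pageUrl, roomName) {
  const url = new URL(pageUrl);
  url.searchParams.set("roomName", roomName);
  return url.toString();
}

// Reads the room name back on the invited user's side; null if absent.
function roomNameFromLink(link) {
  return new URL(link).searchParams.get("roomName");
}

// The page address is a placeholder.
const invite = buildInviteLink("https://example.com/conference.html", "room-fb41b7");
console.log(invite); // https://example.com/conference.html?roomName=room-fb41b7
console.log(roomNameFromLink(invite)); // room-fb41b7
```

A page opened without the parameter would generate a new name with `generateRoomName()`, while a page opened from an invitation would reuse the name found in the link.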
Thus, the test application Conference for Web implements all the functions of the rooms laid down above and allows several users to connect to the same room and exchange video streams and text messages.
The screenshot below shows how a room with three participants works. Three tabs of the Google Chrome browser were opened and a connection to the room was initiated on each of the tabs.
Conference for Android
The code for this application is available for download here.
In the application, you can enter the name of the room directly from the UI. Otherwise, the application is very similar to Conference for Web, which was described above, and has the same interface elements:
- Leave / Join buttons to enter and exit the room.
- Two video windows for playing the video of other participants.
- One video window for displaying video captured from the camera.
- Text chat with system information and messages from participants.
Conference for iOS
The application code is available for download here.
The application interface for iOS hardly differs from the application for Android, apart from the names of the buttons and controls.
As a result, we ran a test and gathered all three platforms in one room, 3119d6.
The Google Chrome browser streams a hare coming out of its hole:
Android 5.1.1 on an Asus tablet streams a flower pot.
iOS 10.1.1 on the iPhone 6 streams a Benjamin ficus on the windowsill.
Below is a full screenshot from an iOS device:
Thus, we completed testing and review of all three applications built on the basis of the Room API, and we can proceed to the source code and the build.
Building Conference for iOS application from source
In a previous article, we showed how to build an example video chat for Android and explained how the video chat code works in a browser.
Here we show how to build the Conference for iOS example and describe the main pieces of the sample code.
First, you need Mac hardware with the latest Xcode installed.
The build procedure requires installing Cocoapods and downloading the sample code and the iOS SDK. We will build in the terminal and then open the project in Xcode.
1. Install Cocoapods
sudo gem install cocoapods
2. Download all the examples from github
git clone https://github.com/flashphoner/wcs-ios-sdk-samples.git
3. Download the iOS SDK
wget https://flashphoner.com/downloads/builds/flashphoner_client/wcs-ios-sdk/WCS-iOS-SDK-2.3.0.tar.gz
4. Unpack the archive
tar -xvzf WCS-iOS-SDK-2.3.0.tar.gz
5. Copy the FPWCSApi2.framework folder to the examples
cp -R FPWCSApi2.framework wcs-ios-sdk-samples
6. Start the build.
./build_example.sh
If the build was successful, ** ARCHIVE SUCCEEDED ** is displayed in the terminal.
Once the examples have been built from the console using Cocoapods, all dependencies are pulled in and configured, and the examples can then be built directly from Xcode.
7. Open WCSExample.xcworkspace in Xcode.
8. Select Generic iOS Device as the build target. Start the build of the Conference example from the Product / Build menu and wait for it to complete.
9. Connect the iPhone via USB and launch the built Conference application. Debugging information appears in the console.
10. The Conference application appears on the iPhone screen.
Thus, we built the Conference mobile application for iOS from source, using the iOS SDK version 2.3.0 and Cocoapods for this build. As a result, we were able to run this application on the iPhone 6, connected via USB to the Mac on which the build was performed.
Code for Web, Android, iOS video chats
Above, we gave examples of applications for three platforms that use the Room API and exchange video streams inside the created room. Let's try to briefly list the main pieces of code that are responsible for the video chat in each of these three platforms:
Platform | Web | Android | iOS |
Language | JavaScript | Java | Objective-C |
Main code | conference.js | ConferenceActivity.java | ViewController.m |
The constructs compared are:
- Connecting to the server
- Connecting to the room
- Getting the list of participants
- Sending a video stream to the room
- Playing a participant's video stream
- Tracking new participants joining the room
- Tracking participants leaving the room
- Tracking the publication of a video stream by a room participant
- Receiving an incoming message from other participants
- Sending a message to one of the participants
In total, we get 10 basic constructs for each of the platforms. You can find each of the listed constructs in the corresponding source file.
All three Web, iOS, and Android applications were tested with the latest build of Web Call Server 5 - a server for video chat and low-latency broadcasts with support for WebRTC technology.
References
Web Call Server - server for WebRTC video chat
EC2 launch - launching a finished image on Amazon EC2
Install - installing the server on a VPS or Dedicated host
Cocoapods - dependency manager for assembly
Resource | Web SDK | Android SDK | iOS SDK |
SDK | html | html | html |
Download SDK | tar.gz | aar | tar.gz |
Room chat application example | WCS demo | Google play | Ad hoc only |
Main sample code file | conference.js | ConferenceActivity.java | ViewController.m |
Source code for all examples | github | github | github |
Description of the procedure for building examples from source code | html | html | html |
Conference Room Example Code Description | html | html |