SIP client interaction. Part 1

A month ago I started working with IP telephony, namely Lync and Asterisk, and noticed the following: there are many interesting articles on the practical side (what to do and how), but very little attention is paid to theory (links are at the end of the article). If you want to understand SIP, you can read either RFC 3261 or one of “those thick books”. That is certainly useful, but many people want a condensed overview first and only then dive in head first. This article is for them.
To avoid overloading the reader, I split the article into two parts. The first part covers how the SIP protocol works when two clients interact.
Simple client interaction
Client interaction in SIP most often takes the form of a dialog.
A dialog is a peer-to-peer relationship between two User Agents (UAs), expressed as a sequence of SIP messages between them. There are also requests that do not form dialogs. However, first things first.
The following is an example of a simple interaction between two devices with SIP support:

Peter wants to start communicating with Ivan, so he sends an INVITE message describing the type of session (simple, multimedia, etc.). SIP messages have the following format: a start line, one or more header fields, an empty line marking the end of the header fields, and an optional message body.

The start line contains the method, the Request-URI, and the SIP version (currently 2.0). The Request-URI is the SIP address of the resource to which the request is sent.

The header fields have the following format: <Name>: <Value> <CRLF>
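The message layout described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the addresses, tags, and Call-ID in `EXAMPLE` are made up, and the functions `parse_sip_message` and `parse_request_line` are hypothetical helpers, not part of any SIP library. Folded header lines and compact header names are not handled.

```python
def parse_sip_message(raw: str):
    """Split a SIP message into start line, header fields, and body."""
    # An empty line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    # Keep headers as an ordered list: the same field (e.g. Via) may repeat.
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


def parse_request_line(start_line: str):
    """Return (method, request_uri, sip_version) from a request start line."""
    method, request_uri, version = start_line.split(" ")
    return method, request_uri, version


# Hypothetical INVITE without a body; all names and values are invented.
EXAMPLE = (
    "INVITE sip:ivan@example.org SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc.example.com:5060;branch=z9hG4bK74bf9\r\n"
    "Max-Forwards: 70\r\n"
    "From: Peter <sip:peter@example.com>;tag=9fxced76sl\r\n"
    "To: Ivan <sip:ivan@example.org>\r\n"
    "Call-ID: 3848276298220188511@pc.example.com\r\n"
    "CSeq: 1 INVITE\r\n"
    "Contact: <sip:peter@pc.example.com>\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

start_line, headers, body = parse_sip_message(EXAMPLE)
method, request_uri, version = parse_request_line(start_line)
```

The headers are kept as a list of pairs rather than a dictionary so that repeated fields and their order survive parsing.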
The first header field is Via. Each SIP device that creates or forwards a message adds its own address in a Via field (I plan to show how this happens in the next part of the article). Typically, the address is a host name that can be resolved with a DNS query. The Via field contains the protocol name and version (SIP/2.0), a “/” sign, the transport protocol (UDP, TCP, TLS, SCTP), a space, the host address, a colon, the port number, and the branch parameter, which is the transaction identifier. Responses to this request will carry the same branch value.

Most often, the value of branch begins with “z9hG4bK”. This means that the request was generated by a client that supports RFC 3261 and the parameter is unique for each transaction of this client.
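To make the structure of a Via value more tangible, here is a rough sketch that takes one apart and generates a fresh branch. The helper names `new_branch` and `parse_via` and the sample value are assumptions made for illustration. The parser is deliberately naive: for instance, it does not handle IPv6 addresses in square brackets.

```python
import secrets

# Prefix that marks a branch generated according to RFC 3261.
MAGIC_COOKIE = "z9hG4bK"


def new_branch() -> str:
    """Generate a branch value with the RFC 3261 prefix and a random suffix."""
    return MAGIC_COOKIE + secrets.token_hex(8)


def parse_via(value: str) -> dict:
    """Split a Via value such as 'SIP/2.0/UDP host:5060;branch=...'."""
    sent_protocol, _, rest = value.strip().partition(" ")
    name, version, transport = sent_protocol.split("/")
    sent_by, *param_parts = rest.strip().split(";")
    host, _, port = sent_by.partition(":")
    params = {}
    for part in param_parts:
        key, _, val = part.partition("=")
        params[key.strip()] = val.strip()
    return {
        "version": f"{name}/{version}",
        "transport": transport,
        "host": host,
        "port": int(port) if port else None,
        "params": params,
    }


# Hypothetical Via value; the host and branch are invented.
via = parse_via("SIP/2.0/UDP pc.example.com:5060;branch=z9hG4bK74bf9")
```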
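The next field, Max-Forwards, contains an integer that starts out relatively large (RFC 3261 recommends 70). Each SIP server that forwards the message decreases this number by one, and a server that receives the value 0 rejects the request with 483 Too Many Hops instead of forwarding it. This provides a simple loop detection mechanism.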
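The hop limit can be modeled in a couple of functions. This is a simplified sketch: the `TooManyHops` exception stands in for a 483 response, and both function names are hypothetical.

```python
class TooManyHops(Exception):
    """Stands in for a 483 (Too Many Hops) response."""


def decrement_max_forwards(value: int) -> int:
    """Return the Max-Forwards value to use when forwarding a request."""
    if value <= 0:
        # The request has used up its hops and must not be forwarded.
        raise TooManyHops("Max-Forwards reached 0")
    return value - 1


def hops_until_rejected(initial: int) -> int:
    """Count how many times a looping request is forwarded before rejection."""
    hops = 0
    value = initial
    while True:
        try:
            value = decrement_max_forwards(value)
        except TooManyHops:
            return hops
        hops += 1
```

Even if a request circulates between misconfigured servers, it is dropped after a bounded number of hops.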
Next come the From and To fields, which describe the sender and the recipient of the request. It is important that SIP requests are routed by the Request-URI in the start line (see above), not by these fields. This is because the Request-URI can be rewritten as the request is forwarded, while From and To only name the logical sender and recipient. If a display name is used (for example, Ivan Ivanov), the SIP URI is placed inside a pair of angle brackets. The tag parameter in the From field is generated by the sender. The recipient, in turn, places its own tag in the To field.
The Call-ID field is the call identifier. Together, the tags from the From and To fields and the Call-ID value uniquely identify the dialog. This is necessary because several dialogs can be in progress between the same clients at once.
The next field, CSeq, contains the sequence number of the request and the name of the method, in this case INVITE. The number increases with each new request.
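One way to picture dialog identification is as a lookup key made of the Call-ID and the two tags. The sketch below is an assumption-laden illustration: `DialogId` and the two helper functions are hypothetical names, and the sample values are invented. Each side stores its own tag as the local one, so the caller and the callee build mirrored keys for the same dialog.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def dialog_id_for_caller(call_id: str, from_tag: str, to_tag: str) -> DialogId:
    """The caller generated the From tag, so that is its local tag."""
    return DialogId(call_id, from_tag, to_tag)


def dialog_id_for_callee(call_id: str, from_tag: str, to_tag: str) -> DialogId:
    """The callee generated the To tag, so that is its local tag."""
    return DialogId(call_id, to_tag, from_tag)


# Two simultaneous dialogs between the same clients (invented values).
dialogs = {
    dialog_id_for_caller("call-1@pc.example.com", "9fxced76sl", "8321234356"): "audio call",
    dialog_id_for_caller("call-2@pc.example.com", "1a2b3c4d5e", "6f7a8b9c0d"): "second call",
}
```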
The Via, Max-Forwards, To, From, Call-ID, and CSeq fields make up the minimum required set of SIP message header fields.
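A check for this minimum set might look like the sketch below. The function name `missing_required_headers` is hypothetical, and headers are assumed to be a list of (name, value) pairs. Compact header forms are not handled.

```python
REQUIRED_HEADERS = ("Via", "Max-Forwards", "To", "From", "Call-ID", "CSeq")


def missing_required_headers(headers):
    """Return the required header names absent from a list of (name, value) pairs."""
    # Header field names are case-insensitive, so compare in lower case.
    present = {name.lower() for name, _ in headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]
```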

For an INVITE message, a Contact header field is also required. It contains the SIP URI of the specific communication device on the sending side. This field is used so that, out of all the devices Peter may be using at the same time, subsequent requests in the dialog are sent to this particular device. Pay attention to the values of the From and Contact fields. The first time, I did not notice the difference:

The message contains an optional Subject field, i.e. the subject of the message. Some SIP clients may display its value. The field is not used for routing or for identifying the dialog, and its content can be arbitrary.
The Content-Type and Content-Length fields describe the message body. In this case, the body uses the Session Description Protocol (SDP). The body size is counted in bytes and includes the line feed characters:
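As a small illustration of how the size is counted, the sketch below computes Content-Length for a made-up two-line body. The function name `content_length` and the body itself are assumptions. The point is only that each CRLF adds two bytes to the total.

```python
def content_length(body: str) -> int:
    """Size of the message body in bytes, as carried in Content-Length."""
    return len(body.encode("utf-8"))


# Hypothetical two-line body; each line ends with CRLF (2 bytes).
body = "v=0\r\n" + "s=-\r\n"
headers = [
    ("Content-Type", "application/sdp"),
    ("Content-Length", str(content_length(body))),
]
```

Here each line is 3 visible characters plus CRLF, so the two lines add up to 10 bytes.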

A detailed description of SDP deserves a separate article, so only a brief explanation of the fields is given below:
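For orientation, here is a rough sketch that labels the most common SDP line types. The body in `EXAMPLE_SDP` is invented for illustration and is not the one from the original call. `describe_sdp` is a hypothetical helper, and the table covers only a handful of field types.

```python
# Meaning of the single-letter SDP field types used in the example.
SDP_FIELD_NAMES = {
    "v": "protocol version",
    "o": "origin (owner and session identifier)",
    "s": "session name",
    "c": "connection information",
    "t": "timing (start and stop time)",
    "m": "media description",
    "a": "attribute",
}


def describe_sdp(body: str):
    """Return (field meaning, value) pairs for each SDP line."""
    result = []
    for line in body.splitlines():
        if not line:
            continue
        field_type, _, value = line.partition("=")
        result.append((SDP_FIELD_NAMES.get(field_type, "other"), value))
    return result


# Hypothetical audio session description; all values are made up.
EXAMPLE_SDP = (
    "v=0\r\n"
    "o=peter 2890844526 2890844526 IN IP4 pc.example.com\r\n"
    "s=-\r\n"
    "c=IN IP4 192.0.2.10\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
)

described = describe_sdp(EXAMPLE_SDP)
```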

In response to INVITE, Ivan's SIP client sends two messages: 180 Ringing and 200 OK. The first reports that Ivan's SIP client is ringing, and the second confirms that the dialog has been established. Let's look at each of them.
This is what the 180 Ringing message will look like:

Text that has not changed compared to the INVITE message is shown in a faint color.
Notice the To and From header fields. Despite the fact that this message is coming from Ivan, the field values remain the same as they were in the initial request (from Peter to Ivan). This is because these fields determine the direction of the request, not the message.
The Via line is also copied from the original request, with the received parameter added to the end of the line. This parameter contains the IP address from which the request arrived. Usually it is the same address that would be obtained by resolving the host name contained in Via.
As I promised, a tag identifying the dialog has been added to the To field. All subsequent messages within the dialog will contain unchanged tag values.
Finally, the Contact field contains the current address of Ivan.
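The way a response is derived from a request can be sketched as follows. This is a simplification under assumptions: `build_response` is a hypothetical helper, the request headers are invented, and the received parameter is always appended to the top Via, whereas a real implementation decides when it is needed. The caller passes the same `to_tag` for every response in the dialog.

```python
def build_response(request_headers, status, reason, source_ip, contact_uri, to_tag):
    """Build the start line and header list of a response to a request."""
    response = []
    top_via_seen = False
    for name, value in request_headers:
        if name == "Via":
            if not top_via_seen:
                # Record the address the request actually arrived from.
                value += f";received={source_ip}"
                top_via_seen = True
            response.append((name, value))
        elif name == "To":
            # The answering side adds its own tag to the To field.
            if ";tag=" not in value:
                value += f";tag={to_tag}"
            response.append((name, value))
        elif name in ("From", "Call-ID", "CSeq"):
            # Copied unchanged from the request.
            response.append((name, value))
    response.append(("Contact", f"<{contact_uri}>"))
    response.append(("Content-Length", "0"))
    return f"SIP/2.0 {status} {reason}", response


# Hypothetical INVITE headers; all values are made up.
invite_headers = [
    ("Via", "SIP/2.0/UDP pc.example.com:5060;branch=z9hG4bK74bf9"),
    ("From", "Peter <sip:peter@example.com>;tag=9fxced76sl"),
    ("To", "Ivan <sip:ivan@example.org>"),
    ("Call-ID", "3848276298220188511@pc.example.com"),
    ("CSeq", "1 INVITE"),
]
start_line, ringing = build_response(
    invite_headers, 180, "Ringing", "192.0.2.10", "sip:ivan@host.example.org", "8321234356"
)
```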
This is what the 200 OK message looks like, which was sent by Ivan's SIP client:

I think that the meaning of all fields related to the SIP protocol is now clear.
In response to 200 OK, Peter's client sends a confirmation:

This message confirms that Peter's client successfully received a response from Ivan's client. Both clients have agreed on the parameters of the media session, which will be carried over RTP.
Note that the CSeq sequence number is still 1, but the method is now ACK. The branch parameter in the Via field contains a new transaction identifier, since an ACK sent in response to 200 OK is considered a new transaction.
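The CSeq rules can be summarized in a tiny sketch: ACK reuses the number of the INVITE it acknowledges, while any other new request in the dialog increments it. Both function names are hypothetical.

```python
def cseq_for_ack(invite_cseq: str) -> str:
    """ACK keeps the sequence number of the INVITE but changes the method."""
    number, _method = invite_cseq.split()
    return f"{number} ACK"


def next_cseq(last_number: int, method: str) -> str:
    """Any other new request in the dialog increments the sequence number."""
    return f"{last_number + 1} {method}"
```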
Now let's look at how the media session ends. Peter's client sends a BYE request to end the session:

Having received a request to end the session, Ivan's client sends a confirmation:

The session is completed.
We have looked at a simple example of the SIP protocol at work. Note that at different points in time, Ivan's and Peter's clients acted either as a server or as a client, which is why every SIP client contains both a server part (User Agent Server, UAS) and a client part (User Agent Client, UAC).
In the next article, I plan to consider the interaction of SIP clients using a proxy server and the registration of clients on a proxy server.
What to read on the topic
1. RFC 3261. tools.ietf.org/html/rfc3261
2. Everything you wanted to know about SIP (three parts). Andrey Pogrebennik. samag.ru/archive/article/1831
3. SIP: Understanding the Session Initiation Protocol. Alan B. Johnston. www.amazon.com/SIP-Understanding-Initiation-Protocol-Telecommunications/dp/1607839954/ref=sr_1_1?ie=UTF8&qid=1375104428&sr=8-1&keywords=sip#
4. SIP protocol. Goldstein B.S., Zarubin A.A., Samorezov V.V. www.vef-kvant.ru/sip.htm