Advantech May 24, 2019 at 09:21

How machines communicate - MQTT protocol

In a previous article, we looked at the Modbus protocol, which is the de facto industry standard for M2M interaction. Developed back in 1979, it has a number of significant drawbacks that MQTT addresses.

As a formal standard, MQTT is quite young (it was standardized by ISO only in 2016), but it is already widely used in industry and IoT. It was designed to be as compact as possible for unstable Internet links and low-power devices, and it can guarantee message delivery despite packet loss and disconnections.

Key Features of the MQTT Protocol:

Compact and lightweight - minimal data transfer overhead to save traffic.
Loss-tolerant - reliable delivery over unstable network connections.
Asynchronous - serves a large number of devices and does not depend on network latency.
QoS support - lets you choose the delivery guarantee for each message.
Dynamic configuration - does not require fields and data formats to be agreed in advance; can be configured on the fly.
Works behind NAT - clients can sit behind NAT; only the server (broker) needs a public IP address. This removes the need for VPN and port forwarding.
Convenient addressing - data fields have human-readable text names. No need to remember numeric addresses and bit offsets.

In this article, we will compare MQTT and Modbus, analyze the protocol structure and basic concepts, and try out a cloud MQTT broker as an example of working over an unstable Internet connection.

MQTT Protocol History

MQTT was developed by IBM in 1999, and was initially used internally for its solutions.

In November 2011, IBM and Eurotech announced their participation in the Eclipse M2M working group and the transfer of the MQTT code to the Eclipse Paho project.

In 2013, the OASIS (Organization for the Advancement of Structured Information Standards) consortium began the process of standardizing the MQTT protocol. By that point, the protocol specification had already been published under a free license, and companies such as Eurotech (formerly known as Arcom) were already using the protocol in their products.

In October 2014, OASIS published the first official MQTT protocol standard.

In 2016, the protocol was standardized by the International Organization for Standardization (ISO) and received the number ISO/IEC 20922.

Since 2014, interest in the protocol has been growing rapidly and, judging by the Google Trends chart, today exceeds interest in Modbus.

Google Trends Benchmark

Basic concepts

MQTT has a client-server architecture. Messaging takes place through a central server called a broker. Under normal conditions, clients cannot communicate directly with each other, and all data exchange occurs through a broker.

Clients can act as data providers (Publisher) and as data recipients (Subscriber). To avoid confusion, we will use these original terms throughout the article.

In the MQTT protocol, clients communicate with each other through a central node.

MQTT is an application-layer protocol that runs on top of TCP/IP, so it can connect remote sites directly over the Internet without VPN tunnels. It is enough for the broker to have a public IP address that all clients can reach; the clients themselves may be located behind NAT. Because MQTT clients always initiate the connection, no port forwarding is needed. In Modbus/TCP, by contrast, the master initiates the connection to each device, so every device must be directly reachable over the network.

The standard MQTT broker port for incoming TCP connections is 1883. When using a secure SSL connection, port 8883 is used.

Broker

A broker is the central MQTT hub through which clients interact. Data exchange between clients occurs only through the broker. The broker can be server software or a controller. Its tasks include receiving data from clients, processing and storing it, delivering it to clients, and monitoring message delivery.

Publisher / Subscriber

To understand the difference between Publisher and Subscriber, let's take a simple example: a humidity sensor measures the humidity in a room, and if it drops below a certain level, the humidifier turns on.

In this case, the humidity sensor acts as a Publisher: its only task is to publish data to the broker. The humidifier acts as a Subscriber: it subscribes to humidity updates and receives current data from the broker, and it decides for itself when to turn humidification on.

In this scheme, the MQTT clients (the sensor and the humidifier) are not aware of each other's existence and do not interact directly. The broker can receive data from various sources, process it (for example, calculate the average value from several sensors), and return the processed data to the subscriber.

Publisher sends data to the broker, Subscriber subscribes to updates of this data.

Because MQTT is asynchronous, the sensor and the humidifier can be online at different times, lose packets, or be temporarily unreachable. The broker stores the last data received from the sensor in memory and takes care of delivering it to the humidifier.

Topic

MQTT uses topics to identify entities; they are sometimes also called channels. Topic names are UTF-8 strings with a tree structure similar to a UNIX file system. This is a convenient mechanism for naming entities in a human-readable form.

An example of topics in MQTT

# Temperature sensor in the kitchen
home/kitchen/temperature
# Temperature sensor in the bedroom
home/sleeping-room/temperature
# Outdoor light sensor
home/outdoor/light

This approach makes it easy to see what data is being transmitted, and it is convenient to develop and debug code without memorizing numeric data addresses, as you must in Modbus.

Topic subscriptions also support wildcard syntax, familiar to anyone who has worked with the UNIX file system. A wildcard can be single-level or multi-level.

A single-level wildcard is indicated by the + symbol.

For example, to receive data from temperature sensors in all rooms in the house, the subscriber needs to subscribe to such a topic:

home/+/temperature

As a result, it will receive data from sensors such as:

home/kitchen/temperature
home/sleeping-room/temperature
home/living-room/temperature
home/outdoor/temperature

A multi-level wildcard is indicated by the # symbol.
An example of subscribing to data from all sensors in all rooms in the house:

home/#

Subscribing to such a topic will allow you to receive data from such sensors:

home/kitchen/temperature
home/kitchen/humidity
home/kitchen/light
home/sleeping-room/temperature
home/sleeping-room/humidity
home/sleeping-room/light
....
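
The matching rules for both wildcards can be sketched in a few lines of Python. This is a simplified illustration, not the code of any particular broker: the function name topic_matches is made up for this example, and special cases such as topics beginning with $ are ignored.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if `topic` matches the subscription `topic_filter`."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level; it matches
            # the parent level and everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # The filter is longer than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # "+" matches exactly one level; anything else must be equal.
            return False
    # Without "#", the filter and the topic must have the same depth.
    return len(filter_levels) == len(topic_levels)


print(topic_matches("home/+/temperature", "home/kitchen/temperature"))  # True
print(topic_matches("home/+/temperature", "home/kitchen/humidity"))     # False
print(topic_matches("home/#", "home/sleeping-room/light"))              # True
```

Note that + stands for exactly one level, so home/+ does not match home/kitchen/temperature, while home/# does.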

Client identification

For access control, MQTT provides client authentication, unlike the Modbus protocol, which does not have such a function. The following fields are used for access control:

ClientId - (required field) the identifier of the client. It must be unique for each client. The current MQTT 3.1.1 standard allows an empty ClientId if the connection state does not need to be saved.

Username - (optional field) login for authentication, in UTF-8 format. May not be unique. For example, a group of clients can log in with the same username / password.

Password - (optional field) can be sent only together with the Username field, whereas Username can be sent without Password. Maximum 65535 bytes. It is important to know that the username and password are transmitted in clear text, so if data travels over public networks you must use SSL to encrypt the connection.
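
To make these fields more concrete, here is a minimal Python sketch of how an MQTT 3.1.1 CONNECT packet could be assembled by hand. The helper names (encode_string, encode_remaining_length, build_connect) and the sample client id and credentials are invented for illustration; Will fields are omitted and the Clean Session flag is always set. The last line shows that the password sits in the packet as plain bytes.

```python
import struct
from typing import Optional


def encode_string(text: str) -> bytes:
    """MQTT string: 2-byte big-endian length followed by UTF-8 data."""
    data = text.encode("utf-8")
    return struct.pack("!H", len(data)) + data  # fails above 65535 bytes


def encode_remaining_length(length: int) -> bytes:
    """Variable-length encoding: 7 bits per byte, high bit means 'more'."""
    out = bytearray()
    while True:
        byte, length = length % 128, length // 128
        if length > 0:
            byte |= 0x80
        out.append(byte)
        if length == 0:
            return bytes(out)


def build_connect(client_id: str, username: Optional[str] = None,
                  password: Optional[str] = None, keep_alive: int = 60) -> bytes:
    if password is not None and username is None:
        raise ValueError("Password can be sent only together with Username")
    flags = 0x02                      # Clean Session
    payload = encode_string(client_id)
    if username is not None:
        flags |= 0x80                 # Username flag
        payload += encode_string(username)
    if password is not None:
        flags |= 0x40                 # Password flag
        payload += encode_string(password)
    # Variable header: protocol name, protocol level 4 (3.1.1), flags, keep alive.
    header = encode_string("MQTT") + bytes([4, flags]) + struct.pack("!H", keep_alive)
    body = header + payload
    return bytes([0x10]) + encode_remaining_length(len(body)) + body


packet = build_connect("sensor-1", "user", "secret")  # hypothetical credentials
print(packet.hex())
print(b"secret" in packet)  # True: the password is readable on the wire
```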

Packet structure

As mentioned above, in the MQTT protocol clients always initiate the connection, regardless of whether they are recipients (Subscriber) or suppliers (Publisher) of data. Let us analyze a connection packet captured with Wireshark.

MQTT packet transmitted over an unencrypted channel

The TCP header shows that the packet was transmitted on port 1883, that is, encryption is not used, which means that all data is available in clear form, including login and password.

Header

The message type is Connect (command type 1, binary 0001), which establishes a connection with the broker. The main commands are Connect, Disconnect, Publish, Subscribe, and Unsubscribe. There are also acknowledgment commands, keep-alive commands, and so on.

DUP flag - indicates that the message is a retransmission, sent when the sender did not receive confirmation that the previous copy was received. In MQTT 3.1 it applied to PUBLISH, SUBSCRIBE, UNSUBSCRIBE, and PUBREL messages; in MQTT 3.1.1 it is used only in PUBLISH.
QoS level - the Quality of Service flag. We will discuss this topic in more detail later.
Retain - data published with the retain flag is stored on the broker. When a client later subscribes to this topic, the broker immediately sends it the stored message with this flag set. Used only in messages of type Publish.
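
All of these fields fit into the first byte of the fixed header: the upper four bits carry the message type and the lower four carry the flags. The following Python sketch decodes that byte; the function name parse_first_byte is invented for this example, and the flag bits are meaningful only for PUBLISH (in other packet types they are reserved).

```python
PACKET_TYPES = {
    1: "CONNECT", 2: "CONNACK", 3: "PUBLISH", 4: "PUBACK", 5: "PUBREC",
    6: "PUBREL", 7: "PUBCOMP", 8: "SUBSCRIBE", 9: "SUBACK",
    10: "UNSUBSCRIBE", 11: "UNSUBACK", 12: "PINGREQ", 13: "PINGRESP",
    14: "DISCONNECT",
}


def parse_first_byte(byte: int) -> dict:
    """Split the first fixed-header byte into message type and flags."""
    return {
        "type": PACKET_TYPES.get(byte >> 4, "RESERVED"),  # bits 7-4
        "dup": bool(byte & 0x08),                         # bit 3
        "qos": (byte >> 1) & 0x03,                        # bits 2-1
        "retain": bool(byte & 0x01),                      # bit 0
    }


print(parse_first_byte(0x10))  # CONNECT, no flags set
print(parse_first_byte(0x35))  # PUBLISH with QoS 2 and Retain set
```

The second call corresponds to a PUBLISH that the Mosquitto debug output would show as d0, q2, r1.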

Practical use

Now that we are familiar with the theory, let's try working with MQTT in practice. For this we will use the open-source Mosquitto program, which can work both as a client and as a server (broker). It runs on Windows, macOS, and Linux. The program is very convenient for debugging and studying the MQTT protocol, and it is also widely used in production. We will use it as a client to send data to and receive data from a remote cloud broker.

Many cloud providers offer MQTT broker services, such as Microsoft Azure IoT Hub, Amazon AWS IoT, and others. In this example, we will use the Cloudmqtt.com service, since it has the simplest registration and its free plan is enough for learning.

After registration, details for connecting to a broker are available in your account. Since we connect to the server via public Internet networks, it is reasonable to use an SSL port to encrypt traffic.

Details of access to the MQTT broker in the personal account of the cloud provider

The flexibility of the MQTT protocol allows a client to send data that has not been defined on the broker in advance. That is, there is no need to pre-create the topics to which a Publisher will write. Using the connection details from the account, we will manually compose requests to publish data to the habr/test/random topic and to read from it.

mosquitto_sub - subscriber client utility
mosquitto_pub - publisher client utility

First, connect to the broker as a subscriber and subscribe to receive data from the habr/test/random topic.

mosquitto_sub -d --capath /etc/ssl/certs/ --url mqtts://hwjspxxt:7oYugN7Fa5Aa@postman.cloudmqtt.com:27529/habr/test/random
Client mosq/zEPZz0glUiR4aEipZA sending CONNECT
Client mosq/zEPZz0glUiR4aEipZA received CONNACK (0)
Client mosq/zEPZz0glUiR4aEipZA sending SUBSCRIBE (Mid: 1, Topic: habr/test/random, QoS: 0, Options: 0x00)
Client mosq/zEPZz0glUiR4aEipZA received SUBACK

The connection was successful: we have subscribed to the habr/test/random topic and are now waiting for data in this topic from the broker.

Since an SSL connection is used, you must specify the path where the program will look for root certificates in order to verify the server certificate. The service in our example uses a certificate issued by a trusted certification authority, so we point to the system store of root certificates: --capath /etc/ssl/certs/

In the case of a self-signed certificate, you must specify the path to the corresponding CA. It is also important to note the difference in URI format between SSL connections (mqtts://) and unencrypted connections (mqtt://). If certificate verification fails, the program terminates without an error message. For more detailed output, you can use the --debug switch.

Now let's try to publish the data in the topic without interrupting the first program.

mosquitto_pub -d --capath /etc/ssl/certs/  --url mqtt://hwjspxxt:7oYugN7Fa5Aa@postman.cloudmqtt.com:27529/habr/test/random -m "Привет хабр!"
Client mosq/sWjh9gf8DRASrRZjk6 sending CONNECT
Client mosq/sWjh9gf8DRASrRZjk6 received CONNACK (0)
Client mosq/sWjh9gf8DRASrRZjk6 sending PUBLISH (d0, q0, r0, m1, 'habr/test/random', ... (22 bytes))
Client mosq/sWjh9gf8DRASrRZjk6 sending DISCONNECT

The data was successfully received by the server and published in the desired topic. At the same time, in the first window, where mosquitto_sub is running, we can see that the message was received. Even Unicode works: the message arrives in Russian ("Привет хабр!" means "Hello, Habr!").

Client mosq/zEPZz0glUiR4aEipZA received PUBLISH (d0, q0, r0, m0, 'habr/test/random', ... (22 bytes))
Привет хабр!

QoS and delivery guarantee

However, sending a message in real time will not surprise anyone, because the same can be done even with the trivial nc utility. So let's simulate an unstable connection between the subscriber and the sender. Imagine that both clients work over GPRS with heavy packet loss, where even a successful TCP connection is rare, and the subscriber must be guaranteed to receive the sender's message. This is where the QoS options come to the rescue.

QoS 0: At most once. By default, messages are sent with the QoS flag set to 0, which means "fire and forget": the Publisher publishes a message to the broker but does not require a guarantee that it is delivered to the subscriber. This is suitable for data whose loss is not critical, for example regular humidity or temperature measurements.

QoS 1: At least once. With this flag, the sender keeps retransmitting the publication until it receives a delivery confirmation (PUBACK). The subscriber is therefore guaranteed to receive the message at least once, but may receive duplicates.

QoS 2: Exactly once. This QoS level provides the highest delivery guarantee through additional confirmation and completion steps (PUBREC, PUBREL, PUBCOMP). It is suitable for situations where both loss and duplication of sensor data must be excluded, for example when a received message triggers an alarm and an emergency call is made.
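
The difference between QoS 1 and QoS 2 is easier to see in a toy simulation. The Python sketch below is a simplified model, not a real MQTT implementation: send_qos1 and Qos2Receiver are names invented for this example, and packet loss is scripted through sets of attempt numbers instead of a real network.

```python
def send_qos1(message, lost_publish, lost_puback, max_attempts=10):
    """QoS 1 sender: retransmit until a PUBACK arrives.

    lost_publish / lost_puback are sets of attempt numbers on which the
    PUBLISH or the PUBACK is dropped by the simulated network.
    """
    received = []                      # what the receiver actually got
    for attempt in range(1, max_attempts + 1):
        dup = attempt > 1              # retransmissions carry the DUP flag
        if attempt in lost_publish:
            continue                   # PUBLISH lost, try again
        received.append((message, dup))
        if attempt in lost_puback:
            continue                   # delivered, but the ack was lost
        return attempt, received
    raise TimeoutError("no PUBACK after %d attempts" % max_attempts)


class Qos2Receiver:
    """QoS 2 receiver: remembers packet ids to suppress duplicates."""

    def __init__(self):
        self.pending_ids = set()
        self.delivered = []

    def on_publish(self, packet_id, message):
        if packet_id not in self.pending_ids:
            self.pending_ids.add(packet_id)
            self.delivered.append(message)   # delivered exactly once
        return ("PUBREC", packet_id)

    def on_pubrel(self, packet_id):
        self.pending_ids.discard(packet_id)  # the id may be reused now
        return ("PUBCOMP", packet_id)


# First PUBLISH is lost, then the PUBACK for the second attempt is lost.
attempts, received = send_qos1("alarm", lost_publish={1}, lost_puback={2})
print(attempts, received)   # 3 [('alarm', True), ('alarm', True)]

receiver = Qos2Receiver()
receiver.on_publish(1, "alarm")
receiver.on_publish(1, "alarm")      # retransmission with the same id
receiver.on_pubrel(1)
print(receiver.delivered)   # ['alarm']
```

With QoS 1 the receiver ends up with two copies of the message, while the QoS 2 receiver hands the message to the application only once.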

To simulate poor communication, disconnect both clients, then send a message with the highest QoS level and add the Retain option so that the message is stored on the broker. The message text, "Очень важный привет!", means "A very important hello!".

mosquitto_pub --retain --qos 2 -d --capath /etc/ssl/certs/  --url mqtt://hwjspxxt:7oYugN7Fa5Aa@postman.cloudmqtt.com:27529/habr/test/random -m "Очень важный привет!" 
Client mosq/Xwhua3GAyyY9mMd05V sending CONNECT
Client mosq/Xwhua3GAyyY9mMd05V received CONNACK (0)
Client mosq/Xwhua3GAyyY9mMd05V sending PUBLISH (d0, q2, r1, m1, 'habr/test/random', ... (37 bytes))
Client mosq/Xwhua3GAyyY9mMd05V received PUBREC (Mid: 1)
Client mosq/Xwhua3GAyyY9mMd05V sending PUBREL (m1)
Client mosq/Xwhua3GAyyY9mMd05V received PUBCOMP (Mid: 1, RC:0)
Client mosq/Xwhua3GAyyY9mMd05V sending DISCONNECT

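Some time later, our recipient finally manages to establish an Internet connection and connects to the broker. Right after the subscription is confirmed, the broker delivers the stored message with the retain flag set (r1):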

mosquitto_sub  -d --capath /etc/ssl/certs/ -d --url mqtts://hwjspxxt:7oYugN7Fa5Aa@postman.cloudmqtt.com:27529/habr/test/random
Client mosq/VAzcLVMB1MiWhYxoJS sending CONNECT
Client mosq/VAzcLVMB1MiWhYxoJS received CONNACK (0)
Client mosq/VAzcLVMB1MiWhYxoJS sending SUBSCRIBE (Mid: 1, Topic: habr/test/random, QoS: 0, Options: 0x00)
Client mosq/VAzcLVMB1MiWhYxoJS received SUBACK
Subscribed (mid: 1): 0
Client mosq/r6UwPnDvx8aNInpPF6 received PUBLISH (d0, q0, r1, m0, 'habr/test/random', ... (37 bytes))
Очень важный привет!
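
The retain behavior seen in this exchange can be modeled with a very small in-memory broker. The Python sketch below is only an illustration of the idea: the ToyBroker class is invented for this example, it matches topics exactly (no wildcards), and it ignores QoS, sessions, and clearing a retained message with an empty payload.

```python
class ToyBroker:
    """In-memory model of retained messages (exact topic match only)."""

    def __init__(self):
        self.retained = {}      # topic -> last retained payload
        self.subscribers = {}   # topic -> list of callbacks

    def publish(self, topic, payload, retain=False):
        if retain:
            # Keep the last retained payload for future subscribers.
            self.retained[topic] = payload
        for callback in self.subscribers.get(topic, []):
            # Live delivery to existing subscribers: retain flag is 0.
            callback(topic, payload, False)

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)
        if topic in self.retained:
            # A stored message is sent immediately with retain flag 1.
            callback(topic, self.retained[topic], True)


broker = ToyBroker()
inbox = []

# The publisher sends while nobody is subscribed, then goes offline.
broker.publish("habr/test/random", "A very important hello!", retain=True)

# The subscriber connects later and still receives the stored message.
broker.subscribe("habr/test/random", lambda t, p, r: inbox.append((t, p, r)))
print(inbox)  # [('habr/test/random', 'A very important hello!', True)]
```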

Conclusion

MQTT is a modern, advanced protocol, free of many of the drawbacks of its predecessors. Its flexibility allows you to add client devices without reconfiguring the broker, which saves considerable time. The entry threshold for understanding and configuring the protocol is quite low, and libraries are available for many programming languages, so you can choose any technology stack for development. The message delivery guarantee clearly distinguishes MQTT from its predecessors and spares you from spending time on your own integrity control mechanisms at the network level.
