Want to encrypt any TCP connection? Now you have NoiseSocket
Hi, %username%!
Not everything in this world revolves around browsers, and there are situations where TLS is overkill or not applicable at all. Certificates are not always needed; very often plain public keys are enough, as with SSH.
And then there is IoT, where fitting a full TLS stack is not a task for the faint of heart. And the backend, where, I'm pretty sure, everything behind the load balancer talks to everything else over plain HTTP. And P2P, and more, and more...
Not so long ago, the Noise Protocol Framework specification was published. It is essentially a construction kit for secure data transfer protocols: it describes, in a simple language, the handshake stage and what happens after it. The author is Trevor Perrin, a lead developer of the Signal messenger, and Noise itself is used in WhatsApp. So there was good reason to take a closer look at this protocol framework.
We liked its simplicity and conciseness so much that we decided to build a whole new network-level protocol on top of it, one that is not inferior to TLS in security and in some ways even surpasses it. We presented it at DEF CON 25, where it was very warmly received. It's time to talk about it and about us.
First, a little about the NoiseSocket core itself, namely
Noise Protocol Framework
In essence, any protocol described by the Noise Framework is a sequence of transmitted public keys and Diffie-Hellman operations performed on them.
The main idea of the Noise Protocol Framework is that absolutely every action during the handshake affects the state of the protocol, and therefore the resulting session keys. Every DH result and every piece of additional data that is transmitted or taken into account during the handshake is mixed into the shared state by hashing, and together they produce the shared symmetric keys.
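The mixing described above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions: SHA-256 as the hash, the HMAC-based two-output HKDF from the Noise specification, and made-up names (`symmetricState`, `mixHash`, `mixKey`, `newState`) that are not the API of any particular library. Encrypting the mixed data is left out.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// symmetricState is an illustrative (hypothetical) type: h is a running hash
// of everything seen so far, ck is the chaining key, k is the current key.
type symmetricState struct {
	h, ck, k [32]byte
}

func hmacSHA256(key, data []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(data)
	return m.Sum(nil)
}

// hkdf2 derives two 32-byte outputs from the chaining key and new input.
func hkdf2(ck, input []byte) (out1, out2 []byte) {
	temp := hmacSHA256(ck, input)
	out1 = hmacSHA256(temp, []byte{0x01})
	out2 = hmacSHA256(temp, append(append([]byte{}, out1...), 0x02))
	return out1, out2
}

// newState seeds h and ck from the protocol name: short names are
// zero-padded, longer ones are hashed.
func newState(protocolName string) *symmetricState {
	s := &symmetricState{}
	if len(protocolName) <= len(s.h) {
		copy(s.h[:], protocolName)
	} else {
		s.h = sha256.Sum256([]byte(protocolName))
	}
	s.ck = s.h
	return s
}

// mixHash folds public data (keys, prologue, payloads) into h.
func (s *symmetricState) mixHash(data []byte) {
	s.h = sha256.Sum256(append(s.h[:], data...))
}

// mixKey folds a DH result into ck and derives a fresh key k from it.
func (s *symmetricState) mixKey(dhOutput []byte) {
	ck, k := hkdf2(s.ck[:], dhOutput)
	copy(s.ck[:], ck)
	copy(s.k[:], k)
}

func main() {
	client := newState("Noise_XX_25519_AESGCM_SHA256")
	server := newState("Noise_XX_25519_AESGCM_SHA256")
	for _, s := range []*symmetricState{client, server} {
		s.mixHash([]byte("prologue"))
		s.mixKey([]byte("pretend this is a DH output"))
	}
	fmt.Println("same inputs, same key:", client.k == server.k)
}
```

If either side mixes in even one different byte, the hashes and keys diverge, which is what makes tampering with any handshake step detectable.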
All this happens inside a simple system of states, which consists of three parts.
By the way, there is a video in English in which David Wong explains quite accessibly how it all works.
HandshakeState is responsible for processing tokens and messages.
SymmetricState generates symmetric keys from DH results and updates them with each new DH. Thus, immediately after the very first DH, everything that follows it (static keys, payload) is already encrypted with a symmetric key.
SymmetricState also hashes additional data, such as the keys themselves, the optional prologue, protocol names, and so on. All this keeps the protocol coherent and protected from outside interference at every stage of data transfer.
CipherState is simply a symmetric AEAD cipher plus a nonce (counter). It is initialized with a key, and the nonce is incremented on every call to the encryption function.
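To make the last of the three parts concrete, here is a minimal Go sketch of a CipherState-like object. It assumes AES-256-GCM from the standard library and the nonce layout the Noise specification gives for AES-GCM (four zero bytes followed by the big-endian 64-bit counter). The names `cipherState`, `encrypt`, and `decrypt` are illustrative, not taken from any library.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// cipherState is an illustrative AEAD cipher plus a counter nonce.
type cipherState struct {
	aead cipher.AEAD
	n    uint64
}

func newCipherState(key []byte) (*cipherState, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &cipherState{aead: aead}, nil
}

// nonce encodes the counter as 4 zero bytes plus a big-endian uint64.
func (c *cipherState) nonce() []byte {
	var nonce [12]byte
	binary.BigEndian.PutUint64(nonce[4:], c.n)
	return nonce[:]
}

// encrypt seals the plaintext and advances the counter.
func (c *cipherState) encrypt(ad, plaintext []byte) ([]byte, error) {
	if c.n == math.MaxUint64 {
		return nil, errors.New("nonce exhausted")
	}
	out := c.aead.Seal(nil, c.nonce(), plaintext, ad)
	c.n++
	return out, nil
}

// decrypt opens the ciphertext and advances the counter only on success.
func (c *cipherState) decrypt(ad, ciphertext []byte) ([]byte, error) {
	out, err := c.aead.Open(nil, c.nonce(), ciphertext, ad)
	if err != nil {
		return nil, err
	}
	c.n++
	return out, nil
}

func main() {
	key := make([]byte, 32) // all-zero demo key; real keys come from the handshake
	tx, err := newCipherState(key)
	if err != nil {
		panic(err)
	}
	rx, err := newCipherState(key)
	if err != nil {
		panic(err)
	}
	for _, msg := range []string{"ping", "ping"} {
		ct, err := tx.encrypt(nil, []byte(msg))
		if err != nil {
			panic(err)
		}
		pt, err := rx.decrypt(nil, ct)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%x -> %s\n", ct, pt)
	}
}
```

Because the counter is part of the nonce, the same plaintext encrypts differently each time, and a replayed or reordered packet fails to decrypt.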
Protocols in Noise are described in a special language, which consists of patterns, messages, and tokens.
Consider, for example, one of the protocols, Noise_XX, which establishes a secure connection while exchanging the static keys of the server and the client in the process:
Noise_XX(s, rs):
-> e
<- e, ee, s, es
-> s, se
Noise_XX is a pattern. It describes the sequence of messages and their contents. (s, rs) means that the client and the server are initialized with their static (s) key pairs, the ones that are generated once; r stands for remote. As you can see, there are three lines with arrows. One line is one message. The arrow shows who sends to whom: an arrow pointing right means client to server, an arrow pointing left means server to client.
Each line consists of tokens. These are one- or two-letter expressions separated by commas. The only single-letter tokens are e and s, which stand for ephemeral and static public keys, respectively. An ephemeral key is generated once per connection; a static key is reused.
In general, all Noise protocols begin with the transmission of an ephemeral key. This is how Perfect Forward Secrecy is achieved. TLS 1.3 did much the same thing when it dropped all non-ephemeral cipher suites.
Two-letter tokens denote a Diffie-Hellman operation between one of the client's keys and one of the server's keys. As you might guess, there are four of them:
ee, es, se, ss. Depending on which keys the DH is performed between, it serves a different purpose. ee, for example, randomizes the final key of the transport session, while the DH operations that involve static keys provide mutual authentication.
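The DH tokens used by the XX pattern can be illustrated with the standard `crypto/ecdh` package (Go 1.20 or newer). This is only a sketch of which key pairs meet in each token, not a handshake implementation: the first letter refers to the initiator's key and the second to the responder's, and the names `party`, `initiatorDH`, and `responderDH` are made up for the example.

```go
package main

import (
	"bytes"
	"crypto/ecdh"
	"crypto/rand"
	"fmt"
)

// party holds one side's ephemeral (e) and static (s) X25519 key pairs.
type party struct {
	e, s *ecdh.PrivateKey
}

func newKey() *ecdh.PrivateKey {
	k, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return k
}

func dh(priv *ecdh.PrivateKey, pub *ecdh.PublicKey) []byte {
	out, err := priv.ECDH(pub)
	if err != nil {
		panic(err)
	}
	return out
}

// initiatorDH computes ee, es, se from the client's side, given the
// server's ephemeral (re) and static (rs) public keys.
func initiatorDH(i party, re, rs *ecdh.PublicKey) [3][]byte {
	return [3][]byte{dh(i.e, re), dh(i.e, rs), dh(i.s, re)}
}

// responderDH computes the same ee, es, se from the server's side, given
// the client's ephemeral (re) and static (rs) public keys.
func responderDH(r party, re, rs *ecdh.PublicKey) [3][]byte {
	return [3][]byte{dh(r.e, re), dh(r.s, re), dh(r.e, rs)}
}

func main() {
	client := party{e: newKey(), s: newKey()}
	server := party{e: newKey(), s: newKey()}

	c := initiatorDH(client, server.e.PublicKey(), server.s.PublicKey())
	s := responderDH(server, client.e.PublicKey(), client.s.PublicKey())

	for i, name := range []string{"ee", "es", "se"} {
		fmt.Println(name, "matches:", bytes.Equal(c[i], s[i]))
	}
}
```

Each side uses only its own private keys and the other side's public keys, yet both arrive at the same three shared secrets, which are then mixed into the handshake state.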
As you can see, in the XX pattern the client's static key is sent to the server, and vice versa. That is why three messages are needed here. There are patterns that assume the client already has the server's static key (for example, because it has previously completed an XX handshake) and reduce the number of messages to two. In addition, this makes it possible to send encrypted data right in the first message, which is called 0-RTT and reduces the server's response time.
A payload can be added to each handshake message. It can carry upper-level protocol settings, certificates, plain digital signatures, in general anything that fits within 64 KB. All Noise messages are limited to this size. This simplifies parsing: the length always fits in 2 bytes, and memory is easier to manage.
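The practical benefit of the size limit is easy to see in code. Below is a minimal Go sketch of length-prefixed framing with a 2-byte length, assuming big-endian byte order; `writeMessage`, `readMessage`, and `roundTrip` are illustrative helper names rather than functions from an existing library.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// maxMessageLen is the largest length a 2-byte prefix can express.
const maxMessageLen = 65535

// writeMessage writes a 2-byte big-endian length followed by the message.
func writeMessage(w io.Writer, msg []byte) error {
	if len(msg) > maxMessageLen {
		return errors.New("message exceeds 65535 bytes")
	}
	var hdr [2]byte
	binary.BigEndian.PutUint16(hdr[:], uint16(len(msg)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(msg)
	return err
}

// readMessage reads exactly one length-prefixed message from the stream.
func readMessage(r io.Reader) ([]byte, error) {
	var hdr [2]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	msg := make([]byte, binary.BigEndian.Uint16(hdr[:]))
	if _, err := io.ReadFull(r, msg); err != nil {
		return nil, err
	}
	return msg, nil
}

// roundTrip writes all messages into an in-memory stream and reads them back.
func roundTrip(msgs ...[]byte) ([][]byte, error) {
	var stream bytes.Buffer
	for _, m := range msgs {
		if err := writeMessage(&stream, m); err != nil {
			return nil, err
		}
	}
	var out [][]byte
	for range msgs {
		m, err := readMessage(&stream)
		if err != nil {
			return nil, err
		}
		out = append(out, m)
	}
	return out, nil
}

func main() {
	out, err := roundTrip([]byte("first"), []byte("second"))
	if err != nil {
		panic(err)
	}
	for _, m := range out {
		fmt.Printf("%d bytes: %s\n", len(m), m)
	}
}
```

The receiver never needs a buffer larger than 65535 bytes per message, which matters on small devices.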
After the handshake, all we hold is, in fact, two symmetric keys that are the result of every DH performed earlier: one for sending messages, the other for receiving them. That is all; from then on we can send encrypted packets, incrementing the nonce after each one.
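How one shared secret turns into two directional keys can be sketched as follows. The example assumes SHA-256 and the two-output HMAC-based HKDF from the Noise specification, where the final chaining key is expanded with empty input and the initiator sends with the first output. The names `split`, `keysFor`, and `transportKeys` are illustrative only.

```go
package main

import (
	"bytes"
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

func hmacSHA256(key, data []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(data)
	return m.Sum(nil)
}

// hkdf2 derives two 32-byte outputs from a chaining key and input material.
func hkdf2(ck, input []byte) (out1, out2 []byte) {
	temp := hmacSHA256(ck, input)
	out1 = hmacSHA256(temp, []byte{0x01})
	out2 = hmacSHA256(temp, append(append([]byte{}, out1...), 0x02))
	return out1, out2
}

// split expands the final chaining key into the two transport keys.
func split(ck []byte) (k1, k2 []byte) {
	return hkdf2(ck, nil)
}

// transportKeys holds one side's view of the session.
type transportKeys struct {
	send, recv []byte
}

// keysFor assigns directions: the initiator sends with the first key,
// the responder sends with the second.
func keysFor(ck []byte, initiator bool) transportKeys {
	k1, k2 := split(ck)
	if initiator {
		return transportKeys{send: k1, recv: k2}
	}
	return transportKeys{send: k2, recv: k1}
}

func main() {
	// Stand-in for the chaining key both sides hold after the handshake.
	ck := sha256.Sum256([]byte("result of all previous DH operations"))

	client := keysFor(ck[:], true)
	server := keysFor(ck[:], false)

	fmt.Println("client->server keys agree:", bytes.Equal(client.send, server.recv))
	fmt.Println("server->client keys agree:", bytes.Equal(server.send, client.recv))
}
```

Using a separate key per direction means each side can run its own nonce counter from zero without the two ever colliding.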
Besides the pattern, a Noise protocol is characterized by the algorithms it uses in each case. The specification lists algorithms for DH, AEAD, and hashing. Noise needs nothing else.
DH: Curve25519, Curve448
AEAD: AES-GCM, ChaCha20-Poly1305
Hash: BLAKE2, SHA-2
All the primitives are very fast: no RSA or other slow baggage. Although, of course, you can add your own if you want; nobody forbids it.
NoiseSocket
We looked at all this beauty and realized that it was an ideal candidate for the role of a next-generation transport protocol. After all, it has everything right out of the box: the necessary security features, performance, and the ability to plug in your own authentication mechanisms. And the expected code size should make it possible to set up proper secure connections from any device, even the smallest. So we started thinking.
I wrote the first PoC in Go around New Year 2017. It added almost nothing to the original Noise, only packet lengths. I showed it to the team, posted it to the Noise mailing list, and by the end of June we finally arrived at a common denominator that could be implemented on more platforms.
So what did we add to Noise in the end? After long conversations and dozens of options, there were essentially only three things left:
- Negotiation data
- Padding
- Processing rules
Negotiation data
This is just a set of bytes into which you can put anything you want. It is needed to negotiate algorithms and patterns between the client and the server. In our version, it looks like this:
Only 6 bytes, but this is enough for the server to understand how to process any Noise messages that get to it.
Padding
I am very glad that it made it into the final spec; otherwise everyone would have to invent it themselves. Padding extends packets to hide their true size and prevents guessing the contents even without decryption. It is implemented as an additional 2 bytes at the start of the encrypted data, which indicate the actual data size. Everything after that is garbage to be thrown away.
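A minimal Go sketch of this padding scheme, applied to the plaintext before it is encrypted, might look like this. The big-endian byte order of the length field and the helper names `pad` and `unpad` are assumptions made for the example; the filler here is random bytes, although any content would do since it is discarded.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
)

// pad prefixes data with its 2-byte length and fills the rest of the
// packet with random bytes up to paddedLen.
func pad(data []byte, paddedLen int) ([]byte, error) {
	if len(data) > 65535 {
		return nil, errors.New("data too long for a 2-byte length")
	}
	if paddedLen < 2+len(data) {
		return nil, errors.New("padded length too small")
	}
	out := make([]byte, paddedLen)
	binary.BigEndian.PutUint16(out, uint16(len(data)))
	copy(out[2:], data)
	if _, err := rand.Read(out[2+len(data):]); err != nil {
		return nil, err
	}
	return out, nil
}

// unpad reads the 2-byte length and returns only the real data.
func unpad(padded []byte) ([]byte, error) {
	if len(padded) < 2 {
		return nil, errors.New("packet too short")
	}
	n := int(binary.BigEndian.Uint16(padded))
	if n > len(padded)-2 {
		return nil, errors.New("declared length exceeds packet")
	}
	return padded[2 : 2+n], nil
}

func main() {
	for _, msg := range []string{"yes", "no, absolutely not"} {
		packet, err := pad([]byte(msg), 64)
		if err != nil {
			panic(err)
		}
		data, err := unpad(packet)
		if err != nil {
			panic(err)
		}
		fmt.Printf("on the wire: %d bytes, real data: %q\n", len(packet), data)
	}
}
```

Both messages occupy the same number of bytes before encryption, so an observer cannot tell a short answer from a long one by size alone.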
Processing rules
These are a few simple rules for how the server responds to the client when it accepts the client's message or, on the contrary, does not understand it and wants to switch to another protocol.
Why?
At Virgil we have a PKI, and we really missed the ability to establish secure connections directly with public keys instead of using certificates and then validating everything once more on top. Now we have built an NGINX module and can serve the entire backend over NoiseSocket, adding digital signatures of the static keys to it.
You probably think that you need to change everything in order to switch to NoiseSocket? But no.
If you write in Go and already have HTTP services, you only need to replace the DialTLS method for clients and Listen for servers; everything else will think it is working over TLS. This is all thanks to the Go library, which we also implemented.
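Here is roughly where those two hooks sit in a standard `net/http` program. This is only a sketch of the wiring: `dialSecure` and `listenSecure` are hypothetical placeholders that fall back to plain TCP so the example runs on its own, and in a real setup they would call the NoiseSocket library's dial and listen functions, whose exact names are not shown here.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

// dialSecure is a hypothetical stand-in for a NoiseSocket dialer.
// Here it opens a plain TCP connection so the sketch is runnable.
func dialSecure(network, addr string) (net.Conn, error) {
	return net.Dial(network, addr)
}

// listenSecure is a hypothetical stand-in for a NoiseSocket listener.
func listenSecure(network, addr string) (net.Listener, error) {
	return net.Listen(network, addr)
}

// fetch starts a server on the custom listener and queries it through a
// client whose DialTLS hook supplies the connection.
func fetch() (string, error) {
	ln, err := listenSecure("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello over a custom transport")
	})
	go http.Serve(ln, handler)

	// For https:// URLs the transport uses DialTLS and treats the returned
	// connection as already secured.
	client := &http.Client{Transport: &http.Transport{DialTLS: dialSecure}}
	resp, err := client.Get("https://" + ln.Addr().String() + "/")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := fetch()
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

The handlers, routing, and client code stay untouched; only the functions that produce `net.Conn` and `net.Listener` values change.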
Of course, there is still a lot of work to do on the code and specification, but hell, we have an alternative to TLS!
Now you do not need to invent something of your own when you want to build a secure link between two nodes with only public keys on hand. Tor, I2P, and Bitcoin, where a node is often identified by its public key, can use NoiseSocket right away without any additions.
SSH, VPNs, and all kinds of tunnels can add digital signatures of static keys and get a full-fledged secure link with minimal overhead, without dragging in OpenSSL; libsodium or even NaCl is enough.
The approximate size of the compiled crypto primitives depends, of course, on the architecture, but it gives us hope that a minimal implementation of NoiseSocket can fit in 30 or even 20 kilobytes. On devices where the private key is embedded in hardware, there will simply be no real alternative to this solution.
Conclusion
I had high hopes for TLS 1.3, since it cut handshake round trips from 8-9 to three, as in Noise, and added 25519. But:
First, they decided not to add the ability to work without certificates, using keys only, although such a proposal was made.
Second, it is unclear when Ed25519 certificates will appear, whereas in Noise I can use 25519 signatures today.
In addition, not so long ago one of the Noise patterns, IK (which is 0-RTT), received a formal proof of correctness from the authors of WireGuard VPN, which only strengthened our confidence that we had made the right choice.
I suggest that you familiarize yourself with the NoiseSocket specification and I’m sure that in some projects it will suit you more than TLS. Meanwhile, we are waiting for your comments.
References
Noise Socket
Github
Specification Noise Protocol Framework Specification