# Native HTTPS implementation using Crypto++ for I2P bootstrap

On its first start, each new I2P node has to get an initial list of nodes from somewhere. Special servers (reseeds) exist for this purpose, and their addresses are hardcoded. Previously the download was done over HTTP, but recently reseeds began switching to HTTPS. For Purple I2P to keep working, corresponding changes were required. The Crypto++ cryptographic library used there does not support SSL. Instead of adding another library such as OpenSSL, which would essentially duplicate the cryptography, the approach described below was chosen.

Bootstrap is the only place in I2P that uses HTTPS.

The article may also be of interest to anyone who wants to understand how SSL works and try it out themselves.

#### Reinvent the wheel

Our goal is to obtain an i2pseeds.su3 file of about 100K in size from one of the I2P reseed nodes. This file is signed with a separate certificate independent of the host certificate, so verification of the host certificate can be skipped. Because the amount of data is relatively small, we do not need to implement compression or recovery from broken connections.

Only TLS 1.2 and the TLS_RSA_WITH_AES_256_CBC_SHA256 cipher suite will be used. In other words, AES256 in CBC mode is used for data encryption, and RSA is used for key exchange.

This choice was made because AES256-CBC is the most widely used encryption in I2P, and RSA simplifies the protocol implementation by reducing the number of messages required for key exchange. In addition to RSA and AES, the following cryptographic functions from Crypto++ will also be required:

- HMAC for calculating the checksums of encrypted messages and for the pseudo-random function. Note that the standard HMAC implementation is used, not the one from I2P
- SHA256 hash for use with HMAC and for calculating the checksum of all messages involved in establishing a connection
- Functions for parsing ASN.1 structures in DER encoding. Required to retrieve the public key from an X.509 certificate

RSA encryption according to PKCS #1 v1.5 is used. The key length is arbitrary and is determined by the certificate.

#### SSL messaging

Every transmitted message starts with a 5-byte header. The first byte contains the message type, the next 2 bytes are the protocol version number (0x03, 0x03 for TLS 1.2), and the last 2 bytes hold the length of the remaining part (content) of the message in big-endian, which defines the message boundaries.

Thus, when receiving new data, you should first read the 5 bytes of the header and then as many bytes as the length field specifies.

There are 4 types of messages:

- 0x17 - data. The content is encrypted HTTP messages, in our case using AES256 with a key calculated during connection setup. The data size must be a multiple of 16 bytes
- 0x16 - connection establishment. There are several subtypes, defined by the corresponding field inside the content. Unencrypted, with the exception of the 'finished' message, which is sent last
- 0x15 - alert. Reports that "something went wrong" and closes the connection. Contains codes describing what exactly went wrong, which can be used for debugging
- 0x14 - cipher change. Sent immediately after the keys have been agreed. The content is 1 byte that always contains 0x01. In fact, it is part of the connection setup process
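The header layout and the four message types can be sketched as follows. This is only an illustration: the names `RecordType`, `RecordHeader` and `ParseRecordHeader` are hypothetical and are not taken from i2pd.

```cpp
#include <cstddef>
#include <cstdint>

// The four message types listed above.
enum RecordType : uint8_t
{
    kChangeCipherSpec = 0x14,
    kAlert            = 0x15,
    kHandshake        = 0x16,
    kApplicationData  = 0x17
};

// Decoded form of the 5-byte header (hypothetical helper type).
struct RecordHeader
{
    uint8_t  type;   // one of RecordType
    uint8_t  major;  // 0x03 for TLS 1.2
    uint8_t  minor;  // 0x03 for TLS 1.2
    uint16_t length; // number of content bytes that follow the header
};

// Decodes the header; buf must point to at least 5 readable bytes.
inline RecordHeader ParseRecordHeader (const uint8_t * buf)
{
    RecordHeader h;
    h.type = buf[0];
    h.major = buf[1];
    h.minor = buf[2];
    h.length = static_cast<uint16_t>((buf[3] << 8) | buf[4]); // big-endian
    return h;
}
```

A receive loop would read exactly 5 bytes, call `ParseRecordHeader`, and then read `length` more bytes before processing the message.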

In our implementation, the encrypted data is as follows:

- a 16-byte IV for CBC; in TLS 1.2 each message has its own IV;
- the data itself, up to 64K minus the length of the headers;
- a 32-byte MAC calculated over a 13-byte header followed by the data. The header consists of an 8-byte sequence number starting from zero, the message type (0x17 or 0x16), the version and the data length, all in big-endian. The key for the HMAC is also calculated during connection setup;
- padding, so that the length of the encrypted data is a multiple of 16 bytes. The last byte contains the number of padding bytes, not counting itself. If the message length is already a multiple of 16 bytes, another 16 bytes are added to accommodate this length byte.
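The size arithmetic and the 13-byte MAC header described above might look like this. It is a sketch under the assumptions stated in the text (16-byte AES blocks, 32-byte HMAC-SHA256). The function names are hypothetical, and the actual encryption and HMAC calls are left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

const size_t kBlockSize = 16; // AES block size
const size_t kMacSize   = 32; // HMAC-SHA256 output size

// Total number of padding bytes including the final length byte (1..16).
// The value stored in the last byte is PaddingSize (dataLen) - 1.
inline size_t PaddingSize (size_t dataLen)
{
    return kBlockSize - (dataLen + kMacSize) % kBlockSize;
}

// Size of the whole record content: IV + data + MAC + padding.
inline size_t EncryptedSize (size_t dataLen)
{
    return kBlockSize + dataLen + kMacSize + PaddingSize (dataLen);
}

// 13-byte header that is prepended to the data for the MAC calculation.
inline std::array<uint8_t, 13> BuildMacHeader (uint64_t seqn, uint8_t type, uint16_t dataLen)
{
    std::array<uint8_t, 13> h;
    for (size_t i = 0; i < 8; i++) // 8-byte sequence number, big-endian
        h[i] = static_cast<uint8_t>(seqn >> (56 - 8 * i));
    h[8] = type;                   // 0x17 or 0x16
    h[9] = 0x03; h[10] = 0x03;     // TLS 1.2
    h[11] = static_cast<uint8_t>(dataLen >> 8);
    h[12] = static_cast<uint8_t>(dataLen & 0xFF);
    return h;
}
```

For example, 5 bytes of data give 11 bytes of padding and 64 bytes of record content, while 16 bytes of data give a full extra block of 16 padding bytes.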

#### Establish a connection

During connection setup, we must solve two problems:

- Agree on and calculate the keys for encryption and HMAC
- Send the correct sequence of messages so that the other side does not close the connection, but switches to data exchange mode

In our case, the message sequence is as follows:

-> ClientHello (0x01)

<- ServerHello (0x02)

<- Certificate (0x0B)

<- ServerHelloDone (0x0E)

-> ClientKeyExchange (0x10)

-> ChangeCipherSpec

-> Finished (0x14)

<- ChangeCipherSpec

<- Finished (0x14)

where "->" means sending a message and "<-" means receiving one.

All messages except ChangeCipherSpec are of type 0x16 (connection establishment). The content of such a message starts with its own 4-byte header: the first byte is the type of the connection setup message, as indicated above, followed by 3 bytes containing the length of the rest of the message.
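A possible way to decode this inner 4-byte header is shown below. The type codes come from the sequence above. The names `HandshakeType`, `HandshakeHeader` and `ParseHandshakeHeader` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Connection setup message types used in our sequence.
enum HandshakeType : uint8_t
{
    kClientHello       = 0x01,
    kServerHello       = 0x02,
    kCertificate       = 0x0B,
    kServerHelloDone   = 0x0E,
    kClientKeyExchange = 0x10,
    kFinished          = 0x14
};

struct HandshakeHeader
{
    uint8_t  type;   // one of HandshakeType
    uint32_t length; // length of the rest of the message (3 bytes on the wire)
};

// Decodes the header; buf must point to at least 4 readable bytes
// located right after the 5-byte header of a 0x16 message.
inline HandshakeHeader ParseHandshakeHeader (const uint8_t * buf)
{
    HandshakeHeader h;
    h.type = buf[0];
    h.length = (static_cast<uint32_t>(buf[1]) << 16) |
               (static_cast<uint32_t>(buf[2]) << 8) |
                static_cast<uint32_t>(buf[3]); // big-endian, 24 bits
    return h;
}
```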

Let us consider these messages in detail.

##### ClientHello

This is the first message we send to the server once the connection has been established. Since we use one specific cipher suite, it is constant in our case:

```
#include <cstdint>

static uint8_t clientHello[] =
{
    0x16, // record type: handshake
    0x03, 0x03, // version (TLS 1.2)
    0x00, 0x2F, // length of handshake (47 bytes)
    // handshake
    0x01, // handshake type (client hello)
    0x00, 0x00, 0x2B, // length of handshake payload (43 bytes)
    // client hello
    0x03, 0x03, // highest version supported (TLS 1.2)
    0x45, 0xFA, 0x01, 0x19, 0x74, 0x55, 0x18, 0x36,
    0x42, 0x05, 0xC1, 0xDD, 0x4A, 0x21, 0x80, 0x80,
    0xEC, 0x37, 0x11, 0x93, 0x16, 0xF4, 0x66, 0x00,
    0x12, 0x67, 0xAB, 0xBA, 0xFF, 0x29, 0x13, 0x9E, // 32 random bytes
    0x00, // session id length
    0x00, 0x02, // cipher suites length
    0x00, 0x3D, // RSA_WITH_AES_256_CBC_SHA256
    0x01, // compression methods length
    0x00, // no compression
    0x00, 0x00 // extensions length
};
```

This message tells the server that we support TLS 1.2, that this is a new connection (the length of the session identifier is zero) and that we support a single cipher suite, RSA with AES256. We also pass a set of 32 "random" bytes used for generating keys. If these bytes are actually generated randomly, they must be saved somewhere, because they will be needed later.

##### ServerHello

The "twin brother" of ClientHello, except that the message type is 0x02 instead of 0x01 and the session identifier is non-empty. From this message we need only the 32 random bytes.
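Pulling the 32 random bytes out of ServerHello could be sketched like this. The offset assumes the same layout as in the ClientHello listing: a 4-byte connection setup header, then 2 bytes of version, then the random bytes. `Random32` and `ExtractServerRandom` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

typedef std::array<uint8_t, 32> Random32;

// content points at the message content, i.e. right after the 5-byte header.
// Returns false if this is not a ServerHello or the buffer is too short.
inline bool ExtractServerRandom (const uint8_t * content, size_t len, Random32& out)
{
    const size_t offset = 4 + 2; // connection setup header + version
    if (len < offset + out.size () || content[0] != 0x02)
        return false;
    std::memcpy (out.data (), content + offset, out.size ());
    return true;
}
```

The session identifier, cipher suite and compression method that follow the random bytes are simply ignored here, since the text above needs nothing else from this message.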

##### Certificate

This message can contain several certificates. First comes the length of the entire group of certificates, then each certificate is preceded by its own length. We are only interested in the first certificate, so we read a length field twice. The certificate itself is a DER-encoded X.509 certificate, from which we need the RSA public key.
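Locating the first certificate might be done as in the sketch below. It assumes that both length fields are 3 bytes long, as defined for the Certificate message in TLS 1.2. The function names are hypothetical, and extracting the RSA public key from the returned DER data is left to the ASN.1 functions of the crypto library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a 3-byte big-endian length.
inline size_t ReadUint24 (const uint8_t * p)
{
    return (static_cast<size_t>(p[0]) << 16) |
           (static_cast<size_t>(p[1]) << 8) |
            static_cast<size_t>(p[2]);
}

// body points at the Certificate message body, right after its 4-byte
// connection setup header. Returns the DER bytes of the first certificate,
// or an empty vector if the lengths are inconsistent.
inline std::vector<uint8_t> FirstCertificate (const uint8_t * body, size_t len)
{
    if (len < 6) return std::vector<uint8_t> ();
    size_t listLen = ReadUint24 (body);     // length of the whole group
    size_t certLen = ReadUint24 (body + 3); // length of the first certificate
    if (certLen + 3 > listLen || listLen + 3 > len)
        return std::vector<uint8_t> ();
    return std::vector<uint8_t> (body + 6, body + 6 + certLen);
}
```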

##### ServerHelloDone

It does not contain anything useful, but it is taken into account when calculating the hash for Finished.

##### ClientKeyExchange

At this point, we have enough information to generate and agree on the keys, which happens in 3 stages: generating a random secret key, calculating the master key, and expanding the master key to obtain the encryption and MAC keys.

The random secret key is 48 bytes long: the first 2 bytes are the version number (0x03, 0x03), and the remaining 46 are randomly generated. These 48 bytes are then encrypted with the RSA public key and sent to the server together with the length of the encrypted block. Note that the length of the encrypted block equals the length of the RSA key, not 48 bytes. For example, for a certificate with a 2048-bit key this length is 256, and the length of the transmitted data is 258.
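The framing of this message can be sketched as follows, assuming the RSA encryption has already been done elsewhere and its output is passed in as `encrypted`. `BuildClientKeyExchange` is a hypothetical name. The headers are the 5-byte and 4-byte headers described earlier.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Wraps the RSA-encrypted 48-byte secret into a complete message.
inline std::vector<uint8_t> BuildClientKeyExchange (const std::vector<uint8_t>& encrypted)
{
    const size_t encLen = encrypted.size (); // equals the RSA key length, e.g. 256
    const size_t bodyLen = encLen + 2;       // 2-byte length + encrypted block
    const size_t recordLen = bodyLen + 4;    // plus connection setup header

    std::vector<uint8_t> msg;
    msg.reserve (recordLen + 5);
    // 5-byte header
    msg.push_back (0x16);
    msg.push_back (0x03); msg.push_back (0x03);
    msg.push_back (static_cast<uint8_t>(recordLen >> 8));
    msg.push_back (static_cast<uint8_t>(recordLen & 0xFF));
    // 4-byte connection setup header, type 0x10
    msg.push_back (0x10);
    msg.push_back (static_cast<uint8_t>(bodyLen >> 16));
    msg.push_back (static_cast<uint8_t>((bodyLen >> 8) & 0xFF));
    msg.push_back (static_cast<uint8_t>(bodyLen & 0xFF));
    // length of the encrypted block followed by the block itself
    msg.push_back (static_cast<uint8_t>(encLen >> 8));
    msg.push_back (static_cast<uint8_t>(encLen & 0xFF));
    msg.insert (msg.end (), encrypted.begin (), encrypted.end ());
    return msg;
}
```

With a 256-byte encrypted block the body is 258 bytes, matching the numbers above, and the complete message is 267 bytes.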

##### ChangeCipherSpec

Sent immediately after ClientKeyExchange. Always the same:

```
static uint8_t changeCipherSpecs[] =
{
    0x14, // change cipher specs
    0x03, 0x03, // version (TLS 1.2)
    0x00, 0x01, // length
    0x01 // type
};
```

This message is of type 0x14 and is not included in the hash calculated for Finished.

##### Pseudorandom Function (PRF)

To calculate the keys, we need a pseudo-random function that takes 4 input parameters: a secret (initially the secret key just sent to the server), a label in the form of a text string, a block of initial data (seed) and the desired length of the result.

In TLS 1.2, it is defined as follows:

PRF (secret, label, seed) = P_SHA256 (secret, label + seed);

P_SHA256 (secret, seed) = HMAC_SHA256 (secret, A (1) + seed) +

HMAC_SHA256 (secret, A (2) + seed) +

HMAC_SHA256 (secret, A (3) + seed) + ...

where A is defined recursively:

A (0) = seed,

A (i) = HMAC_SHA256 (secret, A (i -1)).

That is, at each step we compute the HMAC of the previous step's result, then compute the HMAC of that result concatenated with the label and the seed, and repeat this until the desired length is obtained.
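The iteration can be sketched in a few lines. To keep the example independent of any particular library, the HMAC is passed in as a callable. In the real project it would be HMAC-SHA256 from the crypto library (Crypto++ provides `CryptoPP::HMAC<CryptoPP::SHA256>` for this). The names `Bytes`, `Hmac` and this `PRF` signature are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

typedef std::vector<uint8_t> Bytes;
// hmac (key, data) returns the MAC; assumed to be HMAC-SHA256 in real use.
typedef std::function<Bytes (const Bytes& key, const Bytes& data)> Hmac;

inline Bytes PRF (const Hmac& hmac, const Bytes& secret,
    const std::string& label, const Bytes& seed, size_t len)
{
    // PRF (secret, label, seed) = P_SHA256 (secret, label + seed)
    Bytes labelSeed (label.begin (), label.end ());
    labelSeed.insert (labelSeed.end (), seed.begin (), seed.end ());

    Bytes result;
    Bytes a = labelSeed; // A (0)
    while (result.size () < len)
    {
        a = hmac (secret, a); // A (i) = HMAC (secret, A (i - 1))
        Bytes input = a;      // A (i) + label + seed
        input.insert (input.end (), labelSeed.begin (), labelSeed.end ());
        Bytes block = hmac (secret, input);
        if (block.empty ()) break; // guard against a broken hmac callable
        result.insert (result.end (), block.begin (), block.end ());
    }
    result.resize (len); // cut the last block down to the requested length
    return result;
}
```

Since each HMAC-SHA256 call yields 32 bytes, a 48-byte result takes two iterations and a 128-byte result takes four.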

Now the master key is calculated using the formula

PRF (secret, "master secret", clientRandom + serverRandom, 48);

where clientRandom is the 32 random bytes from ClientHello, and serverRandom is the 32 random bytes from ServerHello.

The master key is then expanded into a 128-byte block containing four 32-byte keys in the following order: MAC key for sending, MAC key for receiving, encryption key for sending, decryption key for receiving.

The MAC key for receiving is not used in our implementation.

Key expansion is performed according to the formula

PRF (masterSecret, "key expansion", serverRandom + clientRandom, 128)

Note that clientRandom and serverRandom are swapped here compared with the master key formula.
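Splitting the 128-byte block into the four keys, in the order given above, could look like this. `SessionKeys` and `SplitKeyBlock` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

typedef std::array<uint8_t, 32> Key32;

struct SessionKeys
{
    Key32 macSend;        // MAC key for sending
    Key32 macReceive;     // MAC key for receiving (unused in our case)
    Key32 encryptSend;    // AES256 key for sending
    Key32 decryptReceive; // AES256 key for receiving
};

// block must point to the 128 bytes produced by the key expansion.
inline SessionKeys SplitKeyBlock (const uint8_t * block)
{
    SessionKeys k;
    std::memcpy (k.macSend.data (),        block,      32);
    std::memcpy (k.macReceive.data (),     block + 32, 32);
    std::memcpy (k.encryptSend.data (),    block + 64, 32);
    std::memcpy (k.decryptReceive.data (), block + 96, 32);
    return k;
}
```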

##### Finished

At this point, we have everything we need to start exchanging data, but, unfortunately, we must send a Finished message containing the correct data, otherwise the server will disconnect.

While all the previous messages were fairly trivial, Finished is more complicated. First, it is of type 0x16, but its content is fully encrypted, and 0x16 (not 0x17, as for other encrypted messages) also appears in the header used for the MAC calculation.

The message itself contains the first 12 bytes of

PRF (masterSecret, "client finished", hash, 12)

where hash is the SHA256 of the following sequence of messages:

ClientHello, ServerHello, Certificate, ServerHelloDone, ClientKeyExchange. All messages are taken without the 5-byte header.
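One way to keep track of the data to be hashed is to accumulate the content of every 0x16 message as it is sent or received, as in the sketch below. It also shows the unencrypted content of Finished, assuming the usual 4-byte connection setup header with type 0x14 and length 12. `HandshakeTranscript` and `BuildFinished` are hypothetical names. The SHA256, PRF and encryption steps are not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<uint8_t> Bytes;

// Collects connection setup messages for the Finished hash.
class HandshakeTranscript
{
    public:

        // record points at a complete message including its 5-byte header
        void AddRecord (const uint8_t * record, size_t len)
        {
            if (len <= 5 || record[0] != 0x16) return; // e.g. ChangeCipherSpec is skipped
            m_Data.insert (m_Data.end (), record + 5, record + len);
        }

        // data to be passed to SHA256
        const Bytes& GetData () const { return m_Data; }

    private:

        Bytes m_Data;
};

// Content of Finished before encryption: header plus 12 bytes from the PRF.
inline Bytes BuildFinished (const Bytes& verifyData)
{
    if (verifyData.size () < 12) return Bytes ();
    Bytes msg = { 0x14, 0x00, 0x00, 0x0C };
    msg.insert (msg.end (), verifyData.begin (), verifyData.begin () + 12);
    return msg;
}
```

The client's own Finished must not be added to the transcript before its hash is computed.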

If the message is generated correctly, the server will respond with ChangeCipherSpec and Finished, otherwise with an error message.

After that, the server is ready for data exchange, and we send our HTTP request and receive the response.

#### Conclusions

The approach described in this article makes it possible to work with HTTPS in applications that do not require a full implementation of it. Instead of third-party SSL implementations that bring their own cryptography, you can use the cryptography already present in the project, as shown here with Crypto++. This reduces the number of dependencies and improves maintainability and portability.

Implemented and used in practice in i2pd, a C++ implementation of I2P.