Why Telegram Passport Is Not End-to-End

    Hi, %username%!



    The comments on the Passport announcement turned into a heated debate about the security of the Telegram authors' latest creation.

    Let's take a look at how it encrypts your personal data and talk about what real end-to-end encryption looks like.

    In a nutshell, here is how Passport works:

    • Locally, you use a password to encrypt your personal data (name, email, scan of passport, other documents).
    • Encrypted data + meta information is uploaded to the Telegram cloud.
    • When you need to log in to a service, the client downloads the data from the cloud, decrypts it with the password, re-encrypts it with the public key of the service that requested the information, and sends it.

    We will consider the first part, which concerns the encryption and storage of personal data.

    According to the developers, this is end-to-end because the Telegram cloud supposedly cannot decrypt your personal data and sees only “random noise”.

    Let's take a closer look at the desktop client's code for encrypting personal data, which is located here, and see whether its output satisfies the end-to-end criteria.

    It all starts with a password. Here is the place where it turns into an intermediate encryption key.

    bytes::vector CountPasswordHashForSecret(
    		bytes::const_span salt,
    		bytes::const_span password) {
    	return openssl::Sha512(bytes::concatenate(
    		salt,
    		password,
    		salt));
    }
    

    Here a random salt is placed on both sides of the password and the result is run through SHA-512 once. At first glance, nothing unusual. But!

    It is 2018. A single good GPU can compute about one and a half billion SHA-512 hashes per second. Ten GPUs will go through every possible 8-character password over a 94-character alphabet (English letters, digits, special characters) in less than 5 days.

    Techniques that make life hard for anyone brute-forcing passwords on a GPU have existed for a long time, but the Telegram developers did not bother to implement them.
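
    The arithmetic behind that estimate can be sketched as follows. This is a rough model, not a benchmark: the function name and the hashesPerGuess parameter are illustrative, and the figure of 100,000 hash invocations per guess used in the note below is a hypothetical work factor in the spirit of an iterated KDF such as PBKDF2.

```cpp
#include <cassert>
#include <cmath>

// Days needed to try every password of the given length.
// All parameters are illustrative inputs, not measurements.
double exhaustiveSearchDays(double alphabetSize, int length,
                            double hashesPerSecondPerGpu, int gpus,
                            double hashesPerGuess) {
    assert(alphabetSize > 1 && length > 0);
    assert(hashesPerSecondPerGpu > 0 && gpus > 0 && hashesPerGuess > 0);
    // Size of the search space: alphabetSize^length candidates.
    const double candidates = std::pow(alphabetSize, length);
    // Each guess costs hashesPerGuess hash invocations.
    const double guessesPerSecond =
        hashesPerSecondPerGpu * gpus / hashesPerGuess;
    return candidates / guessesPerSecond / 86400.0;  // seconds -> days
}
```

    With the figures from the text, exhaustiveSearchDays(94, 8, 1.5e9, 10, 1) comes to about 4.7 days. Raising hashesPerGuess to the assumed 100,000 multiplies that by the same factor, which puts the same search at well over a thousand years.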

    Moving on. The password hash is used to encrypt another, almost random key, which is generated like this:

    bytes::vector GenerateSecretBytes() {
    	auto result = bytes::vector(kSecretSize);
    	memset_rand(result.data(), result.size());
    	const auto full = ranges::accumulate(
    		result,
    		0ULL,
    		[](uint64 sum, gsl::byte value) { return sum + uchar(value); });
    	const auto mod = (full % 255ULL);
    	const auto add = 255ULL + 239 - mod;
    	auto first = (static_cast<uchar>(result[0]) + add) % 255ULL;
    	result[0] = static_cast<gsl::byte>(first);
    	return result;
    }
    

    This key is then used to encrypt the data, together with another component described below.
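
    To see what the adjustment of the first byte achieves, here is a minimal standalone sketch of the same arithmetic. It is not the Telegram code: std::mt19937 stands in for memset_rand and is not a cryptographic generator, and the function names and the size parameter are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Sum of all bytes, reduced modulo 255.
unsigned byteSumMod255(const std::vector<std::uint8_t>& bytes) {
    std::uint64_t sum = 0;
    for (std::uint8_t b : bytes) sum += b;
    return static_cast<unsigned>(sum % 255);
}

// Random bytes whose sum is forced to 239 modulo 255 by patching byte 0.
// `size` stands in for kSecretSize (hypothetical parameter).
std::vector<std::uint8_t> makeTaggedSecret(std::size_t size,
                                           std::mt19937& rng) {
    assert(size > 0);
    std::uniform_int_distribution<int> byteDist(0, 255);
    std::vector<std::uint8_t> secret(size);
    for (auto& b : secret) b = static_cast<std::uint8_t>(byteDist(rng));
    const unsigned mod = byteSumMod255(secret);
    const unsigned add = 255 + 239 - mod;  // same arithmetic as the original
    // The result of % 255 is 0..254, so byte 0 can never be 255.
    secret[0] = static_cast<std::uint8_t>((secret[0] + add) % 255);
    return secret;
}
```

    Whatever the random input, byteSumMod255 of the result is 239. The price is that the first byte is no longer uniformly distributed and the key as a whole carries a known checksum.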

    It is only “almost” random because the Telegram developers, apparently unaware of HMAC and AEAD, do not use standard means to verify that decryption succeeded. Instead they force the sum of the key bytes to leave a remainder of 239 when divided by 255, which is then checked after decryption:

    bool CheckBytesMod255(bytes::const_span bytes) {
    	const auto full = ranges::accumulate(
    		bytes,
    		0ULL,
    		[](uint64 sum, gsl::byte value) { return sum + uchar(value); });
    	const auto mod = (full % 255ULL);
    	return (mod == 239);
    }
    

    First, this byte array is not quite random. Second, a brute-force search will hit plenty of false positives, but summing the bytes after decryption is far cheaper than computing an HMAC, so this ingenious design helps the attacker more than it helps the user.

    Moving on to the method that actually encrypts the data. It is long, so let's take it in pieces:
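
    How often a wrong key slips through this check can be estimated with a small simulation. The sketch below assumes that decrypting with a wrong key yields bytes that look uniformly random, and models them with std::mt19937. The function names, the secret size and the seed are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Same test as CheckBytesMod255: byte sum must equal 239 modulo 255.
bool checkBytesMod255(const std::vector<std::uint8_t>& bytes) {
    std::uint64_t sum = 0;
    for (std::uint8_t b : bytes) sum += b;
    return sum % 255 == 239;
}

// Fraction of uniformly random byte strings that pass the check.
double falsePositiveRate(std::size_t secretSize, int trials, unsigned seed) {
    assert(secretSize > 0 && trials > 0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> byteDist(0, 255);
    std::vector<std::uint8_t> candidate(secretSize);
    int passed = 0;
    for (int t = 0; t < trials; ++t) {
        for (auto& b : candidate) b = static_cast<std::uint8_t>(byteDist(rng));
        if (checkBytesMod255(candidate)) ++passed;
    }
    return static_cast<double>(passed) / trials;
}
```

    Under that assumption the rate should come out close to 1/255, roughly 0.4%. In other words, a single pass of additions discards about 99.6% of wrong guesses, and the survivors still have to be eliminated by the later steps.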

    EncryptedData EncryptData(
    		bytes::const_span bytes,
    		bytes::const_span dataSecret) {
    	constexpr auto kFromPadding = kMinPadding + kAlignTo - 1;
    	constexpr auto kPaddingDelta = kMaxPadding - kFromPadding;
    	const auto randomPadding = kFromPadding
    		+ (rand_value<uint32>() % kPaddingDelta);
    	const auto padding = randomPadding
    		- ((bytes.size() + randomPadding) % kAlignTo);
    	Assert(padding >= kMinPadding && padding <= kMaxPadding);
    	auto unencrypted = bytes::vector(padding + bytes.size());
    	Assert(unencrypted.size() % kAlignTo == 0);
    	unencrypted[0] = static_cast<gsl::byte>(padding);
    	memset_rand(unencrypted.data() + 1, padding - 1);
    	bytes::copy(
    		gsl::make_span(unencrypted).subspan(padding),
    		bytes);
    

    Here between 32 and 255 bytes of padding are prepended to the data: one length byte followed by random bytes. This is done to randomize the dataHash variable, which is a hash of the unencrypted data mixed with those random bytes.
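
    The padding arithmetic is easier to follow in isolation. The sketch below mirrors it under stated assumptions: the 32 and 255 bounds come from the text, while the alignment of 16 (the AES block size) is assumed here, and paddingFor is a hypothetical helper that takes the random value as an argument instead of calling rand_value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed constants: 32..255 is the range given in the text,
// 16 is the AES block size.
constexpr std::size_t kMinPadding = 32;
constexpr std::size_t kMaxPadding = 255;
constexpr std::size_t kAlignTo = 16;

// Padding length for dataSize bytes, given one random 32-bit value.
std::size_t paddingFor(std::size_t dataSize, std::uint32_t randomValue) {
    constexpr std::size_t kFromPadding = kMinPadding + kAlignTo - 1;
    constexpr std::size_t kPaddingDelta = kMaxPadding - kFromPadding;
    // Pick a value high enough that rounding down stays >= kMinPadding.
    const std::size_t randomPadding =
        kFromPadding + (randomValue % kPaddingDelta);
    // Round down so that padding + data is a multiple of kAlignTo.
    const std::size_t padding =
        randomPadding - ((dataSize + randomPadding) % kAlignTo);
    assert(padding >= kMinPadding && padding <= kMaxPadding);
    assert((dataSize + padding) % kAlignTo == 0);
    return padding;
}
```

    For example, with these assumed constants paddingFor(100, 0) starts from 47, subtracts 147 % 16 = 3, and returns 44, so the padded buffer is 144 bytes long, a whole number of AES blocks.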

    	const auto dataHash = openssl::Sha256(unencrypted);
    	const auto bytesForEncryptionKey = bytes::concatenate(
    		dataSecret,
    		dataHash);
    	auto params = PrepareAesParams(bytesForEncryptionKey);
    	return {
    		{ dataSecret.begin(), dataSecret.end() },
    		{ dataHash.begin(), dataHash.end() },
    		Encrypt(unencrypted, std::move(params))
    	};
    }
    

    This is where the encryption key for the personal data is generated. It is derived with another SHA-512 call over the almost random key generated above, concatenated with dataHash.

    Summary


    What gets sent to the cloud:

    1. The hash of the personal data mixed with random bytes
    2. The almost random key, encrypted with the password hash
    3. The salt
    4. The encrypted data

    This is not "random noise": everything an attacker needs is there, including the encryption key, protected only by a password. That makes it possible to get to user data much, much faster than by trying every possible AES key (2^256).

    The mechanisms invented by the Telegram authors are also highly questionable: validating the key by the sum of its bytes, letting the data itself take part in forming the encryption key, and using a plain hash of the data instead of an HMAC.
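
    The gap between the two search spaces can be put in bits. The helper names below are illustrative, and the calculation assumes passwords chosen uniformly at random, which is optimistic for real users.

```cpp
#include <cassert>
#include <cmath>

// Bits of work needed to enumerate every password of the given length.
double passwordSearchBits(double alphabetSize, int length) {
    assert(alphabetSize > 1 && length > 0);
    return length * std::log2(alphabetSize);
}

// Shortest password length whose search space reaches `bits` bits.
int lengthForBits(double alphabetSize, double bits) {
    assert(alphabetSize > 1 && bits > 0);
    return static_cast<int>(std::ceil(bits / std::log2(alphabetSize)));
}
```

    An 8-character password over a 94-character alphabet gives passwordSearchBits(94, 8) of about 52.4 bits, against 256 bits for the AES key. Matching the key would take a uniformly random password of lengthForBits(94, 256) = 40 characters.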

    Approximate brute-force algorithm:

    1. Take the next candidate password and hash it together with the salt (GPU).
    2. Try to decrypt the key (AES-NI).
    3. Check the sum of the bytes, which immediately weeds out almost all wrong passwords.
    4. Derive a candidate data-decryption key with another SHA-512 call (GPU).
    5. Try to decrypt the first data block (AES-NI).
    6. To avoid wasting time on a full decryption and another SHA-256, speed up the search by checking the first byte, which holds the padding length, just like they do:

    if (padding < kMinPadding
    		|| padding > kMaxPadding
    		|| padding > decrypted.size()) {
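
    The shape of such a search loop, with its cheap early rejections, might look like the sketch below. It is only a skeleton: the Stages hooks are hypothetical placeholders for the real SHA-512 and AES operations, firstSurvivor and plausiblePadding are made-up names, and 32 is the minimum padding taken from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical hooks; real ones would wrap SHA-512 and AES.
struct Stages {
    std::function<Bytes(const std::string&)> hashPassword;    // step 1
    std::function<Bytes(const Bytes&)> decryptSecret;         // step 2
    std::function<bool(const Bytes&)> sumCheck;               // step 3
    std::function<std::uint8_t(const Bytes&)> firstDataByte;  // steps 4-5
};

// Step 6: cheap test on the padding-length byte. A byte cannot
// exceed the 255 maximum, so only the lower bound and size matter.
bool plausiblePadding(std::uint8_t padding, std::size_t decryptedSize) {
    return padding >= 32 && padding <= decryptedSize;
}

// Index of the first candidate that survives every cheap filter, or -1.
int firstSurvivor(const std::vector<std::string>& candidates,
                  const Stages& stages, std::size_t decryptedSize) {
    assert(stages.hashPassword && stages.decryptSecret
        && stages.sumCheck && stages.firstDataByte);
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const Bytes secret =
            stages.decryptSecret(stages.hashPassword(candidates[i]));
        if (!stages.sumCheck(secret)) continue;  // rejects most guesses
        if (!plausiblePadding(stages.firstDataByte(secret), decryptedSize))
            continue;
        return static_cast<int>(i);  // worth a full decryption and hash check
    }
    return -1;
}
```

    In this arrangement the byte-sum check does most of the filtering, and the padding check is one more inexpensive pass before the full decryption and SHA-256 comparison are spent on a candidate.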
    

    So we see that the encryption of personal data depends critically on the strength of the password. Every stage of the search is well accelerated by hardware, either by the GPU or by the AES-NI instructions. Of course, you can set a long, strong password and hope that it holds. But what do you think: what percentage of the two hundred million Telegram users will choose a password longer than eight characters?

    Add to this the dubious techniques for generating keys and validating decrypted data, which ignore standard, proven mechanisms and directly violate the principle of "Don't roll your own crypto", and it becomes clear that this is a homemade contraption, and one that does not smell good.

    By the way, the lack of a digital signature lets Telegram not only obtain users' personal data but also substitute someone else's, a terrorist's for example.

    Real End-to-End


    It is called E2E because it lets you expose the encrypted data to third parties without fearing for its safety. As we have seen, Telegram's new product does not meet this condition.

    But if, for example, you correctly encrypt the data with a public key rather than a password hash, even a billion-dollar cluster will not get anywhere near it. Take a look at Signal and the messengers built on its protocol (WhatsApp and others). The whole world has long been successfully using modern asymmetric cryptography, algorithms that hinder password brute-forcing, and strong standard cryptographic constructions.

    For years there have also been much more serious password-based data protection systems that do not even let a brute-force attack begin, because the attacker never has enough data for it.

    But Telegram went its own special way, reinventing crypto primitives and weakening protection. Well, the money has been poured in, so why worry about the consequences.

    P.S. If you want to see how a chat with real E2E works, VirgilSecurity has a demo project that you can download and play with.
