Gravatar: how to decrypt email addresses of all users?

Many people have heard about Gravatar, and some use it. If you haven't, Gravatar is a globally recognized avatar: an image that is attached to your email address and that you can use on other sites when commenting, filling out a profile, and so on.

The Gravatar service has proved quite popular and over the years has turned into a mini social network with an audience of millions. However, I could not find out how many Gravatar users (that is, people who have attached an avatar to their email address) there actually are. The official website only declares proudly that "Millions of avatars are shown more than 8.6 billion times a day."

The popularity of Gravatar is evidenced by the fact that it is supported by many popular engines, such as Redmine, W-script and, of course, WordPress. There are plugins and modifications that allow you to integrate support for globally recognized avatars in systems such as Drupal, Joomla, MODX, SMF and phpBB.

Gravatar works as follows: the user registers with the Gravatar service and saves his avatar and email address there. From then on, he has his own Gravatar. Now, when he wants to leave a comment on a site or blog, he only enters his email address. The site script hashes this email address and sends the hash to the Gravatar server, which returns the avatar picture.

If we open the source of a page with a Gravatar picture, we will see an avatar address of the form www.gravatar.com/avatar/HASH.

The 32-digit hexadecimal number in this address is the MD5 hash of the email address and, in fact, the only key that identifies the user in the Gravatar system.
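As a minimal sketch of how a site script might turn an email address into such an avatar address, the following mirrors the normalization used in the article's own script further down (trim, lowercase, MD5). The function names are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical helper (the name is not from the article).
function gravatar_hash(string $email): string
{
    // Trim and lowercase first, so " User@Example.com " and "user@example.com" give one hash.
    return md5(strtolower(trim($email)));
}

// Hypothetical helper: builds the avatar address from the hash.
function gravatar_avatar_url(string $email): string
{
    return 'http://www.gravatar.com/avatar/' . gravatar_hash($email);
}

echo gravatar_avatar_url(' User@Example.com '), PHP_EOL;
```

Because the address is normalized before hashing, the same user always maps to the same 32-character key, whatever capitalization he typed into the comment form.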

The MD5 hash algorithm used by the Gravatar service is primarily intended to hide the user's email address (so that it is not sent in plain text). Its key feature is that it is “one-way”: with the MD5 function we can compute the hash (fingerprint) of a word, but there is no function that turns the hash back into the word. That is, no mathematical function will tell us what exactly was hashed into "05933ec7a23f6ebd2017490abfbcd3f3".

However, there are MD5 “decryption” methods, such as dictionary search and rainbow tables, and this raises the question of how safe a Gravatar user's email address is. Admittedly, the user can make his email address public himself, and it is in any case known to the administrator of the site on which he leaves a comment. But back to how well the vulnerable hashing algorithm (MD5) serves the Gravatar service.
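To make the dictionary idea concrete, here is a small sketch: hash every candidate address once, then “reverse” a hash by looking it up in the resulting table. The candidate list and the function name are made up for illustration; a real dictionary would contain millions of addresses.

```php
<?php
// Hypothetical candidate list; a real dictionary would hold millions of addresses.
$candidates = ['alice@example.com', 'Bob@Example.com', 'carol@example.org'];

// Build a lookup table: MD5 hash => normalized address.
$table = [];
foreach ($candidates as $email) {
    $normalized = strtolower(trim($email));
    $table[md5($normalized)] = $normalized;
}

// Hypothetical helper: returns the address for a hash, or null if it is not in the dictionary.
function lookup_email(array $table, string $hash): ?string
{
    return $table[strtolower($hash)] ?? null;
}

var_dump(lookup_email($table, md5('bob@example.com')));  // found in the dictionary
var_dump(lookup_email($table, md5('nobody@example.net'))); // not in the dictionary: null
```

The lookup only succeeds for addresses that were in the dictionary to begin with, which is why the quality of the address list matters more than the hash function.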

How safe is it, and how realistic is it to decrypt... all gravatars?

To answer this question, I decided not to “crack” MD5 hashes (which seemed too time-consuming to me), but to check whether known email addresses are present in the Gravatar database. The principle is extremely simple: we check whether an email address has a gravatar and, if it does, we enter the address and its MD5 hash into the database.

By trial and error, the optimal request to the Gravatar service for this purpose turned out to be the avatar address with the following parameter:

www.gravatar.com/avatar/HASH?d=404

When such an address is requested, the Gravatar service returns a 200 response if the user has a gravatar (if such a user exists at all) and a 404 response if the user is not in the Gravatar database. So we write a script that checks the server response:

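$email = "user@example.com";
// normalize the address (trim, lowercase) before hashing
$hash = md5(strtolower(trim($email)));
$url = 'http://www.gravatar.com/avatar/'.$hash.'?d=404';
$check_url = get_headers($url);
// $check_url[0] is the status line, e.g. "HTTP/1.1 200 OK"; get_headers() returns false on failure
if ($check_url !== false && strpos($check_url[0], '200') !== false) {
    // got a 200 response: such a user exists, so we write his MD5 hash to the database
}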
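The check above searches the status line for the substring "200". A slightly stricter variant, sketched here under the assumption that the first status line is the one we care about, parses the numeric code instead. Both function names are hypothetical, and `has_gravatar()` performs a real HTTP request when called.

```php
<?php
// Hypothetical helper: extracts the status code from a line such as "HTTP/1.1 200 OK".
function status_code(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical wrapper around the article's check; calling it sends a request to Gravatar.
function has_gravatar(string $email): bool
{
    $hash = md5(strtolower(trim($email)));
    $headers = @get_headers('http://www.gravatar.com/avatar/' . $hash . '?d=404');
    return $headers !== false && status_code($headers[0]) === 200;
}

var_dump(status_code('HTTP/1.1 200 OK'));        // int(200)
var_dump(status_code('HTTP/1.1 404 Not Found')); // int(404)
```

One caveat: if the server answers with a redirect, the first status line describes the redirect rather than the final response, so a production version would need to inspect the last status line instead.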

So, we have learned how to check whether an email address has a gravatar. As raw material, I downloaded the first email address databases I could find on the Internet (ordinary spam databases, plus email addresses that show up in search results in clear text), more than 10,000,000 addresses in total (deduplicated, checked for validity, and so on). I installed an ordinary local server (Denver) on an ordinary computer and made the above script multithreaded, reaching a scan speed of about 2 million addresses per day. Surprisingly, despite the enormous number of requests to the Gravatar service, it did not block the script and kept returning data throughout the experiment.
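A quick back-of-the-envelope calculation shows what these figures imply. It assumes a constant scan rate, which the article does not claim; the variable names are illustrative only.

```php
<?php
$addresses = 10000000; // size of the address list, from the article
$perDay    = 2000000;  // reported scan speed, addresses per day

$days      = $addresses / $perDay; // time needed for the whole list
$perSecond = $perDay / 86400;      // average request rate (86400 seconds in a day)

printf("%.1f days at about %.1f requests per second\n", $days, $perSecond);
// prints "5.0 days at about 23.1 requests per second"
```

So the whole list fits into roughly five days of continuous scanning, at a sustained rate of a couple of dozen requests per second.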

Within a week, all 10 million addresses were checked, and the results were recorded in a database with the following structure:

email address (which has a gravatar)
MD5 hash of this email address
Gravatar user login
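The article does not say which database engine or schema was used. As one possible sketch, the three fields could be stored as follows, using an in-memory SQLite database through PDO (this assumes the pdo_sqlite extension is available). The table and column names are hypothetical.

```php
<?php
// In-memory SQLite database, used only so that the sketch is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Table and column names are hypothetical; the article only lists the three fields.
$db->exec('CREATE TABLE gravatars (
    hash  CHAR(32) PRIMARY KEY,  -- MD5 hash of the email address
    email TEXT NOT NULL,         -- email address that has a gravatar
    login TEXT                   -- Gravatar user login, if one was found
)');

// Store one made-up record.
$email  = 'user@example.com';
$insert = $db->prepare('INSERT INTO gravatars (hash, email, login) VALUES (?, ?, ?)');
$insert->execute([md5($email), $email, 'exampleuser']);

// Read the record back by its hash.
$find = $db->prepare('SELECT email, login FROM gravatars WHERE hash = ?');
$find->execute([md5($email)]);
$row = $find->fetch(PDO::FETCH_ASSOC);
print_r($row);
```

Using the hash as the primary key keeps each address unique and makes a lookup by hash a single indexed query.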

The Gravatar user's login is needed to build a link to the user's profile in the service, where additional information about the user can be found. The URL has the following structure:

www.gravatar.com/LOGIN

You can get the login by requesting the profile data file of the form:

www.gravatar.com/HASH.php

We'll write a script that finds the variable we need, called preferredUsername:

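$email = "user@example.com";
$hash = md5(strtolower(trim($email)));
$login = null;
$str = file_get_contents('http://www.gravatar.com/'.$hash.'.php');
if ($str !== false) {
    // the profile is a serialized PHP array; do not instantiate objects from remote data
    $profile = unserialize($str, ['allowed_classes' => false]);
    if (is_array($profile) && isset($profile['entry'][0]['preferredUsername'])) {
        $login = $profile['entry'][0]['preferredUsername'];
    }
}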
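The extraction step can be tried without any network access. The sketch below runs it against a made-up serialized array that has the same shape the script above expects (`entry`, index 0, `preferredUsername`). The function name and the sample data are hypothetical, and the real profile contains more fields than shown here.

```php
<?php
// Hypothetical helper: pulls the login out of a serialized profile string.
function extract_login(string $serialized): ?string
{
    // allowed_classes => false prevents object instantiation from untrusted input.
    $profile = @unserialize($serialized, ['allowed_classes' => false]);
    if (!is_array($profile)) {
        return null;
    }
    $login = $profile['entry'][0]['preferredUsername'] ?? null;
    return is_string($login) ? $login : null;
}

// Made-up sample with the same shape the article's code expects.
$sample = serialize(['entry' => [['preferredUsername' => 'exampleuser']]]);

var_dump(extract_login($sample));          // the login from the sample
var_dump(extract_login('not a profile')); // null: input is not a serialized array
```

Returning null for anything unexpected keeps a scanning loop running when a profile is missing or malformed.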

The variables ($email, $hash and $login) were obtained and entered into the database for later searching. Is that all? In fact, yes. Brevity is the soul of wit. We attach a search form to the database and, voila, the service is ready. Now, when we enter an MD5 hash into the search bar (it can be taken from any site where the user has left a comment), we get his email address. For convenience, I implemented drag and drop: just drag the gravatar picture from any site into the search box and click “find”.
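The article does not show how the search box handles a dropped picture. One plausible approach, sketched here as an assumption, is to take the image address and pull the 32-character hash out of it before querying the database. The function name is hypothetical.

```php
<?php
// Hypothetical helper: returns the 32-character hash from an avatar address, or null.
function hash_from_avatar_url(string $url): ?string
{
    // Match "gravatar.com/avatar/" followed by exactly 32 hexadecimal digits.
    if (preg_match('#gravatar\.com/avatar/([0-9a-f]{32})(?![0-9a-f])#i', $url, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

$url = 'http://www.gravatar.com/avatar/05933ec7a23f6ebd2017490abfbcd3f3?d=404';
var_dump(hash_from_avatar_url($url));                   // the hash from the address
var_dump(hash_from_avatar_url('http://example.com/'));  // null: not an avatar address
```

The extracted hash can then be used directly as the search key, since it is exactly the value stored alongside each email address.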

Explanations: the experiment and the service were not created for any malicious purpose (spam, etc.). Also, no one can be held responsible for the safety of a Gravatar user's email address: the user is aware that it is visible to the administrator of the site on which the comment is left. To protect personal data, I limited the search results, kept the data out of search engine indexes, and so on. The service, which emerged impromptu from the experiment, was made for people to use for reference and contact purposes, and also as food for thought for the owners of the Gravatar service.

Summary: in a week, an ordinary computer checked 10 million email addresses (taken from open sources). Only 3% of them (about 300,000 recognized MD5 hashes) turned out to have their own Gravatar, which is not a lot. But theoretically, the email addresses of all Internet users can be collected into a single database and then checked using the described method. And so, also theoretically, all MD5 hashes of the Gravatar service can be recovered. "All" is much more than the 10% that can be obtained by brute-forcing the MD5 hashes. The way Gravatar hashes email addresses is vulnerable, and you need to keep this in mind when using the service.
