MetaPhone: The Importance of Telephone Metadata
How sensitive is telephone metadata? Debate over this question flared up again after Edward Snowden's revelations last year. The government is considering a variety of restrictions on access to such information, and the US Federal Communications Commission (FCC) has also taken up the question of privacy.
President Obama emphasized that the NSA "did not delve into the content of the conversation." "Only metadata was used," Senator Feinstein told reporters. In rejecting the American Civil Liberties Union (ACLU) lawsuit, Judge Pauley dismissed the feared consequences as a "parade of horribles."
On the other hand, many scientists and IT professionals have voiced concern about the risks of disclosing metadata. Ed Felten, in a statement for the ACLU, explained why in detail: "Telephone metadata can help fully reveal the user's identity, both at the level of individual calls and (especially!) in the aggregate." Judge Leon, finding that the NSA surveillance program is likely unconstitutional, agreed with Felten's point of view and noted that "metadata from a person's phone can provide information on his marital status, political and religious views, and sexual preferences."
So there is a gap between two opposing points of view. Is it easy to extract important personal information from metadata? Do people often entrust their phones with extremely personal information that can then be recovered from metadata?
We used data from various sources to find empirical answers to these questions. Since last November, we have been conducting a study of the privacy of telephone metadata. Participants in the experiment run the MetaPhone application on their Android smartphones. It collects device logs and other social information, which is then sent to us for analysis. Using the data received through MetaPhone, we were able to correctly infer relationship status, examine how interconnected call graphs are, and evaluate how identifiable telephone numbers are.
At the beginning of this study, we shared the view of our colleagues in IT: telephone metadata can reveal very important and sensitive information about a person. However, we did not expect to find conclusive evidence either way, since MetaPhone had relatively few users and we planned to monitor telephone activity for only a few months.
We were very wrong. We found that phone metadata contains extremely sensitive information, and that it can be extracted even when a phone is tracked for only a short period. We were able to learn about participants' health conditions and firearm ownership, all from metadata alone.
The first step was to identify the contacts of MetaPhone users. To do this, we matched phone numbers against public data from Google and Yelp. In total, the 546 participants in our experiment contacted 33,688 telephone numbers. We were able to identify the owners of 6,107 of those numbers (18%).
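The matching step can be illustrated with a minimal sketch. It assumes a small in-memory directory standing in for public Google and Yelp listings; the names `normalize`, `identify`, and the sample numbers are hypothetical and are not taken from the MetaPhone code.

```python
def normalize(number: str) -> str:
    """Keep digits only and drop a leading US country code so formats compare equal."""
    digits = "".join(ch for ch in number if ch.isdigit())
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def identify(contacted, directory):
    """Return {normalized number: listing name} for contacted numbers found in the directory."""
    lookup = {normalize(num): name for num, name in directory.items()}
    matches = {}
    for raw in contacted:
        key = normalize(raw)
        if key in lookup:
            matches[key] = lookup[key]
    return matches


# Hypothetical listings (assumption: stands in for public Google/Yelp data).
directory = {
    "(650) 555-0100": "Example Pharmacy",
    "+1 650 555 0123": "Example Gun Shop",
}
contacted = ["650-555-0100", "1-650-555-0123", "650-555-0199"]

matches = identify(contacted, directory)
rate = len(matches) / len({normalize(n) for n in contacted})

# Identification rate from the figures reported in the text.
reported_rate = 6107 / 33688
```

With the figures reported above, `reported_rate` rounds to the 18% given in the text.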
Then we flagged contacts that were likely to be sensitive. In most cases we could find, for example, the name of the business a person contacted, which usually made clear what the business did. When the name alone did not reveal the type of business, a Google search did.
In the end, we collected two groups of results. First, we analyzed individual calls to sensitive numbers. Second, we examined call patterns to see what metadata reveals about a caller's personal life.
Individual Call Analysis Results
Many organizations have a narrow purpose, so a call to one of them is itself quite sensitive information. If a person calls a candidate's campaign headquarters, for example, it is highly likely that the person supports that candidate. Likewise, if a person often talks with someone at a number assigned to a religious organization, his religion becomes apparent. You can even find out which particular church he attends.
We collected information on a large number of calls that support exactly such inferences. The table below shows the share of experiment participants who made at least one phone call to "sensitive" organizations:
Information about religious organizations gave us an opportunity to verify the accuracy of our inferences. MetaPhone takes the user's religion directly from his Facebook profile, which lets us (when the religion is stated in the profile) compare inferences made from phone metadata against the stated data from Facebook. We had 15 people who both stated a religion in their profile (including atheism) and had telephone contact with religious organizations. Assuming that the religious organization a person calls most often reflects his religion, we correctly determined the religion of 11 of these 15 participants (73% accuracy).
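The inference rule described above (take the most frequently called religious organization as the guess, then compare with the stated religion) can be sketched as follows. The function names and the sample data are hypothetical; only the 11-of-15 figure comes from the study.

```python
from collections import Counter


def infer_religion(called_numbers, org_religion):
    """Guess a religion from the most frequently called religious organization.

    called_numbers: list of numbers a participant called (one entry per call).
    org_religion:   mapping from an organization's number to its religion label.
    Returns None when the participant called no known religious organization.
    """
    counts = Counter(n for n in called_numbers if n in org_religion)
    if not counts:
        return None
    top_number, _ = counts.most_common(1)[0]
    return org_religion[top_number]


def accuracy(predicted, stated):
    """Share of correct guesses among participants with both a guess and a stated religion."""
    compared = [p for p in stated if predicted.get(p) is not None]
    correct = sum(predicted[p] == stated[p] for p in compared)
    return correct / len(compared)


# Hypothetical sample data, for illustration only.
org_religion = {"5550001": "Catholic", "5550002": "Buddhist"}
calls = {
    "p1": ["5550001", "5550001", "5550002", "5550999"],
    "p2": ["5550002"],
    "p3": ["5550999"],
}
stated = {"p1": "Catholic", "p2": "Catholic"}

predicted = {p: infer_religion(c, org_religion) for p, c in calls.items()}
sample_accuracy = accuracy(predicted, stated)

# Accuracy from the figures reported in the text: 11 correct out of 15.
reported_accuracy = 11 / 15
```

Participants with no calls to a known religious organization get no guess and are left out of the accuracy figure, which mirrors how the 15-person comparison group is defined in the text.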
Many phone numbers could be associated with specialized products and services, and we could even determine a specific line of business. In medicine, for example, we were able to sort phone numbers into categories by the conditions treated at a particular institution.
The sensitivity of the data that can be obtained from a user's contacts caught us by surprise. Our participants called Alcoholics Anonymous, gun stores, abortion-rights organizations, trade unions, divorce lawyers, sexually transmitted disease clinics, and strip clubs, and this is not a complete list. This is not a hypothetical "parade of horribles" but simple information about phone owners that can be obtained easily and on an industrial scale.
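The per-category figure used in this section, the share of participants who made at least one call to a given type of organization, is simple to compute once numbers are labeled. The sketch below assumes call records as (participant, number) pairs and a hypothetical number-to-category mapping; none of the names or values come from the study.

```python
from collections import defaultdict


def category_shares(calls, number_category, n_participants):
    """Share of participants with at least one call to each category.

    calls:           iterable of (participant_id, called_number) pairs.
    number_category: mapping from a number to its category label.
    n_participants:  total number of participants in the study.
    """
    callers = defaultdict(set)
    for participant, number in calls:
        category = number_category.get(number)
        if category is not None:
            # A set counts each participant once, however many calls they made.
            callers[category].add(participant)
    return {cat: len(people) / n_participants for cat, people in callers.items()}


# Hypothetical sample data, for illustration only.
number_category = {"A": "pharmacy", "B": "firearms"}
calls = [("p1", "A"), ("p1", "A"), ("p2", "A"), ("p2", "B"), ("p3", "Z")]

shares = category_shares(calls, number_category, n_participants=4)
```

Counting participants rather than calls keeps one heavy caller from inflating a category.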
Call Pattern Matching Results
Call patterns often reveal much more than a list of the numbers a person called. In the course of our study, we identified call patterns that support highly accurate inferences about sensitive matters. The examples below were drawn from our data set by identifying telephone numbers through public sources. Although most MetaPhone users gave permission to disclose their identity, we use pseudonyms here.
- Participant A called various neurological clinics, specialized pharmacies and a hotline dedicated to the treatment of multiple sclerosis.
- Participant B spoke several times with cardiologists at a large medical center, had brief conversations with medical laboratory staff, received calls from a pharmacy, and also called the hotline for monitoring cardiac arrhythmia.
- Participant C called a firearm store that specializes in products based on the AR semi-automatic rifle platform several times. We also tracked a call to the technical support line of a manufacturer of these weapons.
- Over three weeks, participant D contacted a home improvement store, a manufacturer of locksmith products, a hydroponics dealer, and a store selling tobacco products and smoking mixtures.
- Participant E had a long conversation with her sister in the early morning. Two days later, she called a family planning organization several times. Two weeks later she called again several times, and she made a final call a month after that.
We were able to confirm participant B's diagnosis and participant C's firearm ownership using information from public sources. Because of the sensitivity of the information, we did not attempt to contact participants A, D, and E for confirmation.
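The text does not say how these patterns were found, so the following is only an illustration of one way to flag them automatically: check whether a participant's calls cover a given set of categories within a time window. The function name, the category labels, and the dates are all hypothetical.

```python
from datetime import date, timedelta


def matches_pattern(calls, required, window_days):
    """Return True if every category in `required` is called within some window.

    calls:       list of (call_date, category) pairs for one participant.
    required:    set of categories that together make up the pattern.
    window_days: maximum span, in days, between the first and last matching call.
    """
    relevant = sorted((d, c) for d, c in calls if c in required)
    for i, (start, _) in enumerate(relevant):
        end = start + timedelta(days=window_days)
        seen = {c for d, c in relevant[i:] if d <= end}
        if seen >= required:
            return True
    return False


# Hypothetical call log loosely modeled on the participant D example.
calls_d = [
    (date(2014, 1, 2), "home improvement"),
    (date(2014, 1, 9), "locksmith"),
    (date(2014, 1, 15), "hydroponics"),
    (date(2014, 1, 20), "head shop"),
    (date(2014, 1, 21), "restaurant"),
]
pattern = {"home improvement", "locksmith", "hydroponics", "head shop"}

within_three_weeks = matches_pattern(calls_d, pattern, window_days=21)
within_ten_days = matches_pattern(calls_d, pattern, window_days=10)
```

In this sample, the four categories all fall within three weeks but not within ten days, so only the wider window flags the pattern.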
The data set that we analyzed in this report covered hundreds of users over several months. The NSA and telecom operators have information about millions of people over many years. One can debate the need to restrict access to such information, but one thing is certain: metadata can reveal very important and sensitive information about a person.