Waging war against spam

Original author: Dick Craddock
Every mail system has to deal with spam, and Hotmail is no exception. Our weapon against it - the SmartScreen spam filter - is one of the most effective in the industry. This post gives you an idea of how we use SmartScreen to fight spam, and how you can help us do it.

Spam is war



Why do people send spam? It's simple: it makes money. Spam is a big, very big business. Most of it is illegal, but that does not stop people from sending it.

Trust me, spam is a battle. Spammers are very smart, they will never disappear. They are constantly inventing more and more new ways to use our mail for their own purposes. But we do not give up.

Various studies, including Symantec's monthly spam reports, show that more than 90% of all email sent over the Internet is spam. As a result, most of the mail sent to Hotmail, as to other email providers, is spam. With 350 million active users, Hotmail is a major target, and we receive several billion spam messages per day.

But Hotmail removes 98% of all spam before it can reach your inbox. Let's talk about how we achieve this and how we strive to improve it.

Know the enemy


First, some definitions. Spam is the common term for unsolicited commercial email sent to a large number of recipients without legal justification. Nobody wants to receive it.

Note that not everything people call spam is actually spam. For example, you may receive newsletters or purchase offers because you registered on a perfectly legal and honest site. You may or may not want to see them in your inbox, but these messages are legitimate: you subscribed to them yourself! We call such messages gray mail, because it is not clear whether you want to see them or not; they are neither “white” nor “black”, hence the name. (Translator's note, added after a comment: apparently this is why the word “spam” does not appear anywhere in the user interfaces of Microsoft mail programs and services, which say Junk and Junk mail instead.)

Our goal is to eliminate spam as much as possible. But we must avoid mistakenly marking good emails as spam. We call this type of error a false positive.

So the real trick is to eliminate as much spam as possible while keeping the number of false positives as low as possible, ideally at zero. In a sense, these two goals pull against each other.

All this in numbers



The generally accepted engineering wisdom says: what we cannot measure, we cannot improve.

In Hotmail, we track a few closely related metrics. Every day we monitor SITI (“spam in the inbox”), and we also monitor the share of SITI that is true spam, that is, excluding gray mail. We also monitor how often we make the mistake of placing legitimate messages in the Junk Mail folder.
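The three metrics above can be sketched as simple ratios. This is only an illustration: the post does not give exact formulas, so the definitions, the `DailyCounts` fields, the name `true_siti` for the gray-excluded figure, and all the numbers are assumptions.

```python
from dataclasses import dataclass


@dataclass
class DailyCounts:
    """Hypothetical daily tallies; field names are illustrative only."""
    inbox_total: int      # all messages delivered to inboxes
    inbox_true_spam: int  # inbox messages that are real spam
    inbox_gray: int       # inbox messages that are legitimate but unwanted
    good_total: int       # all legitimate messages received
    good_in_junk: int     # legitimate messages wrongly placed in Junk


def siti(c: DailyCounts) -> float:
    """Assumed definition: share of inbox mail seen as spam, gray mail included."""
    return (c.inbox_true_spam + c.inbox_gray) / c.inbox_total


def true_siti(c: DailyCounts) -> float:
    """Assumed definition: share of inbox mail that is spam, gray mail excluded."""
    return c.inbox_true_spam / c.inbox_total


def false_positive_rate(c: DailyCounts) -> float:
    """Share of legitimate mail that was mistakenly sent to Junk."""
    return c.good_in_junk / c.good_total


# Made-up numbers, purely to show the calculation.
day = DailyCounts(inbox_total=1000, inbox_true_spam=30, inbox_gray=90,
                  good_total=2000, good_in_junk=4)
print(f"SITI: {siti(day):.1%}")
print(f"True SITI: {true_siti(day):.1%}")
print(f"False positives: {false_positive_rate(day):.2%}")
```

Tracking the gray-excluded figure separately matters because the two kinds of unwanted mail call for different fixes: true spam is a filtering failure, while gray mail is a matter of user preference.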

In addition to automated measurements, we use feedback from our customers. If you notice a message in your inbox that you think is spam, you can mark it as junk. Likewise, if you think a perfectly good message has ended up in the Junk folder, you can mark it as not junk or simply drag it out of that folder.

Most messages that users mark as spam, about 75%, are actually gray mail: legitimate messages that users do not want to see in their inbox and therefore mark as junk. A good example of gray mail is a newsletter or notification that you subscribed to while shopping on a site but are not really interested in.

So how are we doing? Let's go back to 2006, when we had some problems with spam. The share of true spam was approaching 35%, which means that every third message in your inbox was spam. Since then, we have made tremendous progress, bringing the share of spam below 5% and keeping it there. The following graph shows the trend of spam over the past few years across the Internet as a whole and in Hotmail. Green triangles on the Hotmail chart mark the introduction of new anti-spam technologies.



You can see that, at a time when the share of spam on the Internet was growing, the investments made in Hotmail really paid off. We are now seeing not only an all-time low in the share of spam, but also our best false positive rate.

SmartScreen: our anti-spam weapon



We have achieved such results by making huge investments in our SmartScreen technology. Let's talk about some of its components.

Connection-time filtering. This is our first “line of defense.” At any given moment, our system has a picture of the reputation of mail senders around the world, as well as the latest trends in message content, drawn from various sources. Sender reputation is mostly tied to an IP address or a range of addresses. Based on this data, we set a limit on the number of messages that a specific sender can deliver to Hotmail. Setting this limit to zero lets us block all mail from that sender. For good senders, we set the limit so that it does not interfere with normal mail delivery while still minimizing the potential for abuse if the sender's computer is compromised. We use several sources to evaluate a sender's reputation:
  • IP addresses of bots. We track individual computers that have been used to send spam. Often these are malware-infected computers that are part of a botnet.
  • Dynamic IPs. We know that computers with dynamically assigned IP addresses should not be sending mail, so we immediately block mail sent from such computers.
  • Known spam entities. We use additional information, such as autonomous system numbers and IP address registration data, to track the address ranges that have been used to send spam.
  • Third-party sources. We have agreements with third parties that let us use the best data available in the industry.
  • Content filters. We pass incoming mail through many filters that analyze its content and can identify a message as spam. This is not as simple as searching for the phrase “watch replica”. Our SmartScreen system uses machine learning to adapt to the trends and techniques used by spammers. The filtering system applies tailored policies, content filters, and reputation based on the class of sender. Filters detect spam with a certain degree of confidence. When we are absolutely sure that a message is spam, we delete it. Otherwise, we put it in the Junk Mail folder. Our content filters remove approximately 1 billion messages per day.
  • Your preferences. You control spam too! You can set black and white lists and rules, which we will use for additional filtering of letters.
  • Time-traveling filters. Yes, you read that right. We can travel in time ... Well ... our filters can. It is actually pretty simple. We cannot always learn about a new spam source the moment it appears. But as soon as we identify a spammer, we can go back and delete the spam before you notice it in your inbox. We call these time-traveling filters because, in a sense, we are able to go back in time and get rid of spam even after we missed it! (Of course, if you have already noticed the spam, we cannot delete it. Otherwise, it would create a time travel paradox that could break our brains.)
  • Malware detection. We check email attachments for known malware and viruses.
  • Tools in the Hotmail UI. Finally, we provide powerful anti-spam tools directly in the mail interface. We display a security bar whenever you read a potentially dangerous email. Links and images are disabled by default for unknown and untrusted senders to protect you from bad links and web beacons. You can help us by marking bad emails as junk or by dragging them to the Junk Mail folder, and you can reduce the number of false positives by moving good messages out of the Junk Mail folder. Whenever you mark a message as “junk” or “not junk”, our system becomes smarter.
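To make the flow through these components concrete, here is a minimal sketch of three of them: the connection-time limit, the content-filter verdict, and the time-traveling cleanup. It is not SmartScreen's implementation. The `Message` class, the function names, the score scale, and both thresholds are hypothetical; the post gives none of these details.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    sender_ip: str
    spam_score: float            # hypothetical scale: 0.0 (clean) to 1.0 (certain spam)
    folder: Optional[str] = None
    seen_by_user: bool = False


# Hypothetical thresholds; the post does not state actual values.
DELETE_THRESHOLD = 0.99
JUNK_THRESHOLD = 0.50


def accept_connection(sent_so_far: int, limit: int) -> bool:
    """Connection-time check: a limit of 0 blocks the sender entirely."""
    return sent_so_far < limit


def route(msg: Message) -> str:
    """Content-filter verdict: delete when certain, Junk when suspicious, else Inbox."""
    if msg.spam_score >= DELETE_THRESHOLD:
        msg.folder = "deleted"
    elif msg.spam_score >= JUNK_THRESHOLD:
        msg.folder = "junk"
    else:
        msg.folder = "inbox"
    return msg.folder


def time_travel_cleanup(mailbox: list[Message], bad_ips: set[str]) -> int:
    """Remove already-delivered inbox mail from newly identified spam sources,
    but only messages the user has not seen yet. Returns the number removed."""
    removed = 0
    for msg in mailbox:
        if msg.folder == "inbox" and msg.sender_ip in bad_ips and not msg.seen_by_user:
            msg.folder = "deleted"
            removed += 1
    return removed


# Two messages slip into the inbox; the sender is identified as a spammer later.
unseen = Message("192.0.2.10", spam_score=0.10)
seen = Message("192.0.2.10", spam_score=0.10, seen_by_user=True)
for m in (unseen, seen):
    route(m)
time_travel_cleanup([unseen, seen], bad_ips={"192.0.2.10"})
print(unseen.folder, seen.folder)  # deleted inbox
```

The `seen_by_user` check mirrors the caveat in the time-traveling bullet: mail you have already noticed is left where it is.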


How can you help us?



Our system is only part of the solution. We count on feedback from users in the fight against spam. Here are some ways you can make our system smarter and contribute to the health of the mail ecosystem:
  • Give feedback based on your experience. There are three ways to give feedback. You can mark messages as “Not junk”, “Junk”, or “Phishing”, which makes our filter smarter. By marking messages as “Not junk”, you help us identify false positives so that we do not repeat the mistake in the future.
  • Participate in the feedback program. From time to time, we invite some users to take part in our feedback program. It works as follows: every so often we send you a message and ask you whether it is junk. The way you classify the message feeds into the way our spam filters classify mail. Please agree to participate if you are invited.
  • Do not buy anything from spammers. Only a very small number of people follow the links in spam emails, but spammers make money because of the sheer volume of messages they send. A typical spammer can be very profitable even if only 50 out of a million messages sent get a response from a user.
  • Check your computer for malware. Make sure that your own computer is not a spambot! You can use free antivirus software such as Microsoft Security Essentials (MSE) for this.
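To see why a response rate of 50 per million, as mentioned in the list above, can still pay off, here is a small worked calculation. The function name is illustrative, and the revenue and sending-cost figures are invented for the example; only the 50-per-million rate comes from the text.

```python
def campaign_profit(messages_sent: int,
                    responses_per_million: float,
                    revenue_per_response: float,
                    cost_per_million: float) -> float:
    """Profit of a hypothetical spam campaign under the given assumptions."""
    millions = messages_sent / 1_000_000
    revenue = millions * responses_per_million * revenue_per_response
    cost = millions * cost_per_million
    return revenue - cost


# Invented inputs: 10 million messages, 20.0 earned per response,
# 100.0 spent per million messages sent.
print(campaign_profit(10_000_000, 50, 20.0, 100.0))  # 9000.0
```

Under these made-up figures the campaign breaks even at only 5 responses per million, which is why every purchase from a spammer helps keep the business alive.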


A look into the future



Over the past few years, the Hotmail team has made very important investments in the development of SmartScreen to not only solve spam problems, but also to be the best in the mail service industry.
In the next post, I will talk a little about the problem of gray letters, take a deeper look at the filtering mechanisms, and give some tips to those who are still experiencing problems with spam.

Until then, I hope you will keep using Hotmail and share your thoughts in the comments.
Dick Craddock
Group Program Manager, Windows Live Hotmail
