
SendGrid holds no sway over the Chinese

Today I would like to tell a story about using SendGrid. While investigating it I had to talk to their support service. The problem, on the whole, remains unsolved, but I am still working on it and hope that one of the readers will suggest a good solution. For those who are only planning to use services that advertise themselves as transactional email delivery (SendGrid, MailGun, Mandrill), I hope this post will show which problems they help to solve and which they do not, and whether using such services makes sense at all.
Since last year we have been developing and supporting a project management system with a team of developers located in America, Australia, Bulgaria and Ukraine. Notifications are sent through SendGrid. The kinds of notifications such a system sends are fairly obvious: registration, email confirmation, password recovery. Most of them, however, are notifications about changes made by other users: a new comment, a new task and so on. I should say right away that we already had some experience with SendGrid. We began to use it when we added functional monitoring capabilities to Nerrvana, but the amount of mail we send there is incomparably smaller than what the project management system sends, so this was the first time we ran into problems with it.
The customer is located in China and, ironically, is a leading email marketing company. Its domain is in the .asia zone. Twenty employees of this company are registered in our project management system. Some of them complained that they were not receiving notifications, so I went into the SendGrid interface and started my investigation.
Here is what I saw (screenshot):

It turned out that some users receive notifications and others do not. For the notifications that were not received, SendGrid shows “Drop” with the reason “Bounce”. This was very strange: how do these users differ from the others as far as the mail server of this domain is concerned? The concept of “Bounce” was new to me, so I decided to first find out what it means. If it is a generally accepted standard, I would study the standard; if not, I would read what meaning SendGrid gives it.
It turned out that “Bounce” means that the mail server accepted the mail but could not deliver it. It remained to find out why these bounces occur, so I contacted SendGrid support and asked why this was happening and how the two types of Bounce that I had found on sendgrid.com/logs/index while playing with the filters differ:

In response, I received a link to the documentation page, sendgrid.com/docs/Delivery_Metrics/index.html, and also learned that SendGrid divides bounces into soft and hard. They also pointed me to sendgrid.com/bounces, which I had not reached yet. There you can see when an address was added to the Bounce list, and you can delete addresses from the list. This is where I first thought that there must be an automatic way to do this, since at our volumes it would be unrealistic to review, analyze and clean the lists manually. I was told that SendGrid does not send any further letters to addresses on this list until we remove them from it ourselves. “Gee!” I thought, in rather less printable terms, and wrote to the support service again. I had a lot of questions, although SendGrid could surely have described all of this in detail in the documentation. Judging by CrunchBase, they are hardly short of funds.
In my opinion, it would be quite logical for the activity log to say something like this:
- this attempt to send to the addressee “bruce.lee@our_client_domain.asia” got the response “such-and-such” -> the address has been put on the “Hard bounce” list
- this attempt to send to the addressee “bruce.lee@our_client_domain.asia” was ignored with the status “Drop”, since the address is already on the “Hard bounce” list. To view all the attempts that were dropped and the original server response, go to sendgrid.com/bounces/bruce.lee@our_client_domain.asia .
Then everything is simple and clear: you can see why, when and where an address ended up and how long it has been there. You can see how soft and hard bounces are labeled and how the consequence of landing on one of the bad lists is shown in the interface. A link appears between the activity log and the invalid, bounce and spam lists available on the Email Reports page. In every case there is a root cause and there are consequences, so show them in a humane way! The questions I had would simply not arise.
To keep addresses from landing on the list, I was advised to add the domains to the Address Whitelist app, which is available at our subscription level (Silver), but this is not really an option either. Continuing to send mail to a provider that has blacklisted you, without finding out why, was not an approach we were happy with.
Then it got even more interesting. In the reports I found the root cause: “550 Connection frequency limited”. I knew from the client that their ESP is the largest ISP in China, QQ.com, and also that the client had added our domain to their white list, which did not help. The client threatened to move to Basecamp (which, of course, is unpleasant). The client then told us that QQ limits the amount of mail each user may receive. This explained why some users received mail from us while others did not.
The client explained that QQ does not allow small ESPs (in this case, us) to send large amounts of mail to its users. A reasonable question arose: who is the ESP here, we or SendGrid? It turned out that we are, and that this is entirely our problem. QQ has established, for example, that all senders (except those it considers large) can send no more than 10 letters per day to each user. Apparently, as soon as one of our users has received the ten letters QQ allots, we get the [550 Connection frequency limited] “error”, as Masa, the servant of Erast Fandorin, would say, and we wait for a new day to begin in China to send another ten. On top of this, the address lands on the SendGrid bounce list, and sending to this user stops until we remove the address from the list (and we know it will land there regularly).
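The per-user cap described above can be mirrored on the sending side so that the eleventh letter is held back instead of bouncing. This is only a minimal sketch: the figure of 10 per day comes from our client, not from QQ documentation, and the `DailyQuota` class and its method names are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Cap reported by the client; an assumption, not an official QQ figure.
DAILY_LIMIT = 10


class DailyQuota:
    """Counts messages per recipient per calendar day (hypothetical helper)."""

    def __init__(self, limit: int = DAILY_LIMIT) -> None:
        self.limit = limit
        self._sent = defaultdict(int)  # (recipient, day) -> messages counted

    def try_send(self, recipient: str, day: date) -> bool:
        """Return True and count the message if the recipient is under the cap."""
        key = (recipient.lower(), day)
        if self._sent[key] >= self.limit:
            return False  # caller should queue or batch the message instead
        self._sent[key] += 1
        return True
```

A caller would ask `quota.try_send(address, today)` before handing a notification to SendGrid and, on `False`, keep the notification for a later digest. Which calendar day counts as "today" for QQ is also an assumption the caller has to make.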
By the way, if you search Google for the string “550 Connection frequency limited”, you will immediately see that all the links either mention QQ.com or are pages of QQ.com itself. In other words, this is a known issue. One is tempted to say: "I know my darling by his gait, and QQ by its Connection frequency limited."

Why does SendGrid know nothing about this, and why does it not warn you: “guys, you are sending mail to QQ, damn it, keep in mind that ...”? Why does SendGrid, being a major player in this market, not agree with QQ that QQ will accept mail from SendGrid's clients without restrictions, or at least take on the role of an intermediary?
Next, the client from China advised us to predict (?!) the amount of mail sent to all of our clients served by QQ, so as not to send too many letters to the same users. How do you imagine that? I cannot. The alternative was to ask SendGrid for help, which we did.
SendGrid replied that, unfortunately, the QQ page is in Chinese (as if they were hearing about this problem from us for the first time and were still unaware that online translators exist). They also said that we ourselves need to contact QQ, send them our outgoing IP address and ask them to lift the restrictions on incoming mail from us. SendGrid also suggested buying additional IP addresses at $20 per month so that some of the mail would be sent from them. A “good” solution, but what is the probability that QQ blocks by IP at all (maybe they block by domain)? That was the end of the matter.
I wrote to QQ, but nobody answered and nothing changed on their side. I added this client's domain to the white list in SendGrid so that bounce errors do not put the recipient's address on the stop lists. As you can guess, this did not solve the problem. We simply keep trying to send mail to QQ users after receiving "550 Connection frequency limited", in the hope that at least something gets through once QQ resets the counter for that user.
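Instead of retrying blindly, the retry can be postponed until the counter has presumably been reset. The sketch below assumes that QQ resets its per-user counter at midnight China time (UTC+8); that is our guess from the behaviour described above, not something QQ confirmed. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

CHINA_TZ = timezone(timedelta(hours=8))  # China Standard Time, no DST
RATE_LIMIT_MARKER = "connection frequency limited"


def is_rate_limited(reason: str) -> bool:
    """True if a bounce reason looks like the QQ frequency limit."""
    return RATE_LIMIT_MARKER in reason.lower()


def next_china_midnight(now_utc: datetime) -> datetime:
    """Return the next 00:00 China time, expressed as an aware UTC datetime.

    Assumes the counter resets at local midnight, which is unverified.
    """
    local = now_utc.astimezone(CHINA_TZ)
    next_day = (local + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return next_day.astimezone(timezone.utc)
```

A sender could call `is_rate_limited(reason)` on each bounce and, when it returns `True`, schedule the next attempt for `next_china_midnight(datetime.now(timezone.utc))` rather than treating the address as dead.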
This and other cases led us to the decision to store the delivery status of our mail for analysis. It became clear that you need to collect information about the status of sent letters in your own database, and only then return to the question of automatically resolving mail problems yourself.
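What such a status store might look like is sketched below with SQLite from the Python standard library. The table layout, column names and function names are all hypothetical; this is an illustration of the idea, not the schema we actually use.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small store for delivery events."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS mail_events (
               email  TEXT NOT NULL,
               event  TEXT NOT NULL,     -- e.g. 'delivered', 'bounce', 'dropped'
               reason TEXT,              -- server response, if any
               ts     INTEGER NOT NULL   -- Unix timestamp of the event
           )"""
    )
    return conn


def record_event(conn, email, event, reason, ts):
    """Store one delivery event; addresses are normalized to lower case."""
    conn.execute(
        "INSERT INTO mail_events (email, event, reason, ts) VALUES (?, ?, ?, ?)",
        (email.lower(), event, reason, ts),
    )
    conn.commit()


def bounce_counts(conn):
    """Return (email, count) pairs for addresses with bounce or dropped events."""
    rows = conn.execute(
        "SELECT email, COUNT(*) FROM mail_events "
        "WHERE event IN ('bounce', 'dropped') "
        "GROUP BY email ORDER BY email"
    )
    return rows.fetchall()
```

With the events kept locally, questions such as "which addresses bounce every day" become a single query instead of a manual pass through the SendGrid interface.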
What conclusions can be drawn from the situation described above:
1) SendGrid does not protect you from being blocked by the recipient's mail server. If your customers complain that they are not receiving notifications, check whether their ESP has restrictions like QQ's.
2) SendGrid staff do not read Chinese.
3) When sending letters via SendGrid, you must use the Event Webhook or regularly check your account in the service in order to react quickly when your users are placed on the Bounce list.
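As an illustration of the third point, here is a minimal sketch of processing an Event Webhook payload. It assumes the payload is a JSON array of event objects with `email`, `event` and `reason` fields, which matches my understanding of the SendGrid Event Webhook but should be checked against the current documentation. The function name is hypothetical, and the HTTP endpoint that receives the POST is left out.

```python
import json

# Event types that mean a recipient did not get the letter (assumed names).
PROBLEM_EVENTS = {"bounce", "dropped"}


def problem_recipients(payload: str) -> dict:
    """Map each recipient with a bounce/dropped event to the last reason seen.

    `payload` is the raw request body, assumed to be a JSON array of events.
    """
    problems = {}
    for ev in json.loads(payload):
        if ev.get("event") in PROBLEM_EVENTS and "email" in ev:
            # Later events in the payload overwrite earlier ones.
            problems[ev["email"].lower()] = ev.get("reason", "")
    return problems
```

The resulting dictionary can be used to alert the team or to decide which addresses to review on the bounce list, instead of waiting for a customer to complain.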