
Fighting robots: explanations
After my notes about the beginning of my struggle with robots, many readers suggested standard tools such as captcha or input fields hidden with CSS. There were also less conventional ideas: image recognition (is the picture a cat or a dog?), analysis of mouse movement on the page before a comment is submitted, or deliberately confusing field names.
Some of these approaches I like, some I don't. For instance, despite how effectively captcha filters spam, its use seems inappropriate to me, and the same goes for image recognition. Judge for yourself: the user has to prove over and over that they are not a camel. I would be offended in their place.
Other ways to draw the line between robots and living people, such as analyzing mouse movement on the page before a comment is submitted, seem overly complicated to me. You can get the same result simply by checking whether the client supports JavaScript. But either way, users with JavaScript disabled are left overboard, and that is not good, however few of them there are. Also keep in mind that emulating client-side scripts is very easy, so such checks are not suitable for serious protection.
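To make the JavaScript check concrete, here is a minimal sketch, assuming a hidden field that a small inline script fills in and that the server then verifies. The field name js_token and its expected value are my own illustrative assumptions, not something from the original post; note how little a bot would need to do to fake it, which is exactly the weakness mentioned above.

```python
# Minimal sketch of a "does the client run JavaScript?" check.
# The field name "js_token" and its value are illustrative assumptions.

FORM_SNIPPET = """
<form method="post" action="/comment">
  <textarea name="comment"></textarea>
  <input type="hidden" name="js_token" value="">
  <script>
    // Runs only in a real browser with JavaScript enabled.
    document.querySelector('input[name="js_token"]').value = 'js-enabled';
  </script>
  <button type="submit">Send</button>
</form>
"""

def passes_js_check(form_data: dict) -> bool:
    """Accept the submission only if the script actually ran on the client."""
    return form_data.get("js_token") == "js-enabled"

# A bot that blindly POSTs the raw form fields fails the check...
print(passes_js_check({"comment": "Buy cheap stuff!", "js_token": ""}))        # False
# ...but a bot that simply copies the expected token passes, which is why
# this is not a serious defense (and users without JavaScript fail it too).
print(passes_js_check({"comment": "spam", "js_token": "js-enabled"}))          # True
```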
The methods I consider most successful are the ones invisible to an ordinary user. In small projects, where protection against specialized robots is not required, solutions like these are worth using: hide a special input field "for spammers" with display: none, so that an ordinary user never sees it while a robot writes something into it. You can also deliberately confuse the field names, for example calling the e-mail address field "name"; the spammer's name ends up in it and fails validation.
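Here is a minimal sketch of both tricks together, assuming a honeypot field named url_extra hidden with display: none and a misleadingly named "name" field that really expects an e-mail address. The field names and the validation rule are illustrative assumptions, not the author's exact implementation.

```python
# Sketch of the honeypot + confusing-field-name checks.
# "url_extra" is hidden with display: none, so real users leave it empty
# while naive bots fill it in; "name" secretly expects an e-mail address,
# so a bot that stuffs a person's name into it fails validation.
# All field names here are illustrative assumptions.

import re

FORM_SNIPPET = """
<form method="post" action="/comment">
  <!-- Honeypot: invisible to people, tempting to bots -->
  <input type="text" name="url_extra" style="display: none">
  <!-- Deliberately confusing name: this field really holds the e-mail -->
  <input type="text" name="name">
  <textarea name="comment"></textarea>
  <button type="submit">Send</button>
</form>
"""

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def looks_like_spam(form_data: dict) -> bool:
    """Flag submissions that fall for either trap."""
    if form_data.get("url_extra"):                      # honeypot was filled in
        return True
    if not EMAIL_RE.match(form_data.get("name", "")):   # "name" must hold an e-mail
        return True
    return False

# A bot that fills every field with plausible values gives itself away:
print(looks_like_spam({"url_extra": "http://spam.example", "name": "John", "comment": "hi"}))   # True
# A human never sees the hidden field and types their e-mail where asked:
print(looks_like_spam({"url_extra": "", "name": "reader@example.com", "comment": "Nice post"})) # False
```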
The methods in this last group are good above all because the user does not have to prove to anyone that they are not a spammer; spammers give themselves away. In the worst case, the user merely has to agree, each time, that they are not a camel, by leaving the field for spammers empty. It seems to me that this approach will become more popular over time than relying on captcha alone, however effective the latter is.
Even now, large services often show a captcha only to screen potentially robotic users: those who failed to remember their password on the first attempt, or those who leave more than 5 comments in 5 minutes. In other words, a presumption of innocence applies: the user is considered a person until they do something uncharacteristic of a human but typical of a robot. This attitude can be developed further by analyzing not only the current actions but the entire accumulated history of the user.
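A rule like "more than 5 comments in 5 minutes" could look roughly like the sketch below. The thresholds come from the example above; the function name, the in-memory store, and the sliding-window approach are my own illustrative choices.

```python
# Sketch of a "too many comments in too little time" rule: a user who posts
# more than MAX_COMMENTS within WINDOW_SECONDS gets flagged for a captcha.
# The in-memory store and function names are illustrative assumptions.

import time
from collections import defaultdict, deque

MAX_COMMENTS = 5
WINDOW_SECONDS = 5 * 60

_recent = defaultdict(deque)  # user id -> timestamps of recent comments

def should_show_captcha(user_id: str, now: float | None = None) -> bool:
    """Record a comment and decide whether this user now looks robotic."""
    now = time.time() if now is None else now
    timestamps = _recent[user_id]
    timestamps.append(now)
    # Drop entries that have slid out of the 5-minute window.
    while timestamps and now - timestamps[0] > WINDOW_SECONDS:
        timestamps.popleft()
    return len(timestamps) > MAX_COMMENTS

# Example: the sixth comment within a few seconds triggers the captcha.
for i in range(6):
    flagged = should_show_captcha("user-42", now=1000.0 + i)
print(flagged)  # True
```

The same record of past behavior could also feed the history-based analysis discussed below, rather than only a momentary rate check.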
Unfortunately, I do not have detailed statistics on robots and their behavior. That would be a good subject for research; maybe I will even do it myself. With such statistics we could talk about robots more concretely, with specific numbers. Admit it, it would be interesting to know what percentage of robots can circumvent a particular kind of protection, or possess unusual skills, for example handling JavaScript or disguising themselves as living people while moving around a site. That would allow a more effective anti-spam policy.
I want to draw an analogy. Just as casinos watch their visitors without interfering with their play and try to spot cheaters by analyzing their behavior, I would like to see site protection systems that detect spam bots by looking not only at their momentary actions but at their past history and behavioral patterns. With that kind of information, I think robots (including specialized ones) can be identified quite effectively without getting in the way of ordinary users.
taken from my blog