Under vigilant supervision: how to monitor hosters' tariffs and keep a VPS catalog up to date

    Back in 2013, I wrote the VPS Search site, and by the time of the first publication on Habr the catalog contained information about 150 hosters and more than 1,200 tariffs. Adding that much information did not take long, since the first version of the site showed only the basic tariff parameters (cost, disk size, RAM, processor frequency, traffic allowance, server location country and virtualization type). All of this was published on the hosters' websites, so nothing was needed beyond routinely copying the data into the site's admin panel.



    At the end of January 2014, I presented the second version of VPS Search, and the number of parameters for each tariff grew considerably. The following were added: disk type, list of preinstalled operating systems, payment methods, whether a control panel is included in the price, type of administration, and IPv6 support. This information was not available on the hosters' websites, so I had to ask each hoster in the catalog to update their tariffs, which took quite a lot of time.

    However, as it turned out, filling the catalog was not the hardest part: the hosters' tariffs kept changing, and we had to respond to these changes quickly and make edits.

    In my opinion, tariff changes should be monitored very carefully, because outdated data harms the catalog's reputation. For example, a user comes to the site, finds a tariff for 100 rubles, goes to the hoster's website and sees that the same tariff costs 150 rubles. The user may conclude that the information in the catalog is badly outdated and stop using it. To avoid this situation, the information has to be updated as quickly and accurately as possible.

    When I was writing the catalog, I assumed the following update model: hosters register in a personal account and maintain their own tariffs. If a hoster cannot do that, I monitor the tariffs myself. Later, however, I decided not to give hosters a personal account, and in my opinion this was the right decision. If you let hosters edit their tariffs themselves, a number of inconveniences arise:

    • the hoster's changes will have to be moderated: by accident or on purpose, a hoster may well enter an incorrect tariff price (by default, tariffs are sorted by price) and thus rise to the top positions;
    • it is also unclear whether to hide tariffs until they are moderated: if you hide them, moderation has to be done promptly so that the hoster's tariffs stay visible to visitors, and if you use post-moderation, there is a risk that users will be shown false information;
    • the hoster can simply forget to update the tariff descriptions, since there are many similar catalogs and it is hard for hosters to keep track of all of them. This can also happen if the person responsible for updating the information left the company, went on vacation or fell ill and did not hand over access to a new employee.

    Over time, I came to the conclusion that you need to monitor tariff updates yourself and periodically check all hosters' tariffs on your own. This requires a lot of time and resources, but with this approach the tariffs should be as up to date as possible. The downside in terms of resources is that most hosters' tariffs change quite rarely, yet they still have to be checked, so in most cases the editors look through the tariffs and find no significant changes. The main downside in terms of keeping tariffs current is that it is impossible to react to changes instantly. That is, if we check once a month, and we checked a hoster's tariffs on the 1st and the hoster updated them on the 2nd, then for almost a month we will show outdated data. However, this problem has been partially solved.

    Realizing that outdated tariffs sitting on the site for a month seriously harm its reputation, I began looking for ways to fix this and automate the check.

    The very first thing I tried was checking the pages of hosters' websites for changes, starting with the main page of each site. The idea was as follows: if something changes on a hoster's site, a signal is sent to the editors, who manually check what changed and edit the tariffs if necessary. This way we could follow changes quickly. Yes, there would be many false positives, when the hoster changed something other than the tariff characteristics, but in my opinion it is better to run a check and find nothing than to miss a change. The script was written fairly quickly, and its principle is extremely simple: download the HTML page, take its MD5 hash and save it to the database. A day later, we repeat the procedure and compare the result with the saved value. If they do not match, the page has changed, and someone needs to look at what changed manually. This option seemed ideal to me in terms of convenience: no need to write any HTML parsers, just look for changes and that's it. However, the day after the check was launched, the editors received a notification that most hosters had changes. That was a bit strange, so I decided to find out why. The reason turned out to be trivial, and it completely defeated my idea of checking for changes in the HTML code: a great many hosters had automatically generated data on their pages (for example, the page generation time), so the idea ran up against harsh reality, and this verification method did not work.
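
    The hash-and-compare check described above can be sketched roughly as follows. This is a minimal illustration, not the original script: the function names, the `page_hashes` table and the use of SQLite are assumptions made for the example, and the URL is a placeholder.

```python
import hashlib
import sqlite3
import urllib.request


def fetch_page(url):
    """Download the raw HTML of a page (requires network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def page_digest(html):
    """Return the MD5 hex digest of the raw page bytes."""
    return hashlib.md5(html).hexdigest()


def open_store(path=":memory:"):
    """Open the (hypothetical) hash store, creating the table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_hashes ("
        "url TEXT PRIMARY KEY, digest TEXT NOT NULL)"
    )
    return conn


def has_changed(conn, url, html):
    """Save the new digest and report whether it differs from the stored one.

    The first time a URL is seen there is nothing to compare against,
    so the function returns False.
    """
    new_digest = page_digest(html)
    row = conn.execute(
        "SELECT digest FROM page_hashes WHERE url = ?", (url,)
    ).fetchone()
    conn.execute(
        "INSERT OR REPLACE INTO page_hashes (url, digest) VALUES (?, ?)",
        (url, new_digest),
    )
    conn.commit()
    return row is not None and row[0] != new_digest
```

    A daily job would call `has_changed(conn, url, fetch_page(url))` for each hoster and notify the editors when it returns True. The sketch also shows why the approach failed: any byte that differs between two downloads, such as a page generation time in the footer, produces a different digest even when the tariffs are identical.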

    The second idea that came to mind was checking for changes through the billing API. This has drawbacks too. First, only the BILLmanager billing system has an API, which is not so bad, since about 40% of hosters use it. The second problem is that tariff characteristics are not described in a standardized way in the billing system, which again leads to a situation where you can only detect that something has changed and then have to look manually at what it is and whether edits are needed. On top of that, it is not clear what to do about WHMCS, which many hosters also use, the less popular RootPanel and BPanel, or the billing systems that hosters wrote themselves.
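
    Since the billing data only lets us tell that something changed, one way to narrow down the manual check is to compare two snapshots of a hoster's tariff list and report which tariffs were added, removed or modified. The sketch below assumes the tariffs have already been fetched and decoded into a list of dictionaries; the field names (`id`, `name`, `price`) are hypothetical and are not taken from BILLmanager or any other billing system.

```python
import hashlib
import json


def tariff_fingerprint(tariff):
    """Hash a tariff dict in a key-order-independent way."""
    canonical = json.dumps(tariff, sort_keys=True, ensure_ascii=False)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def diff_snapshots(old, new, key="id"):
    """Compare two tariff snapshots (lists of dicts) by fingerprint.

    `key` names the field that identifies a tariff; "id" is an
    assumption made for this example.
    """
    old_fp = {t[key]: tariff_fingerprint(t) for t in old}
    new_fp = {t[key]: tariff_fingerprint(t) for t in new}
    shared = old_fp.keys() & new_fp.keys()
    return {
        "added": sorted(new_fp.keys() - old_fp.keys()),
        "removed": sorted(old_fp.keys() - new_fp.keys()),
        "changed": sorted(k for k in shared if old_fp[k] != new_fp[k]),
    }
```

    The editors would still have to interpret each reported tariff by hand, because the descriptions are free-form, but they would no longer need to re-read tariffs that did not change.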

    Unfortunately, I have not found a perfect solution so far, so at the moment we work as follows: the editors check the tariffs continuously, and whenever something changes, we confirm the information with the hoster's support service and find out the current characteristics.

    Once a year we do a “big” check and ask all hosters again the same questions we asked when they were first added, clarifying whether any of the parameters have changed.

    From time to time we find factual errors on hosters' websites (for example, outdated information often remains on some of a site's pages after the tariffs change), and we try to notify the hosters immediately.

    The page-checking scripts run in test mode for those hosters whose sites have no automatically generated content, so for some hosters we can respond to changes very quickly. The billing checker also runs in test mode, and for the hosters where this is possible we try to follow changes automatically as well.

    This approach lets us respond quickly to what is happening and keep the information in the catalog highly reliable. For example, we receive no more than 2-3 complaints per month about errors in descriptions, and most often they come down to the user not being able to find the tariff on the hoster's website.

    It is still far from perfect, since we cannot get away from manual checks. I would be glad if someone could suggest more options for automating such a difficult task.


    Go to VPS.today - a site for searching virtual servers. 1500 tariffs from 130 hosters, convenient interface and a large number of criteria for finding the best virtual server.

