
How NOT to make an informer from an external site in PHP
Good day to everyone reading!
A caveat up front: what follows is obvious to any experienced PHP programmer. But lately I keep seeing newcomers make this mistake in one form or another.
UPD: gentlemen, what is this habit of silently downvoting karma! Is it really so hard to write what exactly you dislike about the post?
Someone builds his first (second, third) site and calls the whole thing an information portal. A worthy undertaking. Then he decides to put an informer (a small widget showing data from a third-party site) on his pages. Many sites offer a dedicated service for this: Gismeteo, for example, hands out HTML code to embed on your pages, and many banks provide code for an exchange-rate informer. But what if the site you want offers no such service?
Another caveat is in order here. Let us set aside the question of whether it is legal to republish information from someone else's site without permission. I do not approve of such things, but if a person really needs to ...
So our newcomer decides to insert the page at the desired URL right where he wants it. What I see in the source:
...
include "http://...";
...
This is terrible. This is very, very bad. For those who do not understand just how terrible:
- As a rule, if the remote site does not officially provide informer code, what comes back is a full HTML page with its own head section, its own encoding, and so on. At best it will look awful and clash with the overall design of the site. More likely it will simply break the page layout.
- The include statement simply takes the text returned for the URL and inserts it at the current point of the program as PHP source. This means that if the admin of the remote site serves a specially crafted page containing PHP code, that code will execute immediately on your server. This is the most ordinary kind of injection. Before the eyes of the astonished newcomer, I made a page that rebooted his server when included this way. It is also worth saying that most hosting providers disable remote inclusion (the allow_url_include setting), and rightly so.
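Whether a given server would even allow such an include can be checked from PHP itself. The snippet below is only a small sketch: the function name `remote_include_enabled` is made up for illustration, while `ini_get()` and the `allow_url_include` directive are standard PHP. The directive has been off by default for a long time, and as far as I know it also depends on `allow_url_fopen` being enabled.

```php
<?php
// Hypothetical helper name, not part of the original post.
// Reports whether this PHP installation would accept include "http://...".
function remote_include_enabled(): bool
{
    // ini_get() returns the setting as a string, or false if it is unknown.
    $value = ini_get('allow_url_include');

    // "1", "on", "true" and "yes" count as enabled; everything else as disabled.
    return filter_var($value, FILTER_VALIDATE_BOOLEAN);
}
```

If this returns true on a production server, switching the setting off in php.ini is probably the first thing to do, regardless of what the site code looks like.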
And to conclude, here is how I would insert third-party content.
PHP supports cURL, a fine tool for pulling content from remote web servers, with settings flexible enough to practically imitate a browser. The content lands in a variable as an ordinary string and is then processed: with regular expressions, or by parsing the HTML with XPath or another parser. Either way, strip everything superfluous and keep only the bare useful content: text, numbers, and so on. Then validate that data and simply insert it into your own page layout.
No design disruptions, no layout breakdowns, no PHP injections.
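The approach described above can be sketched roughly as follows. This is a minimal illustration, not a finished implementation: the function names (`fetch_remote_html`, `extract_rate`, `render_informer`), the user agent string, the XPath query and the assumption that the useful content is a single decimal number are all invented for the example. Only the cURL, DOM and PCRE calls themselves are standard PHP.

```php
<?php
// All function names and the user agent below are hypothetical.

/** Fetch a remote page as a plain string. The response is data, never code. */
function fetch_remote_html(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_MAXREDIRS      => 3,
        CURLOPT_CONNECTTIMEOUT => 5,     // do not hang our page on a slow remote
        CURLOPT_TIMEOUT        => 10,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; InformerSketch/1.0)',
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return ($body !== false && $status === 200) ? $body : null;
}

/** Pull one number out of the HTML with XPath and validate it strictly. */
function extract_rate(string $html, string $xpathQuery): ?float
{
    if ($html === '') {
        return null;
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $loaded   = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }

    $nodes = (new DOMXPath($doc))->query($xpathQuery);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // Keep only the bare text of the first match, then validate it:
    // anything that is not a plain decimal number is rejected.
    $text = trim($nodes->item(0)->textContent);
    if (!preg_match('/^\d{1,6}([.,]\d{1,4})?$/', $text)) {
        return null;
    }

    return (float) str_replace(',', '.', $text);
}

/** Turn the validated value into a string that is safe to echo into our layout. */
function render_informer(?string $html, string $xpathQuery): string
{
    $rate = $html !== null ? extract_rate($html, $xpathQuery) : null;
    if ($rate === null) {
        return 'n/a';
    }

    return htmlspecialchars(number_format($rate, 2, '.', ''), ENT_QUOTES, 'UTF-8');
}
```

On the page itself this would be called as something like `echo render_informer(fetch_remote_html($url), $query);`, where `$url` and `$query` depend on the remote site. Whatever the remote server sends, the worst outcome is the `n/a` placeholder. In practice the result should probably also be cached, so the remote site is not requested on every page view.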
P.S. Sadly, my arguments and the demonstration of the vulnerability did not have the desired effect on my novice colleague: the vulnerability was never fixed. A few days later the site was hacked through exactly this hole. Do not repeat these mistakes; learn from other people's. Good luck!