The hormonal holy war between a PHP Admin and a Developer, or REMOTE_ADDR vs HTTP_X_FORWARDED_FOR

Recently I witnessed an interesting debate about how the IP address of the end user should really be determined from PHP scripts.
In fact, every word of the title reflects the actual situation. It was a religious dispute, aggravated by wonderful spring weather, in which, I believe, nobody was right or wrong, but which prompted me to do a little research and, to my delight, settled my understanding of this contentious but actually very simple question.
For those who, like me, had doubts (or rather were sure they had it all figured out) and were afraid to ask (or rather were too lazy to dig into the details), the rest is below.


Background


When developing a VOD service for the Samsung SmartTV platform, we absolutely had to know the user's country, so that we would not accidentally show a happy user a movie in a country where the copyright holder forbids it... Violating that term of the contract carries serious fines of thousands of dollars (for each and every oversight).
[As noted in the comments, this is a legal question and fraud is possible, but the article is not about how to prevent such fraud; it is about how to make php and nginx work together correctly.]

On the server we have the following: php-fpm + nginx

How to determine the country? By the user's IP and the MaxMind GeoIP database, of course.
"Pfff...", we all thought (that is, I thought), "that's easy". So as not to reinvent the wheel, I googled it on Stack Overflow, even read through every line, bolted it on and left it there. The code had grown by this:

    public function getUserHostAddress(){
        if (!empty($_SERVER['HTTP_X_REAL_IP']))   // X-Real-IP header, if a proxy set it
        {
            $ip=$_SERVER['HTTP_X_REAL_IP'];
        }
        elseif (!empty($_SERVER['HTTP_CLIENT_IP']))   // Client-IP header (shared internet access)
        {
            $ip=$_SERVER['HTTP_CLIENT_IP'];
        }
        elseif (!empty($_SERVER['HTTP_X_FORWARDED_FOR']))   // X-Forwarded-For header added by a proxy
        {
            $ip=$_SERVER['HTTP_X_FORWARDED_FOR'];
        }
        else   // no proxy headers: address of the direct peer
        {
            $ip=$_SERVER['REMOTE_ADDR'];
        }
        return $ip;
    }


And it worked! For almost a year... until something unexpected happened. Unexpected for this code, naturally...

How to confuse php, or the proxy chain (still part of the backstory)


Everything broke! It happened when we had to integrate one of the payment systems, and all of this code fell over because HTTP_X_FORWARDED_FOR arrived not with a single address but with a comma-separated list of addresses (which is, strictly speaking, legitimate and permissible, and not even covered by the PHP documentation).
And nobody would have noticed anything if HTTP_X_REAL_IP or HTTP_CLIENT_IP (which are not covered by the documentation either) had contained the IP we were looking for, but alas, they were empty :(
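To make the failure concrete, here is a minimal sketch of splitting such a comma-separated value into individual addresses. The function name `parseForwardedFor` is hypothetical (it is not part of our code), and the sample addresses come from the documentation ranges. It only checks syntax; it says nothing about whether any of the addresses can be trusted.

```php
<?php
/**
 * Hypothetical helper (not from the original code): splits an
 * X-Forwarded-For value into a list of syntactically valid IP addresses.
 */
function parseForwardedFor(string $headerValue): array
{
    $addresses = [];
    foreach (explode(',', $headerValue) as $part) {
        $candidate = trim($part);
        // filter_var() returns false unless the string is a valid IPv4/IPv6 address.
        if (filter_var($candidate, FILTER_VALIDATE_IP) !== false) {
            $addresses[] = $candidate;
        }
    }
    return $addresses;
}

// Sample value with two valid addresses and one junk entry.
$chain = parseForwardedFor('203.0.113.7, 198.51.100.20, garbage');
// $chain holds the two valid addresses, in header order.
```

The naive function above handed the whole string to the GeoIP lookup, which is why a list broke it while a single address did not.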

“Well,” we thought (by now I was not alone), “we’ll rewrite everything and ask the admins to put the user's IP into the REMOTE_ADDR variable”:

    public function getUserHostAddress(){
        $ip=$_SERVER['REMOTE_ADDR'];
        return $ip;
    }


And it worked! For almost a month... until something unexpected happened. Unexpected for this code, naturally...

A spring argument between tough guys (no irony here: they really are cool)


Everything broke! It happened because we needed to update nginx, so we turned to the professionals in this matter: our admins.
They, in turn, decided to update the config and get rid of our "crutch or not a crutch" (we had not yet figured out which it was) that forwarded the user's IP into REMOTE_ADDR.

REMOTE_ADDR was left untouched, i.e. it now contained something like "127.0.0.1",
while the user's IP showed up in HTTP_X_FORWARDED_FOR (which the user could easily override by sending the header `x-forwarded-for: 999.999.999.999` from the browser).
And then it began. D = Developer, A = Admin:

A: everything broke, and since nginx is our proxy, the address you need is in HTTP_X_FORWARDED_FOR, while REMOTE_ADDR holds the real IP address of php-fpm's client (i.e. 127.0.0.1)
D: but we can't trust HTTP_X_FORWARDED_FOR, because it is a variable that can easily be overridden through a header sent to the server (citing a rather interesting article)
A: no, we will make it contain the real IP of the end user, and REMOTE_ADDR will hold the real address of php's client
D: then we won't be able to trace the sequence of proxies, and besides, for portability these configs may not hold on another server (say, one without a proxy); put everything into REMOTE_ADDR, which will work in any case.

... that is the short version, without the swearing ...

In the end, of course, everything got working... and we settled on transparent proxying, where php thinks that clients connect to it directly without any proxies, and all the variables (more precisely, the one we care about) are in the state we need.
However, this is not quite feng shui: in fact we do have a proxy, and maybe more than one.
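The Developer's objection from the argument above can be illustrated with a small sketch. The `$server` array is a stand-in for `$_SERVER` (an assumption made so the snippet runs on its own), and the forged address is taken from a documentation range. Syntactic validation catches the obviously fake `999.999.999.999`, but a well-formed forged value passes the very same check.

```php
<?php
// Simulated request data: the proxy kept its own address in REMOTE_ADDR and
// passed the client's X-Forwarded-For header through unchanged.
$server = [
    'REMOTE_ADDR'          => '127.0.0.1',
    'HTTP_X_FORWARDED_FOR' => '999.999.999.999',
];

// Syntactic validation rejects the obviously fake value...
$isValid = filter_var($server['HTTP_X_FORWARDED_FOR'], FILTER_VALIDATE_IP) !== false;

// ...but a well-formed forged address passes the same check,
// so validity says nothing about who actually sent the request.
$forged = '203.0.113.7';
$forgedIsValid = filter_var($forged, FILTER_VALIDATE_IP) !== false;

var_dump($isValid, $forgedIsValid); // prints bool(false) then bool(true)
```

So validation is necessary, but the real question is which component wrote the value, and that is exactly what the Admin and the Developer have to agree on.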

Who is to blame? Who is right?


It is not for us to judge, but: no one!

If php really does face clients directly, or the proxying is transparent, then everything is simple: use REMOTE_ADDR and enjoy.

But what about feng shui, i.e. what should we do if we use ordinary proxying and want PHP to know about it?

The recipe... but not a panacea:


  • REMOTE_ADDR - contains the IP address of whoever connects to php directly, i.e. nginx, in our case 127.0.0.1
  • HTTP_X_FORWARDED_FOR - contains a chain of proxy addresses, the last of which is the IP of the direct client that connected to the proxy server. Here we consider two special cases:

    • Non-cascading proxies. In HTTP_X_FORWARDED_FOR, the last or only IP address (depending on what the user did or did not send in the x-forwarded-for header) is the real address of the user, the one we are looking for.

      It would seem that the whole task is to parse this variable and take the last element from it. But in our case the settings were not entirely correct: the whole of HTTP_X_FORWARDED_FOR was replaced by the x-forwarded-for header from the browser, whereas the real IP of the direct client should have been appended to it.

      For example, this is what I saw when I checked on a production VPS host:

      Trusting such data is also scary, but if everything is configured correctly, the last IP will be the user's address regardless of what arrives in the headers.

    • Cascading proxies. In this case HTTP_X_FORWARDED_FOR really does contain a chain of proxy addresses, and the last one is the IP of the direct client that connected to the proxy server. But that is not the real IP of the user, just the IP of the previous proxy in the chain.

      It would seem that the task is to parse this variable and take the first element from it. But as the figure above shows, that is certainly not reliable data: the user can fool us in no time by sending whatever IP he wants as the first element of x-forwarded-for.

  • HTTP_X_REAL_IP (or any other variable the Admin and the Developer agree on) - contains the IP address of the user accessing php, or of the first untrusted proxy (which for us is the client address)

    For convenience, you can use a special nginx module that removes the problem of telling cascading from non-cascading proxying, but by default “in the standard builds for CentOS, Debian and Fedora, nginx for some reason ships without the --with-http_realip_module parameter” (c) Admin. For the module to work, the chain in HTTP_X_FORWARDED_FOR must be formed correctly, and the addresses of the trusted proxy servers, from which we are willing to accept the last element of HTTP_X_FORWARDED_FOR, must be configured.

    However, once again, HTTP_X_REAL_IP is not in general the real end-user IP, only the first IP in the proxy list when proxies are cascaded.
    Although if proxying is not cascading, the end-user address may well be there.
    And if proxying is cascading and the http_realip module is configured correctly, then it should hold either the end-user IP or the correct IP of the first untrusted proxy, counting from the php server, which also works for us
  • HTTP_CLIENT_IP (or any other variable the Admin and the Developer agree on) - for any type of proxying it contains the first IP from HTTP_X_FORWARDED_FOR, and without proxying, the contents of the client-ip HTTP header. It can be used for reference only, and under no circumstances to determine the real IP of the user.
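The "first untrusted proxy, counting from the php server" rule from the list above can be sketched in PHP as follows. This is only an illustration under stated assumptions: the function name `resolveClientIp` and its parameters are hypothetical, trusted proxies are matched by exact IP (no CIDR ranges), and each proxy is assumed to append the address of its direct client to X-Forwarded-For. As far as I understand, it is roughly the idea behind the recursive lookup in the nginx realip module, not a copy of it.

```php
<?php
/**
 * Hypothetical helper: walks the proxy chain from the PHP server outwards
 * and stops at the first hop that is not one of our own proxies.
 *
 * @param string   $remoteAddr     value of $_SERVER['REMOTE_ADDR']
 * @param string   $forwardedFor   value of $_SERVER['HTTP_X_FORWARDED_FOR'] ('' if absent)
 * @param string[] $trustedProxies exact IPs of proxies we control (assumption: no CIDR ranges)
 */
function resolveClientIp(string $remoteAddr, string $forwardedFor, array $trustedProxies): string
{
    // If the direct peer is not one of our proxies, no header can be trusted.
    if (!in_array($remoteAddr, $trustedProxies, true)) {
        return $remoteAddr;
    }
    $hops = array_map('trim', explode(',', $forwardedFor));
    // The last element was appended by the nearest proxy, so read right to left.
    for ($i = count($hops) - 1; $i >= 0; $i--) {
        $hop = $hops[$i];
        if (filter_var($hop, FILTER_VALIDATE_IP) === false) {
            break; // malformed entry: stop trusting the rest of the chain
        }
        if (!in_array($hop, $trustedProxies, true)) {
            return $hop; // first untrusted address: the client as far as we can know
        }
        $remoteAddr = $hop; // another trusted proxy, keep walking outwards
    }
    return $remoteAddr;
}

// A forged first element (1.2.3.4) is ignored, because the walk stops earlier.
$ip = resolveClientIp('127.0.0.1', '1.2.3.4, 203.0.113.7, 10.0.0.5', ['127.0.0.1', '10.0.0.5']);
```

Reading from the right is the design choice that matters: everything to the left of the first untrusted hop was supplied by someone we do not control.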

In conclusion


There are several proxying options for php + nginx:
  • Transparent - the contents of the variables in _SERVER (including REMOTE_ADDR) stay the same as if clients were working with php directly
  • Non-transparent, non-cascading - the Admin and the Developer need to agree on where the real IP address of the user will be stored :)
  • Non-transparent, cascading - the same as non-transparent non-cascading, plus a correctly configured nginx module. Also remember that cascading proxies are possible, and that the user may be malicious and can send very untrue data into _SERVER["HTTP_xxxx"]


PS
Later we will bring the settings into feng shui and get rid of transparent proxying, and also write a universal function that determines the IP for both proxying cases.

PPS
Just for fun, for anyone interested: if someone writes this function and the nginx config for us in the comments, and we end up using them, then, honestly, that person will get 100 rubles credited to their phone.
But the function and the config must be truly canonical and take everything into account :) All the clues are in the article.
The main thing is Zen: take your time, since the first answer may contain errors that you can learn from; hurry up, since the first correct answer may turn out to be yours.

Thanks to all. Have a good spring! Come to terms with your colleagues and love them! :)

UPD:
My own implementation:
    /**
     * @param null|string $ip_param_name - key of the _SERVER element in which to look for the IP address;
     *          if not set, REMOTE_ADDR is used and proxying is assumed to be absent or transparent,
     *          if set, the IP is assumed to be forwarded under that key,
     *              for example HTTP_X_REAL_IP or any other
     * @param bool $allow_non_trusted - fallback for the case when $ip_param_name is set but
     *              _SERVER[$ip_param_name] is missing or not valid:
     *          if true, _SERVER is searched by the keys from $non_trusted_param_names
     * @param array $non_trusted_param_names - keys under which to look for the IP in _SERVER
     * @throws Exception
     * @return string
     */
    public function getUserHostAddress(
        $ip_param_name = null,
        $allow_non_trusted = false,
        array $non_trusted_param_names = array('HTTP_X_REAL_IP','HTTP_CLIENT_IP','HTTP_X_FORWARDED_FOR','REMOTE_ADDR')
    ){
        $ip = null;
        if(empty($ip_param_name) || !is_string($ip_param_name)){
            // key is not set or is not a string: no proxying, or transparent proxying
            $ip = $_SERVER['REMOTE_ADDR'];
        }else{
            // otherwise use the agreed variable
            if(!empty($_SERVER[$ip_param_name]) && filter_var($_SERVER[$ip_param_name], FILTER_VALIDATE_IP)){
                // the variable holds a valid IP
                $ip = $_SERVER[$ip_param_name];
            }else if($allow_non_trusted){
                // last resort: fall back to raw, untrusted data
                foreach($non_trusted_param_names as $ip_param_name_nt){
                    if($ip_param_name === $ip_param_name_nt){
                        // this variable has already been checked
                        continue;
                    }
                    if(!empty($_SERVER[$ip_param_name_nt]) && filter_var($_SERVER[$ip_param_name_nt], FILTER_VALIDATE_IP)){
                        // the variable holds a valid IP
                        $ip = $_SERVER[$ip_param_name_nt];
                        break;
                    }
                }
            }
        }
        if(empty($ip)){
            // no suitable IP found, although $_SERVER['REMOTE_ADDR'] should normally contain something
            throw new Exception("Can't detect IP");
        }
        return $ip;
    }
