Unexpected Behavior of the Session Garbage Collector


    The other day I ran into a very interesting problem. The system I was digging into used a mechanism to limit session lifetime. Enforcing that limit had been handed off to the garbage collector, which for some reason did the job half-heartedly, or not at all. As it turned out, this is a common mistake, so I would like to walk through the subtleties of working with the GC.

    In PHP, three parameters control the GC for sessions: session.gc_probability, session.gc_divisor and session.gc_maxlifetime.
    They mean the following: on average, in gc_probability out of every gc_divisor calls to session_start the GC is launched, and it is supposed to remove sessions whose last access was more than gc_maxlifetime seconds ago.
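
    To make the three settings concrete, here is a small sketch that reads them at runtime and prints the per-request chance of a GC run. The helper name gcChancePerRequest is made up for this illustration; ini_get and the setting names are standard PHP.

```php
<?php
// Hypothetical helper: chance that a single session_start() triggers the GC.
function gcChancePerRequest(int $probability, int $divisor): float
{
    if ($divisor <= 0) {
        throw new InvalidArgumentException('gc_divisor must be positive');
    }
    // Clamp to [0, 1]: probability >= divisor means "always", 0 means "never".
    return max(0.0, min(1.0, $probability / $divisor));
}

$probability = (int) ini_get('session.gc_probability');
$divisor     = max(1, (int) ini_get('session.gc_divisor')); // guard against a missing or zero value
$maxLifetime = (int) ini_get('session.gc_maxlifetime');

printf(
    "GC chance per session_start(): %.4f, data counts as garbage after %d s\n",
    gcChancePerRequest($probability, $divisor),
    $maxLifetime
);
```

    With 1 and 1000 the helper returns 0.001, which is the "1 out of 1000" figure used later in the text.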




    Doing what everyone does, or example No. 1


    Let's test how the GC works with a small script:
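
    A minimal sketch of such a script, assuming a counter kept under a hypothetical session key counter and a lifetime of 1 second (an assumption that fits the 1-second threshold mentioned below); the GC probability is left at whatever the server is configured with:

```php
<?php
// Assumption: sessions should expire after 1 second of inactivity.
ini_set('session.gc_maxlifetime', '1');

session_start();

// Hypothetical counter: 0 on the first visit, +1 on every later one.
$_SESSION['counter'] = isset($_SESSION['counter']) ? $_SESSION['counter'] + 1 : 0;

echo $_SESSION['counter'], "\n";
```
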

    We reload this page 10 times at 10-15 second intervals (longer is fine too; what matters is that the interval exceeds 1 second). As a result, we get "unexpected" output:
    0
    1
    2
    3
    ...
    

    The reason is quite simple and, I would say, obvious:
    the GC starts in only about 1 out of 1000 requests, and we made nowhere near that many.
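
    To put a number on it: assuming every request rolls the GC dice independently, the chance that the GC runs at least once in n requests is 1 - (1 - p/d)^n. A quick sketch of that arithmetic (the function name is illustrative):

```php
<?php
// Hypothetical helper: chance of at least one GC run in $requests independent requests.
function chanceOfAtLeastOneGc(int $probability, int $divisor, int $requests): float
{
    $perRequest = max(0.0, min(1.0, $probability / $divisor));
    return 1.0 - (1.0 - $perRequest) ** $requests;
}

// 1 in 1000 per request, ten requests: roughly a 1% chance that the GC ran at all.
printf("%.2f%%\n", 100 * chanceOfAtLeastOneGc(1, 1000, 10));
```

    In other words, for a test of this size it is almost certain that the GC never ran.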

    Note: most of the systems I have seen relied on exactly this scheme and went no deeper.

    Working around the bug at any cost, or example No. 2


    The solution seems simple: what if we make the GC run on every request?
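
    A minimal sketch of that idea, assuming a counter under a hypothetical session key counter and a 1-second lifetime; setting gc_probability equal to gc_divisor makes the GC run on every session_start:

```php
<?php
ini_set('session.gc_maxlifetime', '1'); // assumption: 1-second lifetime
ini_set('session.gc_probability', '1');
ini_set('session.gc_divisor', '1');     // 1 out of 1: GC on every session_start()

session_start();

// Hypothetical counter: 0 on the first visit, +1 on every later one.
$_SESSION['counter'] = isset($_SESSION['counter']) ? $_SESSION['counter'] + 1 : 0;

echo $_SESSION['counter'], "\n";
```
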

    But this script behaves even more unexpectedly. Let's repeat the same steps as in example No. 1:
    0
    1
    0
    1
    ...
    


    Post-mortem, or why this happens


    If we attach our own handlers with session_set_save_handler, we can easily reconstruct the order in which a session is loaded and processed:
    1. open
    2. read
    3. gc
    4. PROGRAM
    5. write
    6. close
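
    The order can be checked with a sketch like the following, which wraps the built-in handler and logs each call. It assumes PHP 8.0 or later (for the union return types); the class name LoggingSessionHandler, the $calls property and the demo directory are made up for this illustration.

```php
<?php
// Sketch: wrap the built-in handler and record the order of the calls.
class LoggingSessionHandler extends SessionHandler
{
    public array $calls = [];

    public function open(string $path, string $name): bool
    {
        $this->calls[] = 'open';
        return parent::open($path, $name);
    }

    public function read(string $id): string|false
    {
        $this->calls[] = 'read';
        return parent::read($id);
    }

    public function gc(int $max_lifetime): int|false
    {
        $this->calls[] = 'gc';
        return parent::gc($max_lifetime);
    }

    public function write(string $id, string $data): bool
    {
        $this->calls[] = 'write';
        return parent::write($id, $data);
    }

    public function close(): bool
    {
        $this->calls[] = 'close';
        return parent::close();
    }
}

// Keep the demo's session files in their own directory (path is an assumption).
$dir = sys_get_temp_dir() . '/session_gc_demo';
if (!is_dir($dir)) {
    mkdir($dir, 0700, true);
}
session_save_path($dir);

// Force the GC on every start so that it shows up in the log.
ini_set('session.gc_probability', '1');
ini_set('session.gc_divisor', '1');

$handler = new LoggingSessionHandler();
session_set_save_handler($handler, true);

session_start();
$handler->calls[] = 'PROGRAM';   // the application code runs here
$_SESSION['touched'] = time();
session_write_close();

echo implode(' -> ', $handler->calls), "\n";
```

    The log should show gc right after read and before the application code; it also records the write call that happens just before close.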

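    That is, the garbage collector runs after the session has been read, which means the $_SESSION array is already populated. This is where the unexpected 1 in the second example comes from: the script prints 1 from the data it has already loaded, while the stored session has been removed, so the next request starts again from 0.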

    Back to example No. 1


    As we now see, the garbage collector may run at step 3, but what happens if it does not? After all, with standard settings the chance is only 1 in 1000. An expired session is then opened and read successfully, and at the end of the request it is saved again, which refreshes the timestamp of the session file. Such a session becomes practically endless. On the other hand, if 1000 different users hit the script, you can forget about the "infinite" session: the GC will most likely be triggered by one of them, and the lifetime starts working correctly (or rather, almost correctly). This behavior is inconsistent and unpredictable, and it can lead to a large number of hard-to-catch problems.

    So what now, or ways out of the situation


    The surest solution is to use your own session validation mechanism. The documentation explicitly states:
    "session.gc_maxlifetime specifies the number of seconds after which data will be seen as 'garbage' and potentially cleaned up. Garbage collection may occur during session start (depending on session.gc_probability and session.gc_divisor)." The words "potentially" and "may" mean precisely that the GC is not intended to limit session lifetime. Wherever session lifetime matters and artifacts like those in example No. 2 are unacceptable, use your own lifetime validation.
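
    A minimal sketch of such validation, assuming the application stores the time of the last request in the session itself. The key last_activity, the helper sessionIsExpired and the 30-minute SESSION_TTL are illustrative choices, not something PHP provides:

```php
<?php
// Hypothetical helper: true when the session was last used more than $ttl seconds ago.
function sessionIsExpired(array $session, int $now, int $ttl): bool
{
    // 'last_activity' is an illustrative key that the application maintains itself.
    return isset($session['last_activity'])
        && ($now - $session['last_activity']) > $ttl;
}

const SESSION_TTL = 1800; // assumed lifetime in seconds

session_start();

if (sessionIsExpired($_SESSION, time(), SESSION_TTL)) {
    // Drop the stale data and switch to a fresh session ID.
    $_SESSION = [];
    session_regenerate_id(true);
}

// Remember when this session was last used.
$_SESSION['last_activity'] = time();
```

    Because the check runs on every request, the lifetime no longer depends on whether or when the GC happens to run; the GC goes back to being what its name says, a cleaner of leftover files.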

    Way out No. 2, bad and wrong

    We know that a GC in "forced mode" runs at step 3 of session start. That is, after an expired session has been started, the data is present in the $_SESSION array while the file has already been deleted. In that case it seems logical to try to recreate the session, i.e. to actually call session_start twice:

    The results of the script will be:
    0 0
    1
    0 0
    1
    ...
    


    This behavior follows from the session processing order, but (recall the documentation, and common sense) it is not worth doing.

    Hooray, all sorted out: conclusion


    I was surprised that most developers, even experienced ones, have never thought about how the GC behaves and blithely trust it to limit session lifetime. This is despite the documentation clearly saying it is not meant for that, and the name Garbage Collector (not Session Validator or Session Expirer) speaks for itself. The main conclusion, of course, is that you should carefully check even the seemingly obvious parts of a system. "Bugs" in system functions or methods are sometimes just misinterpretations of them rather than actual errors.

    Thank you all for reading to the end. I hope you find this article helpful.
