Python random number security

  • Tutorial
This article is the second in a series of publications on the vulnerabilities of pseudo random number generators (PRNGs).

Recently, a number of publications have appeared that describe the vulnerabilities of the PRNG, from the very foundations ([1]) to the vulnerabilities in various programming languages ​​and based on them CMS and other software ([2], [3], [4] )

These publications are popular because the PRNG is at the core of many aspects of web application security. Pseudorandom numbers / character sequences are used to secure web applications in:

  • generation of various tokens (CSRF, password reset tokens, etc.);
  • random password generation;
  • text generation in CAPTCHA;
  • generating session identifiers.

In a previous article , based on research by George Argyros and Aggelos Kiayias ([3]), we learned to predict random numbers in PHP based on PHPSESSID and reduce the entropy of pseudorandom numbers in various ways.

We will now look at the PRNG in web applications developed in Python.

Python PRNG Features

There are three modules in Python designed to generate random / pseudo random numbers: random , urandom and _random :

  • _random implements the Mersenne Twister (MT) algorithm ( [6], [7] ) with minor modifications in the C language;
  • urandom uses external sources of entropy (Windows cryptographic provider in the CryptGenRandom function) in C language;
  • random is a wrapper for the _random module in Python that combines both libraries and has two main functions for generating pseudorandom numbers: random () and SystemRandom ().


The first uses the MT algorithm ( _random module ), but first of all it tries to initialize it using SEED taken from urandom , which turns the PRNG into the RNG (random number generator). If you cannot call urandom (for example, / dev / urandom is missing or you cannot call the desired function from the advapi32.dll library ), then int (time.time () * 256) is used as SEED (which, as you already know, provides insufficient entropy).


SystemRandom () calls the urandom module , which uses external sources to generate random data.

Changes in the implementation of the MT algorithm is that instead of a single number based on one of 624 numbers from the current state of the PRNG state , two numbers are used:

    unsigned long a=genrand_int32(self)>>5, b=genrand_int32(self)>>6;
    return PyFloat_FromDouble((a*67108864.0+b)*(1.0/9007199254740992.0));

Also, unlike PHP, the generator can be initialized not only using a long variable, but also using any sequence of bytes ( init_by_array () is called ) , which happens when importing a random module using an external entropy source (32 bytes are taken from urandom ), and in the case when this fails, time () is used :

if a is None: try:
                a = int.from_bytes(_urandom(32), 'big')
            except NotImplementedError:
                import time
a = int(time.time() * 256)


It would seem that these changes, unlike PHP, provide sufficient entropy of the generator even when calling random.random () . But it is not all that bad".

A feature of Python frameworks is that, unlike PHP, Python runs once with the web server, which means that initialization of the default state occurs once when the import random command is executed or when random.seed () is forced (this is extremely a rare case for web applications), which allows an attack on MT states using the following algorithm:

  • we find a script that displays the value random.random () (for example, Plone deals with the error logger (file ), it displays the page " error with number *** fixed ", which reflects a random number);
  • execute a series of queries sequentially, where we fix random numbers. Request numbers: 1,2,199,200,511,625 ;
  • By the 313th request, we make a predictable action (for example, generating a password reset link);
  • based on queries 1,199, we determine the states state_1 [1], state_1 [2], state_1 [397] ;
  • based on 2,200 queries - state_1 [3], state_1 [398] ;
  • based on request 511 - state_2 [397] ;
  • based on request 625 - state_3 [1] .

The accuracy of determining states depends on the index of the state element ( i ): for i mod 2 = 0, the entropy is 2 ^ 6 , for i mod 2 = 1 - 2 ^ 5 .

Further, with the help of queries 1,2,199,200 it is possible to determine the states state_1 [1] , state_1 [2] , state_1 [3] , state_1 [397] , state_1 [398] , based on which the states state_2 [1] and state_2 [2] are generated , from which the random number of request No. 313 is obtained . However, the entropy of this number is 2 ^ 24 (16M). Entropy is reduced using queries 511 and 625 . These queries help compute state_2 [397] , state_3 [1] . This reduces the number of state options to 2 ^ 8 , i.e. there are a total of 256 variants of the "random" number used in the query No. 313 .

A prerequisite for an attack is that no one will wedge in the query process, thereby not affecting the state of the PRNG (in other words, that the state indices will be determined correctly). It is also necessary that request No. 1 use state elements of the PRNG with indices no more than 224 , otherwise request No. 200will use a different state of the generator, which will not allow the execution of the algorithm. The probability of this event is 36% .

Therefore, the additional task of request No. 625 is to determine that all previous requests were actually made in the necessary states and no one got into the request process.

In addition, we bring to your attention a script that receives random numbers of 6 requests. At the output, all possible random numbers of request No. 313 are generated .

Practical use

We have analyzed several Python frameworks and web applications (among them Plone and Django). Unfortunately, (and perhaps fortunately), they could not be found among them vulnerable.

The most likely candidate is Plone, since it has a random number output ( ), but the problem of attacking it is as follows. Plone works under Python 2.7. * , Which, when float is output to str (), truncates the last 5 digits, which extends the number of options that are searched (both for local enumeration and for external requests to the server) to very large numbers.

Python third branch doesn't cut float in st () r function, which makes applications on it the main contenders for attacks.

We bring to your attention a script that receives 6 random numbers at the input (initialized by a state with the necessary indices, for example, from the test script, and at the output it generates possible variants of the selected random number. The running time of this scenario on an average computer is about an hour.

Note: this scenario does not take into account the possible error in determining the state element for (i mod 2 = 1), therefore, the efficiency decreases from 36% to 18%.


The peculiarities of the execution of the Python framework code (on the web server side) allow an attacker to carry out attacks that are impossible or very difficult to implement in the PHP language. To protect the PRNG, simple rules must be followed:

  • use the urandom module or the random.SystemRandom () function;
  • initialize with random.seed () before each call to random.random () with sufficient SEED entropy (if the urandom module is not available, then use, for example, the value of the function md5 (time.time () * (int) salt1 + str (salt2 )) , where salt1 and salt2 are initialized during the installation of the web application);
  • limit the output of random numbers in your web application (just use hash functions like md5 ).


[3] http: //crypto.di.
[5] 10 / presentations / Kamkar / BlackHat-USA-2010-Kamkar-How-I-Met-Your-Girlfriend-slides.pdf
[7] http: //

Posted by Timur Yunusov.

Also popular now: