Once Again About Finding Primes
This note discusses sieve algorithms for finding prime numbers. We take a detailed look at the classical sieve of Eratosthenes and the finer points of its implementation in popular programming languages, then at parallelization and optimization, and finally describe a more modern and faster algorithm, the sieve of Atkin. While the material on the sieve of Eratosthenes is intended mainly to save beginners from stepping on a familiar rake yet again, the sieve of Atkin has not previously been described on Habrahabr.
The picture shows "The Sieve of Eratosthenes", a sculpture by the abstract expressionist Mark di Suvero installed on the Stanford University campus.
Introduction
Recall that a number is called prime if it has exactly two distinct divisors: one and itself. Numbers with more divisors are called composite. Thus, if we can factor numbers, we can also test them for primality. For example, something like this:
function isprime(n){
    if(n == 1) // 1 is not prime
        return false;
    // try the possible divisors from 2 to sqrt(n)
    for(d = 2; d*d <= n; d++){
        // if d divides n evenly, then n is composite
        if(n % d == 0)
            return false;
    }
    // no nontrivial divisors, so n is prime
    return true;
}
(Hereinafter, unless stated otherwise, JavaScript-like pseudocode is used.) The running time of such a test is obviously O(n^(1/2)), that is, it grows exponentially with the bit length of n. This test is known as trial division.
Quite unexpectedly, there are a number of ways to test a number for primality without finding any of its divisors. While a polynomial-time factorization algorithm remains an unattainable dream (the strength of RSA encryption rests on this), the AKS primality test [1], published in 2004, runs in polynomial time. Various efficient primality tests can be found in [2].
Now, if we need to find all primes on a fairly large interval, the first impulse is probably to test each number of the interval individually. Fortunately, if we have enough memory, we can use faster (and simpler) sieve algorithms. In this article we discuss two of them: the classical sieve of Eratosthenes, known since antiquity, and the sieve of Atkin, the most advanced modern algorithm of this family.
Sieve of Eratosthenes
The ancient Greek mathematician Eratosthenes proposed the following algorithm for finding all primes not exceeding a given number n. Take an array S of length n and fill it with ones (mark all numbers as unsieved). Now look at the elements S[k] in turn, starting with k = 2. If S[k] = 1, fill with zeros (cross out, or sieve out) all subsequent cells whose indices are multiples of k. As a result we obtain an array in which a cell contains 1 if and only if its index is a prime number.
A lot of time can be saved by noticing that a composite number smaller than n has at least one divisor not exceeding √n, so the sieving process can stop at √n. Here is an animation of the sieve of Eratosthenes, taken from Wikipedia:
A few more operations can be saved if, for the same reason, we start crossing out the multiples of k not from 2k but from k².
The implementation will take the following form:
function sieve(n){
    S = [];
    S[1] = 0; // 1 is not prime
    // fill the sieve with ones
    for(k = 2; k <= n; k++)
        S[k] = 1;
    for(k = 2; k*k <= n; k++){
        // if k is prime (not crossed out)
        if(S[k] == 1){
            // then cross out the multiples of k
            for(l = k*k; l <= n; l += k){
                S[l] = 0;
            }
        }
    }
    return S;
}
The efficiency of the sieve of Eratosthenes stems from the extreme simplicity of its inner loop: it contains no conditional branches and no "heavy" operations such as division or multiplication.
Let us estimate the complexity of the algorithm. The first crossing-out pass takes n/2 operations, the second n/3, the third n/5, and so on. By Mertens' formula,

n/2 + n/3 + n/5 + ... = n · Σ_{p ≤ √n} 1/p ~ n ln ln n,

so the sieve of Eratosthenes requires O(n log log n) operations. The memory consumption is O(n).
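As a sanity check on this estimate, we can count the crossing-out operations of the basic sieve directly and compare them with n ln ln n. A small illustrative sketch (the function sieveOps is ours, not part of the original article):

```javascript
// Run the basic sieve and count how many times a cell is crossed out.
function sieveOps(n) {
  const S = new Uint8Array(n + 1).fill(1);
  let ops = 0;
  for (let k = 2; k * k <= n; k++) {
    if (S[k]) {
      for (let l = k * k; l <= n; l += k) {
        S[l] = 0;
        ops++;
      }
    }
  }
  return ops;
}

// The ratio ops / (n * ln ln n) stays bounded as n grows.
for (const n of [10000, 1000000]) {
  console.log(n, (sieveOps(n) / (n * Math.log(Math.log(n)))).toFixed(2));
}
```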
Optimization and parallelization
The first optimization of the sieve was proposed by Eratosthenes himself: since 2 is the only even prime, let us save half the memory and time by storing and sieving only the odd numbers. Implementing this modification of the algorithm requires only cosmetic changes (code).
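The linked code is not reproduced here, so below is one possible sketch of the odd-only modification; the function name sievePrimesOdd and the index mapping i ↔ 2i+1 are our own choices:

```javascript
// Odd-only sieve of Eratosthenes: cell S[i] represents the odd number 2*i + 1.
function sievePrimesOdd(n) {
  const primes = [];
  if (n >= 2) primes.push(2);          // 2 is handled separately
  const m = (n - 1) >> 1;              // largest index: 2*m + 1 <= n
  const S = new Uint8Array(m + 1).fill(1);
  for (let i = 1; (2 * i + 1) * (2 * i + 1) <= n; i++) {
    if (S[i]) {
      const p = 2 * i + 1;
      // first number to cross out is p*p, at index (p*p - 1) / 2;
      // adding 2p to the number adds p to the index
      for (let j = (p * p - 1) >> 1; j <= m; j += p) {
        S[j] = 0;
      }
    }
  }
  for (let i = 1; i <= m; i++) {
    if (S[i]) primes.push(2 * i + 1);
  }
  return primes;
}
```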
A more advanced optimization (so-called wheel factorization) relies on the fact that all primes except 2, 3, and 5 lie in one of the eight arithmetic progressions 30k+1, 30k+7, 30k+11, 30k+13, 30k+17, 30k+19, 30k+23, and 30k+29. To find all primes up to n, we precompute (again with a sieve) all primes up to √n. Now we build eight sieves, each containing the elements of the corresponding arithmetic progression that are smaller than n, and sieve each of them in a separate thread. That's it, we can reap the benefits: we have not only reduced memory consumption and CPU load (by a factor of four compared to the basic algorithm) but also parallelized the algorithm.
By increasing the wheel size and the number of sieves along with the growth of n (for example, with a wheel of 210 we need 48 sieves, which saves another 4% of resources), it is possible to speed up the algorithm by a factor of log log n.
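The spokes of such a wheel are simply the residues coprime to the wheel size. A quick sketch (function names are ours) reproduces the counts quoted above: eight residues for a wheel of 30 and 48 for a wheel of 210.

```javascript
function gcd(a, b) { return b === 0 ? a : gcd(b, a % b); }

// Residues modulo w that can contain primes larger than the wheel's factors.
function wheelResidues(w) {
  const r = [];
  for (let i = 1; i < w; i++) {
    if (gcd(i, w) === 1) r.push(i);
  }
  return r;
}

console.log(wheelResidues(30));         // [1, 7, 11, 13, 17, 19, 23, 29]
console.log(wheelResidues(210).length); // 48
```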
Segmentation
What do we do if, despite all our tricks, there is not enough RAM and the algorithm "swaps" shamelessly? We can replace one large sieve with a sequence of small sieves and sieve each one separately. As above, we will have to prepare in advance a list of the primes up to √n, which takes O(n^(1/2-ε)) additional memory. We do not need to store the primes found while sieving the segments: we send them straight to the output stream.
The segments should not be made too small, that is, smaller than those same O(n^(1/2-ε)) elements: you gain nothing in the asymptotic memory consumption, while the overhead costs you more and more in performance.
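The scheme just described might be sketched as follows; the segment size of about √n cells, the function names, and the emit callback standing in for the output stream are all our assumptions:

```javascript
// Basic sieve used to prepare the primes up to sqrt(n).
function simpleSieve(n) {
  const S = new Uint8Array(n + 1).fill(1);
  const primes = [];
  for (let k = 2; k <= n; k++) {
    if (S[k]) {
      primes.push(k);
      for (let l = k * k; l <= n; l += k) S[l] = 0;
    }
  }
  return primes;
}

// Sieve [2..n] in segments of about sqrt(n) cells; found primes are
// passed to emit() immediately instead of being stored.
function segmentedSieve(n, emit) {
  const limit = Math.floor(Math.sqrt(n));
  const basePrimes = simpleSieve(limit);
  for (const p of basePrimes) emit(p);
  const segSize = Math.max(limit, 2);
  for (let low = limit + 1; low <= n; low += segSize) {
    const high = Math.min(low + segSize - 1, n);
    const S = new Uint8Array(high - low + 1).fill(1);
    for (const p of basePrimes) {
      // cross out multiples of p, starting from the first one in [low, high]
      for (let m = Math.ceil(low / p) * p; m <= high; m += p) S[m - low] = 0;
    }
    for (let i = 0; i < S.length; i++) {
      if (S[i]) emit(low + i);
    }
  }
}
```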
Eratosthenes sieve and single line
Habrahabr previously published a large collection of one-line sieve-of-Eratosthenes implementations in different programming languages (one-liners #10). Interestingly, none of them is actually the sieve of Eratosthenes; they implement much slower algorithms.
The point is that filtering a set by a condition (for example,

primes = primes.select { |x| x == prime || x % prime != 0 }

in Ruby) or using list comprehensions (for example, let pgen (p:xs) = p : pgen [x | x <- xs, x `mod` p > 0]

in Haskell) invokes exactly what the sieve algorithm is designed to avoid: an element-by-element divisibility check. As a result, the complexity grows to at least π(√n) (the number of filtering passes) times π(n) (a lower bound on the size of the filtered set), where π(n) is the number of primes not exceeding n, that is, to O(n^(3/2-ε)) operations. The one-liner
(n: Int) => (2 to n) |> (r => r.foldLeft(r.toSet)((ps, x) => if (ps(x)) ps -- (x * x to n by x) else ps))
in Scala is closer to the algorithm of Eratosthenes in that it avoids divisibility checks. However, constructing the difference of two sets takes time proportional to the size of the larger one, so the result is the same O(n^(3/2-ε)) operations. In general, the sieve of Eratosthenes is hard to implement efficiently within the functional paradigm of immutable variables. If the functional language allows it (as, for example, OCaml does), it is worth breaking the rules and using a mutable array. In [3] it is discussed how to implement the sieve of Eratosthenes properly in Haskell using the technique of lazy crossing-out.
Sieve of Eratosthenes and PHP
Let us write the algorithm of Eratosthenes in PHP. The result looks something like this:
define("LIMIT", 1000000);
define("SQRT_LIMIT", floor(sqrt(LIMIT)));

$S = array_fill(2, LIMIT-1, true);

for($i = 2; $i <= SQRT_LIMIT; $i++){
    if($S[$i] === true){
        for($j = $i*$i; $j <= LIMIT; $j += $i){
            $S[$j] = false;
        }
    }
}
There are two problems here. The first is related to the type system and is typical of many modern languages. The sieve of Eratosthenes is implemented most efficiently in languages where one can declare a homogeneous array laid out sequentially in memory: then computing the address of the cell S[i] takes only a couple of processor instructions. An array in PHP is actually a heterogeneous dictionary, that is, it is indexed by arbitrary strings or numbers and can hold data of different types. In that case, finding S[i] becomes a considerably more laborious task.
The second problem: PHP arrays are terrible in terms of memory overhead. On my 64-bit system, each element of $S from the code above eats 128 bytes. As discussed above, it is not necessary to keep the whole sieve in memory, since it can be processed in portions, but such costs must still be considered unacceptable.
To solve both problems it is enough to choose a more suitable data type: a string!
define("LIMIT", 1000000);
define("SQRT_LIMIT", floor(sqrt(LIMIT)));

$S = str_repeat("\1", LIMIT+1);

for($i = 2; $i <= SQRT_LIMIT; $i++){
    if($S[$i] === "\1"){
        for($j = $i*$i; $j <= LIMIT; $j += $i){
            $S[$j] = "\0";
        }
    }
}
Now each element occupies exactly one byte, and the running time has decreased by roughly a factor of three (script for measuring speed).
Atkin sieve
In 1999, Atkin and Bernstein proposed a new method of sieving out composite numbers, the sieve of Atkin. It is based on the following theorem.
Theorem. Let n be a positive integer not divisible by any perfect square. Then:
- if n is representable in the form 4k+1, then it is prime if and only if the number of solutions in positive integers of the equation 4x² + y² = n is odd;
- if n is representable in the form 6k+1, then it is prime if and only if the number of solutions in positive integers of the equation 3x² + y² = n is odd;
- if n is representable in the form 12k−1, then it is prime if and only if the number of solutions in positive integers of the equation 3x² − y² = n with x > y is odd.
From elementary number theory it follows that all primes greater than 3 have the form 12k+1 (case 1), 12k+5 (again case 1), 12k+7 (case 2), or 12k+11 (case 3).
To initialize the algorithm, fill the sieve S with zeros. Now, for each pair (x, y) with 1 ≤ x, y ≤ √n, increment the values in the cells S[4x² + y²] and S[3x² + y²], and also, if x > y, in S[3x² − y²] (ignoring indices that exceed n). At the end of the computation, the indices of the form 6k ± 1 whose cells contain odd numbers are either prime or divisible by the square of a prime.
As a final step, we go through the candidate primes in increasing order and cross out the multiples of their squares.
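Putting the steps together, here is a sketch of the simplified sieve of Atkin in the spirit of the Wikipedia version: it makes the mod-12 conditions of the theorem explicit and toggles a parity bit instead of incrementing counters. It is an illustration, not the optimized primegen implementation.

```javascript
function atkin(n) {
  const S = new Uint8Array(n + 1); // S[k] = 1 means "candidate prime"
  const sqrtN = Math.ceil(Math.sqrt(n));
  for (let x = 1; x <= sqrtN; x++) {
    for (let y = 1; y <= sqrtN; y++) {
      // case 1: n = 4k+1, quadratic form 4x^2 + y^2
      let m = 4 * x * x + y * y;
      if (m <= n && (m % 12 === 1 || m % 12 === 5)) S[m] ^= 1;
      // case 2: n = 6k+1, quadratic form 3x^2 + y^2
      m = 3 * x * x + y * y;
      if (m <= n && m % 12 === 7) S[m] ^= 1;
      // case 3: n = 12k-1, quadratic form 3x^2 - y^2 with x > y
      m = 3 * x * x - y * y;
      if (x > y && m <= n && m % 12 === 11) S[m] ^= 1;
    }
  }
  // final step: cross out multiples of squares of the candidates
  for (let k = 5; k * k <= n; k++) {
    if (S[k]) {
      for (let m = k * k; m <= n; m += k * k) S[m] = 0;
    }
  }
  const primes = [];
  if (n >= 2) primes.push(2);
  if (n >= 3) primes.push(3);
  for (let k = 5; k <= n; k++) {
    if (S[k]) primes.push(k);
  }
  return primes;
}
```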
It is clear from this description that the complexity of the sieve of Atkin is proportional to n, rather than to n log log n as in the algorithm of Eratosthenes.
The authors' optimized implementation in C is available as primegen; a simplified version is on Wikipedia. A C# implementation of the sieve of Atkin has been published on Habrahabr.
As with the sieve of Eratosthenes, wheel factorization and segmentation reduce the asymptotic complexity by a factor of log log n, and the memory consumption to O(n^(1/2 + o(1))).
About the logarithm of the logarithm
In reality, the factor log log n grows extremely slowly. For example, log log 10^10000 ≈ 10. Therefore, from a practical point of view it can be considered constant, and the complexity of the algorithm of Eratosthenes regarded as effectively linear. Unless the search for primes is a key function of your project, you can use the basic version of the sieve of Eratosthenes (perhaps saving on the even numbers) and not feel bad about it. However, when searching for primes over large intervals (from 2^32), the game is worth the candle: the optimizations and the sieve of Atkin can noticeably improve performance.
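The estimate log log 10^10000 ≈ 10 (with natural logarithms) is easy to verify without forming the huge number itself, since ln 10^k = k · ln 10. A throwaway check:

```javascript
// ln ln (10^10000) = ln(10000 * ln 10), so no big numbers are needed
const k = 10000;
const lnln = Math.log(k * Math.log(10));
console.log(lnln.toFixed(2)); // ≈ 10.04
```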
P.S. The comments reminded me of the sieve of Sundaram. Unfortunately, it is no more than a mathematical curiosity: it always loses either to the sieves of Eratosthenes and Atkin or to plain trial division.
Literature
[1] Agrawal M., Kayal N., Saxena N. PRIMES is in P. Annals of Mathematics 160 (2), 2004, pp. 781–793.
[2] Vasilenko O. N. Number-Theoretic Algorithms in Cryptography. Moscow: MCCME, 2003. 328 pp.
[3] O'Neill M. E. The Genuine Sieve of Eratosthenes. 2008.
[4] Atkin A. O. L., Bernstein D. J. Prime sieves using binary quadratic forms. Mathematics of Computation 73 (246), 2003, pp. 1023–1030.