Algorithm for finding N first prime numbers

In the process of teaching programming, I came across the task of finding 1M first primes.

On Habré already more than once wrote about it. But, most of all, it was about finding all prime numbers up to the number of N . I’ll talk about how I solved the problem of finding the N first prime numbers.

At first glance, it might seem that there is practically no difference in these tasks. And this is really so, until we understand that enumerating the divisors does an extremely poor job.

As a result of the work done, it takes me only 0.262 seconds to find 1 million primes, which, of course, is not the limit, but it is already impressive. The algorithm is implemented in C ++.

Like the author of a previous article on prime numbers, I began to solve the problem head-on. So, the prime number p is that it has two divisors - one and p . The first thing that comes to mind is to check each number by dividing by all previous numbers.

The idea turned out to be bad. I realized this after the first 30 minutes of the program in search of a million simple ones. The results were as follows:

10,000 - 2,349 s.
100,000 - 293.552 s
1,000,000 - did not wait

As in the article already mentioned above, I set about optimizing the algorithm. Firstly, it does not make sense to check even numbers, because 2 is the only prime even number. The same with dividers - you can not check for even ones. Thirdly, as divisors it’s enough to take numbers not exceeding half the number checked. Here's what came of it: It got better, but I still could not wait for a million 10,000 - 0.589 s. 100,000 - 72.662 s 1,000,000 - after 20 minutes I stopped waiting

// Listing 1

 

#include 

#include 

#include 

 

using namespace std;

 

int main()

{

    clock_t start = clock();

 

    const int AIM = 10000;

 

    vector prime_nums;

    prime_nums.push_back(2);

 

    for (int num = 3; prime_nums.size() < AIM; num += 2)

    {

        bool isprime = true;

        for (int i = 3; i <= num / 2; i += 2)

        {

            if (num % i == 0)

            {

                isprime = false;

                break;

            }

        }

 

        if (isprime)

            prime_nums.push_back(num);

    }

 

    cout << prime_nums[AIM - 1];

 

    clock_t end = clock();

    cout << "\n----------------" << endl;

    cout << "Time: " << double(end - start) / CLOCKS_PER_SEC << " sec" << endl;

}







It is clear that you can continue to optimize this algorithm for a long time, which is what I basically did. You can not check the numbers ending in 5 and 0. You can make a list of the most common divisors and check each number first on them. At some point, it seemed to me that I could solve the problem by checking new numbers only for primes that were already found. It did not bring significant results. The problem was that with each new number, the number of checks increased, and the division operation is one of the most difficult.

I went in search. I learned about the sieve of Eratosthenes . But I cannot use the algorithm as it is for my task. It’s not known in advance * how many numbers one needs to sift to find Nsimple. Therefore, I decided to modify the algorithm to fit my needs as follows.

Sifting in advance a large set of natural numbers is not profitable. So you need to sift some sets of natural numbers until you reach the desired result. Moreover, each new set must first be sieved for already found primes. Here is the code obtained after a series of optimizations: The results were even slightly surprised: 10,000 - 0.001 s. 100,000 - 0.021 s. 1,000,000 - 0,262 s. ... 10,000,000 - 2,999 s. On this, with the task of finding 1M first primes, I ended up. The result completely satisfied me. But what else can be done? You can implement any of the known algorithms based on the sieve of Eratosthenes, for example, the Sundaram sieve

// Listing 2

 

#include 

#include 

#include 

 

using namespace std;

 

int main()

{

    clock_t start = clock();

 

    const int AIM = 1000000;

 

    int startSize = AIM; // стартовый размер массива натуральных чисел

    int addSize = AIM; // размер дополнительного массива натуральных чисел

 

    vector nums(startSize);

    vector primeNums(AIM);

 

    int foundPrimes = 0;

 

    for (int i = 2; i < startSize; i++)

        nums[i] = true;

 

    bool addition = false;

    int adder = 0;

    while (true)

    {

        if (addition)

        {

            nums.resize(nums.size() + addSize, true);

 

            // исключим составные числа простыми из предыдущих итераций

            for (int i = 0; i < foundPrimes; i++)

            {

                int cur_num = primeNums[i];

                if ((addSize + ((nums.size() - addSize) % cur_num)) < cur_num)

                    continue;

                for (int j = ((nums.size() - addSize) / cur_num) * cur_num; j < nums.size(); j += cur_num)

                    nums[j] = false;

            }

        }

        else

            addition = true;

 

 

        int iter;

        if (foundPrimes == 0)

            iter = 2;

        else

            iter = primeNums[foundPrimes - 1] + 2;

 

        for ( ; iter < nums.size(); iter++)

        {

            // выбираем очередное простое число

            if (nums[iter])

            {

                primeNums[foundPrimes] = iter;

                foundPrimes++;

                if (foundPrimes == AIM)

                    break;

                // просеиваем

                for (int j = iter + iter; j < nums.size(); j += iter)

                    nums[j] = false;

            }

            else

                continue;

 

        }

        if (foundPrimes == AIM)

            break;

    }

 

    cout << "Last prime: " << primeNums[AIM -1];

 

    clock_t end = clock();

    cout << "\n----------------" << endl;

    cout << "Time: " << double(end - start) / CLOCKS_PER_SEC << " sec" << endl;

    return 0;

}









. But the results are unlikely to change significantly. Perhaps the Atkin sieve will be able to show itself better - it is this algorithm that is most often used in modern cryptography.

* - unknown to me. halyavin said in the first commentary what has been known for a long time.

UPD
Regarding memory consumption, the described algorithms behave as follows:

1 million prime numbers, maximum RAM
- enumeration of dividers: ~ 4MB
- modified sieve of Eratosthenes: ~ 6MB

10 million prime numbers, maximum RAM
- enumeration of dividers: ~ 39MB
- modified sieve Eratosthenes: ~ 61Mb

Also popular now: