Prime Number Database

Just now carried away by prime numbers. Their secret attracts me.

Wrote an algorithm similar to the sieve of Eratosthenes. For 3 hours, the program found 700 thousand first primes. And I need at least 14 million primes to multiply them to get a number with the number of decimal digits equal to 100 million pieces.

From the article “Once again about the search for prime numbers” written by a user of Bodigrim , I learned about the existence of a quick program primegen , which works using the Atkin sieve . Installed it in a LUbuntu virtual machine (VirtualBox). Indeed, primegen is very fast!

Then the question arose, how to save 14 million primes? You can simply write each prime to a file as int32. And if a prime number is more than 32-bit power?

It occurred to me to write not the numbers themselves, but the distances between them into a file. And the distance between adjacent primes should always be small, suggested that it fits in one byte.

It remains to find out the maximum possible distance for a certain range of numbers. Since the difference between primes is always an even number (except for the distance between 2 and 3), we divide the distance by 2.

In the primegen program in the source file primes.c, instead of displaying the number on the screen, I implemented an algorithm for calculating statistics on the number of distances between numbers:

 int RastCount_Index = 0;
 int RastCount[1000];
 for(i=0;i < 1000; i++) RastCount[i] = 0;
 for (;;) {
  u = primegen_next(&pg) - p;
  p += u;
  if (p > high) break;
  for (i = 0;u;++i)
   {
    u += digits[i];
    if (u >= 200)
     {
      digits[i] = u % 10;
      u = u / 10;
     }
    else
     {
      digits[i] = mod10[u];
      u = div10[u];
     }
   }
  if (i > len) len = i;
  int LetsRast, index;
  LetsRast = 0;  index = 0;
  char r[40], r_old[40];
  for (i = 0;i < 40; i++) { r[i] = 0; r_old[i] = 0; }
  for (i = len - 1;i >= 0;--i)
   {
    if (! LetsRast)
    if (digits_old[i] != digits[i]) LetsRast = 1;
    if (LetsRast)
     {
      r[index] = '0' + digits[i];
      r_old[index] = '0' + digits_old[i];
      index++;
     }
   }
  int ri, ri_old, Rast;
  ri = atoi(r);
  ri_old = atoi(r_old);
  Rast = (ri - ri_old) >> 1;
  RastCount[Rast]++;
  if (Rast > RastCount_Index) RastCount_Index = Rast;
  for (i = len-1;i >= 0; i--)
   digits_old[i] = digits[i];
 }
 for(i = 0; i <= RastCount_Index; i++)
  printf("%i = %i\n", i, RastCount[i]);


Performed in the terminal:

./primes 1 1000000000

After 10 seconds, a list is displayed:
0 = 1 (distance between numbers 2 and 3)
1 = 3424507
...
141 = 1

Thus, 141 is the maximum possible prime distance of up to 1 billion.

Wrote the code to write to the file:

FILE* fd;
fd = fopen("primes.bin", "w+");
unsigned char b1[1];
b1[0] = Rast;
fwrite(b1,1,1,fd);
fclose(fd);

Launched up to 300 million:

./primes 1 300000000

In the file primes.bin received 16 million first primes. Compressed by the archiver 7-Zip, the file shrunk to 9 MB.

PS There is an idea how to compress the database of primes even more strongly. It is necessary to divide the prime numbers into 4 groups according to the last decimal digit: 1, 3, 7, 9. Divide the distance between the numbers by 10. Also create 4 files. At the same time, it is possible that less than 8 bits can be allocated for distance values, but less, then it will be necessary to implement the formation of a byte buffer from, for example, 5-bit distances.

Although in our age of Gigabyte capacities, this is not very important.

Also popular now: