A billion tables?!

Original author: Josh Berkus
  • Translation
Thanks to Selena Deckelmann's presentation at PgCon, some of us got involved in a discussion on the topic "How many tables can PostgreSQL theoretically handle?". In haste, a script was written with the bold hope of creating one billion tables of the following form:

CREATE TABLE tab_### (
   id SERIAL NOT NULL PRIMARY KEY,
   value TEXT NOT NULL
);


It should be noted that this construction will also create, among other things, a billion sequences, a billion indexes, a billion constraints, and two billion columns.

The Perl script was run on GoGrid cloud hosting in 4 parallel processes. It worked quite tolerably, producing about 300,000 tables per hour, until the disk space ran out.
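The article does not show the Perl script itself. Below is a minimal single-session sketch of the same idea as a PL/pgSQL DO block (my own assumption, not the author's script); it creates tables of the form shown above in a loop:

-- Minimal sketch, assuming default settings; NOT the original parallel Perl script.
-- Creates n_tables tables named tab_1 .. tab_N of the form shown above.
DO $$
DECLARE
    n_tables integer := 100000;  -- hypothetical count; 10^9 will not fit on most disks
BEGIN
    FOR i IN 1..n_tables LOOP
        EXECUTE format(
            'CREATE TABLE tab_%s (
                id SERIAL NOT NULL PRIMARY KEY,
                value TEXT NOT NULL
            )', i);
    END LOOP;
END $$;

Note that a DO block runs as a single transaction; to mimic the original setup, several such loops would be started from separate connections, much as the author ran four parallel processes.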

Given that 100,000 empty tables take up almost 2 GB of disk space, disaster struck the server after almost 3 million tables had been created. So badly, in fact, that PostgreSQL would only start with fsync disabled. Who would have thought...

fsync=off



So, dear friend, if you have a dedicated server with a dozen million free gigabytes, try it yourself: create a billion tables.

Translator's note: sequences are created automatically for SERIAL columns, and indexes for PRIMARY KEY constraints. But that is not all: since the second column is of type TEXT, the server will also create a TOAST table for each table, along with a unique index on it.
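As an illustration (assuming a table tab_1 created as above, with PostgreSQL's default object naming, so tab_1_id_seq and tab_1_pkey), the relations behind a single table can be listed straight from the system catalog:

-- Lists the heap, the SERIAL sequence, the PK index, the TOAST table and its index
-- for one table created as above. tab_1 is a hypothetical example name.
SELECT relname, relkind, pg_relation_size(oid) AS bytes
FROM pg_class
WHERE relname IN ('tab_1', 'tab_1_id_seq', 'tab_1_pkey')
   OR oid = (SELECT reltoastrelid FROM pg_class WHERE relname = 'tab_1')
   OR oid = (SELECT indexrelid FROM pg_index
             WHERE indrelid = (SELECT reltoastrelid FROM pg_class WHERE relname = 'tab_1'));

For an empty table this reports 0 bytes for the heap and the TOAST table and 8 KB each for the sequence and the two indexes, which is where the 24 KB per group in the calculation below comes from.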

In the comments to the original article, the following calculation was given. Each table actually entails the creation of 5 objects: the table itself, the sequence, the primary-key index, the TOAST table, and the index on the TOAST table. Each such group takes 24 KB on disk (3 x 8 KB + 2 x 0 KB). Even ignoring the size of the system catalog, that many objects would occupy about 22.3 TB.
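The same estimate can be reproduced directly in psql; this is plain arithmetic over 8 KB pages, nothing specific to the author's setup:

-- 10^9 groups x (3 objects of one 8 KB page + 2 empty objects) = 24 KB per group
-- pg_size_pretty reports roughly 22 TB (more precisely about 22.35 TB).
SELECT pg_size_pretty(1000000000::bigint * (3 * 8192 + 2 * 0));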

If we add the system catalog itself, the total grows by another 12 TB, to roughly 34 TB.
