Why PostgreSQL is better than other open source SQL databases. Part 1
- Transfer
Today let's talk about the benefits of Postgres over other open source systems. We will definitely reveal this topic in more detail on PG Day'16 Russia, which is only two months away.
Perhaps you are asking yourself: “Why PostgreSQL?” After all, there are other options for open source relational databases (in this article we looked at MySQL, MariaDB and Firebird), so what can Postgres offer that they don’t have? The PostgreSQL slogan claims to be "The Most Advanced Open Source Database in the World." We will give several reasons why Postgres makes such statements.
In the first part of this series we will talk about data storage - the model, structure, types and size limits. And in the second part focus more on sampling and data manipulation.
PostgreSQL is not just relational, but an object-relational DBMS. This gives it some advantages over other open source SQL databases such as MySQL, MariaDB, and Firebird.
The fundamental characteristic of an object-relational database is support for user objects and their behavior, including data types, functions, operations, domains, and indexes. This makes Postgres incredibly flexible and reliable. Among other things, it can create, store, and retrieve complex data structures. In some examples below, you will see nested and composite structures that are not supported by standard RDBMSs.
There is an extensive list of data types that Postgres supports. In addition to numeric, floating point, text, boolean and other expected data types (as well as many variations of them), PostgreSQL boasts support for uuid, money, enumerated, geometric, binary types, network addresses, bit strings, text search, xml, json , arrays, composite types and ranges, as well as some internal types for identifying objects and the location of the logs. In fairness it is worth saying that MySQL, MariaDB and Firebird also have some of these data types, but only Postgres supports them all.
Let's take a closer look at some of them:
PostgreSQL provides storage for various types of network addresses. The CIDR data type (Classless Internet Domain Routing) follows the convention for IPv4 and IPv6 network addresses. Here are some examples:
Also available for storing network addresses is the INET data type used for IPv4 and IPv6 hosts, where subnets are optional. The MACADDR data type can be used to store MAC addresses for equipment identification, such as 08-00-2b-01-02-03.
MySQL and MariaDB also have INET functions for converting network addresses, but they do not provide data types for internal storage of network addresses. Firebird also has no types for storing network addresses.
Because Postgres is an object-relational database, arrays of values can be stored for most existing data types. This can be done by adding square brackets to the data type specification for the column or using the ARRAY expression. The size of the array can be specified, but this is optional. Let's look at the holiday picnic menu to demonstrate the use of arrays:
MySQL, MariaDB, and Firebird don't do that. To store such arrays of values in traditional relational databases, you have to use a workaround and create a separate table with rows for each of the values of the array.
Location data is quickly becoming a major requirement for many applications. PostgreSQL has long supported many geometric data types, such as points, lines, circles, and polygons. One of these types is PATH, it consists of many consecutive points and can be open (start and end points are not connected) or closed (start and end points are connected). Let's take an example of a hiking trail. In this case, the hiking trail is a loop, so the start and end points are connected, and therefore my path is closed. Parentheses around a set of coordinates indicate a closed path, and square brackets indicate an open one.
The PostGIS extension for PostgreSQL complements existing geometric data properties with auxiliary spatial types, functions, operators, and indexes. It provides location support and supports both raster and vector data. It also provides compatibility with many third-party geospatial tools (copyrighted and open source) for displaying, rendering, and working with data.
Note that in MySQL 5.7.8 and MariaDB, starting with version 5.3.3, data type extensions have been added to support the OpenGIS geographic information standard. This version of MySQL and subsequent versions of MariaDB offer data type storage similar to Postgres regular geodata. However, in MySQL and MariaDB, data values must first be converted to the geometric format with simple commands before being inserted into the table. Firebird does not currently support geometric data types.
PostgreSQL's JSON support allows you to move on to storing schema-less data in an SQL database. This can be useful when the data structure requires some flexibility: for example, if during the development process the structure is still changing or it is not known which fields the data object will contain.
The JSON data type provides JSON validation, which allows you to use specialized JSON operators and functions built into Postgres to execute queries and manipulate data. The JSONB type is also available - a binary variant of the JSON format, in which spaces are removed, the sorting of objects is not preserved, instead they are stored in the most optimal way, and only the last value for duplicate keys is stored. JSONB is usually the preferred format because it requires less space for objects, can be indexed and processed faster, since it does not require repeated parsing.
In MySQL 5.7.8 and MariaDB 10.0.1, support for built-in JSON objects has been added. But, although there are many functions and operators for JSON that are now available in these databases, they are not indexed like JSONB in PostgreSQL. Firebird has not yet joined the trend and only supports JSON objects as text.
If it suddenly happens that you do not find an extensive list of Postgres data types, you can use the CREATE TYPE command to create new data types, such as composite, enumerated, range, and base. Consider the example of creating and sending requests for a new composite type:
Because they are not object-relational, MySQL, MariaDB, and Firebird do not provide such powerful functionality.
PostgreSQL can handle a lot of data. Current published restrictions are listed below:
In Compose [approx. transl .: organization in which the author of the original article works] we will automatically scale your installation so that you do not have to worry about the increase in the amount of data. But, as any database administrator knows, you should be wary of too much and unlimited possibilities. We advise you to use common sense when creating tables and adding indexes.
By comparison, MySQL and MariaDB are notorious for limiting row sizes to 65,535 bytes. Firebird also offers as little as 64Kb as the maximum line size. Typically, the amount of data is limited by the maximum file size of the operating system. Since PostgreSQL can store tabular data in many smaller files, it can circumvent this limitation. But it is worth noting that too many files can adversely affect performance. MySQL and MariaDB support a larger number of columns in a table (up to 4.096 depending on the data type) and larger individual table sizes than PostgreSQL, but the need to exceed existing Postgres restrictions arises only in extremely rare cases.
Postgres strives to comply with the ANSI-SQL: 2008 standard, meets ACID requirements (atomicity, consistency, isolation, and reliability) and is known for its referential and transactional integrity. Primary keys, limiting and cascading foreign keys, unique constraints, NOT NULL constraints, validation constraints, and other data integrity functions ensure that only valid data is saved.
MySQL and MariaDB are more committed to meeting the SQL standard with InnoDB / XtraDB table engines. Now they offer the STRICT option using SQL modes, which sets the validation of the data used. Despite this, depending on what mode you use, false or even truncated data without your knowledge may be inserted or created during the update. None of these databases currently support CHECK restrictions. In addition, they have many features with regard to referential integrity restrictions on foreign keys. In addition to the above, data integrity can be significantly affected depending on the storage engine selected. MySQL (and fork MariaDB) make no secret of exchanging integrity and compliance with standards for speed and efficiency.
Postgres has many possibilities. Created using an object-relational model, it supports complex structures and a wide range of built-in and user-defined data types. It provides enhanced data capacity and has earned trust with a respectful attitude towards data integrity. You may not need all of the advanced storage features that we explored in this article, but because your needs can grow quickly, there is a definite advantage to having it all at hand.
If PostgreSQL doesn’t seem to fit your needs, or you prefer to shoot from the hip, then you should pay attention to the NoSQL databases that we offer in Compose or think about the other SQL databases that we mentioned. Each of them has its own advantages. Compose is firmly convinced that it is very important to choose the right database for a specific task ... sometimes this means that you need to select multiple databases!
Want more Postgres? In the second part of this series, we look at data manipulation and search in PostgreSQL, including virtual table functions, query capabilities, indexing, and language extensions.
Perhaps you are asking yourself: “Why PostgreSQL?” After all, there are other options for open source relational databases (in this article we looked at MySQL, MariaDB and Firebird), so what can Postgres offer that they don’t have? The PostgreSQL slogan claims to be "The Most Advanced Open Source Database in the World." We will give several reasons why Postgres makes such statements.
In the first part of this series we will talk about data storage - the model, structure, types and size limits. And in the second part focus more on sampling and data manipulation.
Data model
PostgreSQL is not just relational, but an object-relational DBMS. This gives it some advantages over other open source SQL databases such as MySQL, MariaDB, and Firebird.
The fundamental characteristic of an object-relational database is support for user objects and their behavior, including data types, functions, operations, domains, and indexes. This makes Postgres incredibly flexible and reliable. Among other things, it can create, store, and retrieve complex data structures. In some examples below, you will see nested and composite structures that are not supported by standard RDBMSs.
Structures and data types
There is an extensive list of data types that Postgres supports. In addition to numeric, floating point, text, boolean and other expected data types (as well as many variations of them), PostgreSQL boasts support for uuid, money, enumerated, geometric, binary types, network addresses, bit strings, text search, xml, json , arrays, composite types and ranges, as well as some internal types for identifying objects and the location of the logs. In fairness it is worth saying that MySQL, MariaDB and Firebird also have some of these data types, but only Postgres supports them all.
Let's take a closer look at some of them:
Network addresses
PostgreSQL provides storage for various types of network addresses. The CIDR data type (Classless Internet Domain Routing) follows the convention for IPv4 and IPv6 network addresses. Here are some examples:
- 192.168.100.128/25
- 10.1.2.3/32
- 2001: 4f8: 3: ba: 2e0: 81ff: fe22: d1f1 / 128
- :: ffff: 1.2.3.0/128
Also available for storing network addresses is the INET data type used for IPv4 and IPv6 hosts, where subnets are optional. The MACADDR data type can be used to store MAC addresses for equipment identification, such as 08-00-2b-01-02-03.
MySQL and MariaDB also have INET functions for converting network addresses, but they do not provide data types for internal storage of network addresses. Firebird also has no types for storing network addresses.
Multidimensional Arrays
Because Postgres is an object-relational database, arrays of values can be stored for most existing data types. This can be done by adding square brackets to the data type specification for the column or using the ARRAY expression. The size of the array can be specified, but this is optional. Let's look at the holiday picnic menu to demonstrate the use of arrays:
-- создаем таблицу, у которой значения являются массивами
CREATE TABLE holiday_picnic (
holiday varchar(50) -- строковое значение
sandwich text[], -- массив
side text[] [], -- многомерный массив
dessert text ARRAY, -- массив
beverage text ARRAY[4] -- массив из 4-х элементов
);
-- вставляем значения массивов в таблицу
INSERT INTO holiday_picnic VALUES
('Labor Day',
'{"roast beef","veggie","turkey"}',
'{
{"potato salad","green salad","macaroni salad"},
{"chips","crackers"}
}',
'{"fruit cocktail","berry pie","ice cream"}',
'{"soda","juice","beer","water"}'
);
MySQL, MariaDB, and Firebird don't do that. To store such arrays of values in traditional relational databases, you have to use a workaround and create a separate table with rows for each of the values of the array.
Geometric data
Location data is quickly becoming a major requirement for many applications. PostgreSQL has long supported many geometric data types, such as points, lines, circles, and polygons. One of these types is PATH, it consists of many consecutive points and can be open (start and end points are not connected) or closed (start and end points are connected). Let's take an example of a hiking trail. In this case, the hiking trail is a loop, so the start and end points are connected, and therefore my path is closed. Parentheses around a set of coordinates indicate a closed path, and square brackets indicate an open one.
-- создаем таблицу для хранения троп
CREATE TABLE trails (
trail_name varchar(250),
trail_path path
);
-- вставляем тропу в таблицу,
-- для которой маршрут определяется координатами в формате широта-долгота
INSERT INTO trails VALUES
('Dool Trail - Creeping Forest Trail Loop',
((37.172,-122.22261666667),
(37.171616666667,-122.22385),
(37.1735,-122.2236),
(37.175416666667,-122.223),
(37.1758,-122.22378333333),
(37.179466666667,-122.22866666667),
(37.18395,-122.22675),
(37.180783333333,-122.22466666667),
(37.176116666667,-122.2222),
(37.1753,-122.22293333333),
(37.173116666667,-122.22281666667)));
The PostGIS extension for PostgreSQL complements existing geometric data properties with auxiliary spatial types, functions, operators, and indexes. It provides location support and supports both raster and vector data. It also provides compatibility with many third-party geospatial tools (copyrighted and open source) for displaying, rendering, and working with data.
Note that in MySQL 5.7.8 and MariaDB, starting with version 5.3.3, data type extensions have been added to support the OpenGIS geographic information standard. This version of MySQL and subsequent versions of MariaDB offer data type storage similar to Postgres regular geodata. However, in MySQL and MariaDB, data values must first be converted to the geometric format with simple commands before being inserted into the table. Firebird does not currently support geometric data types.
JSON Support
PostgreSQL's JSON support allows you to move on to storing schema-less data in an SQL database. This can be useful when the data structure requires some flexibility: for example, if during the development process the structure is still changing or it is not known which fields the data object will contain.
The JSON data type provides JSON validation, which allows you to use specialized JSON operators and functions built into Postgres to execute queries and manipulate data. The JSONB type is also available - a binary variant of the JSON format, in which spaces are removed, the sorting of objects is not preserved, instead they are stored in the most optimal way, and only the last value for duplicate keys is stored. JSONB is usually the preferred format because it requires less space for objects, can be indexed and processed faster, since it does not require repeated parsing.
In MySQL 5.7.8 and MariaDB 10.0.1, support for built-in JSON objects has been added. But, although there are many functions and operators for JSON that are now available in these databases, they are not indexed like JSONB in PostgreSQL. Firebird has not yet joined the trend and only supports JSON objects as text.
Create a new type
If it suddenly happens that you do not find an extensive list of Postgres data types, you can use the CREATE TYPE command to create new data types, such as composite, enumerated, range, and base. Consider the example of creating and sending requests for a new composite type:
-- создаем новый составной тип "wine"
CREATE TYPE wine AS (
wine_vineyard varchar(50),
wine_type varchar(50),
wine_year int
);
-- создаем таблицу, которая использует составной тип "wine"
CREATE TABLE pairings (
menu_entree varchar(50),
wine_pairing wine
);
-- вставляем данные в таблицу при помощи выражения ROW
INSERT INTO pairings VALUES
('Lobster Tail',ROW('Stag''s Leap','Chardonnay', 2012)),
('Elk Medallions',ROW('Rombauer','Cabernet Sauvignon',2012));
/*
выборка из таблицы с использованием имени колонки
(используйте скобки, отделяемые точкой от имени поля
в составном типе)
*/
SELECT (wine_pairing).wine_vineyard, (wine_pairing).wine_type
FROM pairings
WHERE menu_entree = 'Elk Medallions';
Because they are not object-relational, MySQL, MariaDB, and Firebird do not provide such powerful functionality.
Data sizes
PostgreSQL can handle a lot of data. Current published restrictions are listed below:
Maximum database size | Is not limited |
Maximum table size | 32 TB |
Maximum row size | 1.6 TB |
Maximum field size | 1 GB |
The maximum number of rows in the table | Not limited |
The maximum number of columns in a table | 250-1600 depending on the type of column |
The maximum number of indexes in the table | Not limited |
In Compose [approx. transl .: organization in which the author of the original article works] we will automatically scale your installation so that you do not have to worry about the increase in the amount of data. But, as any database administrator knows, you should be wary of too much and unlimited possibilities. We advise you to use common sense when creating tables and adding indexes.
By comparison, MySQL and MariaDB are notorious for limiting row sizes to 65,535 bytes. Firebird also offers as little as 64Kb as the maximum line size. Typically, the amount of data is limited by the maximum file size of the operating system. Since PostgreSQL can store tabular data in many smaller files, it can circumvent this limitation. But it is worth noting that too many files can adversely affect performance. MySQL and MariaDB support a larger number of columns in a table (up to 4.096 depending on the data type) and larger individual table sizes than PostgreSQL, but the need to exceed existing Postgres restrictions arises only in extremely rare cases.
Data integrity
Postgres strives to comply with the ANSI-SQL: 2008 standard, meets ACID requirements (atomicity, consistency, isolation, and reliability) and is known for its referential and transactional integrity. Primary keys, limiting and cascading foreign keys, unique constraints, NOT NULL constraints, validation constraints, and other data integrity functions ensure that only valid data is saved.
MySQL and MariaDB are more committed to meeting the SQL standard with InnoDB / XtraDB table engines. Now they offer the STRICT option using SQL modes, which sets the validation of the data used. Despite this, depending on what mode you use, false or even truncated data without your knowledge may be inserted or created during the update. None of these databases currently support CHECK restrictions. In addition, they have many features with regard to referential integrity restrictions on foreign keys. In addition to the above, data integrity can be significantly affected depending on the storage engine selected. MySQL (and fork MariaDB) make no secret of exchanging integrity and compliance with standards for speed and efficiency.
To summarize
Postgres has many possibilities. Created using an object-relational model, it supports complex structures and a wide range of built-in and user-defined data types. It provides enhanced data capacity and has earned trust with a respectful attitude towards data integrity. You may not need all of the advanced storage features that we explored in this article, but because your needs can grow quickly, there is a definite advantage to having it all at hand.
If PostgreSQL doesn’t seem to fit your needs, or you prefer to shoot from the hip, then you should pay attention to the NoSQL databases that we offer in Compose or think about the other SQL databases that we mentioned. Each of them has its own advantages. Compose is firmly convinced that it is very important to choose the right database for a specific task ... sometimes this means that you need to select multiple databases!
Want more Postgres? In the second part of this series, we look at data manipulation and search in PostgreSQL, including virtual table functions, query capabilities, indexing, and language extensions.