Millions of Queries per Second: A Peaceful Battle Between PostgreSQL and MySQL with Today's Workload Requirements

Original authors: Anastasia Raspopina and Sveta Smirnova
We have already mentioned that this year the theme of the PG Day'17 Russia conference has expanded significantly. Together with Percona, we have formed a separate track of talks on MySQL and NoSQL. In addition to talks from leading experts on open source databases and NoSQL solutions, two exclusive master classes from leading Percona experts, Peter Zaitsev and Sveta Smirnova, will be held during the conference.



The master classes will cover a variety of topics on MySQL databases: creating and using a test server, the subtleties of debugging slow queries, features of locking systems, the impact of equipment and configuration on performance, data collection with minimal load on the server.

Today we bring to your attention a translation of a short review in which Sveta Smirnova, a senior technical support engineer at Percona, and Anastasia Raspopina, a marketing specialist, compare how PostgreSQL and MySQL handle millions of queries per second.

On July 5, for participants of PG Day'17, Svetlana will talk in more detail about the architecture of the MySQL server and the specifics of working with its various parts, such as the optimizer, table engines, and locking systems.

Anastasia: Can open source databases handle millions of queries per second? Many open source advocates will answer yes. However, assertions alone are not substantiated evidence. That is why in this article we share test results from Alexander Korotkov (Director of Development, Postgres Professional) and Sveta Smirnova (senior technical support engineer, Percona). Benchmarking the performance of PostgreSQL 9.6 and MySQL 5.7 will be especially useful for multi-database environments.

The idea of this study is to provide an honest comparison of two popular DBMSs. Sveta and Alexander wanted to test the latest versions of MySQL and PostgreSQL with the same tool, under the same challenging workloads, and with the same configuration parameters (where possible). However, because the PostgreSQL and MySQL ecosystems evolved independently, each with its own standard testing tool (pgbench and SysBench, respectively), this was not easy.

The task fell to database experts with many years of practical experience. Sveta worked as a senior technical support engineer in the MySQL support team's bug verification group at Oracle for more than eight years, and since 2015 she has worked as a principal technical engineer at Percona. Alexander Korotkov is one of the core PostgreSQL developers and the author of a number of PostgreSQL features, including the CREATE ACCESS METHOD command, the generic WAL interface, non-blocking PinBuffer/UnpinBuffer, index-based search for regular expressions, and much more. So we had a pretty decent cast for this play!

Sveta: Dmitry Kravchuk regularly publishes detailed test results for MySQL, so the task was not to confirm that MySQL can execute millions of queries per second; as his graphs show, that mark has already been passed. As a support engineer, I often come across clients working in heterogeneous environments with different databases, and I wanted to understand the effect of moving a workload from one database to another. So I was glad to be able to work with Postgres Professional and identify the strengths and weaknesses of these two databases.

We wanted to test both databases on the same hardware using the same tools and tests. We were going to test the basic functionality, and then work on more detailed comparisons. Thus, we could compare different real-life usage scenarios and their most popular variations.

Spoiler: We are still far from the final results. This is the beginning of a series of articles.

Open Source Databases on Large Machines, Series 1: “It Was Close…”
Postgres Professional and Freematiq provided two powerful, modern machines for testing.

Hardware Configuration:
Processors: physical = 4, cores = 72, virtual = 144, hyperthreading = yes
Memory: 3 TB
Disk speed: about 3K IOPS
OS: CentOS 7.1.1503
File system: XFS
I also used a less powerful Percona machine.
Hardware configuration:
Processors: physical = 2, cores = 12, virtual = 24, hyperthreading = yes
Memory: 251.9 GB
Disk speed: about 33K IOPS
OS: Ubuntu 14.04.5 LTS
File system: EXT4

Please note that machines with fewer processor cores but faster disks are more common for MySQL installations than machines with many cores.

The first thing we needed to agree on was which tool to use: a fair comparison only makes sense if the workloads are as close as possible.

The standard PostgreSQL performance testing tool is pgbench; for MySQL it is SysBench. SysBench supports several database drivers and scriptable tests written in the Lua programming language, so we decided to use it for both databases.

The initial plan was to convert pgbench tests to SysBench syntax on Lua, and then run standard tests for both databases. Having received the first results, we modified our tests to better study the specific features of MySQL and PostgreSQL.
I converted pgbench tests to SysBench syntax and put the tests into the open-database-bench repository on GitHub.

And then we both faced difficulties.

As I already mentioned, I also ran the tests on the Percona machine. For this converted test, the results on the two machines were almost identical:

Percona Machine:
 OLTP test statistics:
       transactions:                        1000000 (28727.81 per sec.)
       read/write requests:                 5000000 (143639.05 per sec.)
       other operations:                    2000000 (57455.62 per sec.)

Freematiq machine:
OLTP test statistics:
       transactions:                        1000000 (29784.74 per sec.)
       read/write requests:                 5000000 (148923.71 per sec.)
       other operations:                    2000000 (59569.49 per sec.)
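As a quick consistency check, the throughput figures above all imply the same run duration; here is a small sketch using awk with the Percona machine numbers:

```shell
# Each counter divided by its reported rate should give the same
# elapsed time (Percona machine figures from the article).
awk 'BEGIN {
  printf "transactions: %.1f s\n", 1000000 / 28727.81;
  printf "read/write:   %.1f s\n", 5000000 / 143639.05;
  printf "other:        %.1f s\n", 2000000 / 57455.62;
}'
```

All three come out to 34.8 s, so the counters are internally consistent with a roughly 35-second run.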

I started to investigate. The only area where the Percona machine outperformed the Freematiq machine was disk speed. So I ran the read-only pgbench test, which is identical to the point select SysBench test, with the full data set in memory. This time, SysBench itself consumed about 50% of the available CPU resources (roughly 7100% in top on a machine with 144 virtual CPUs):

PID  USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
4585 smirnova  20   0  0,157t 0,041t   9596 S  7226  1,4  12:27.16 mysqld
8745 smirnova  20   0 1266212 629148   1824 S  7126  0,0   9:22.78 sysbench

Alexander, in turn, had problems with SysBench, which could not create a high load on PostgreSQL when using prepared statements:

93087 korotkov  20   0 9289440 3,718g   2964 S 242,6  0,1   0:32.82 sysbench
93161 korotkov  20   0 32,904g  81612  80208 S   4,0  0,0   0:00.47 postgres
93116 korotkov  20   0 32,904g  80828  79424 S   3,6  0,0   0:00.46 postgres
93118 korotkov  20   0 32,904g  80424  79020 S   3,6  0,0   0:00.47 postgres
93121 korotkov  20   0 32,904g  80720  79312 S   3,6  0,0   0:00.47 postgres
93128 korotkov  20   0 32,904g  77936  76536 S   3,6  0,0   0:00.46 postgres
93130 korotkov  20   0 32,904g  81604  80204 S   3,6  0,0   0:00.47 postgres
93146 korotkov  20   0 32,904g  81112  79704 S   3,6  0,0   0:00.46 postgres

We contacted SysBench author Alexei Kopytov, and he proposed the following solution for MySQL:

• use SysBench with the options --percentile=0 and --max-requests=0 (to avoid wasting CPU);
• use the concurrency_kit branch (better concurrency and Lua processing);
• rewrite the Lua scripts to support prepared statements (pull request: github.com/akopytov/sysbench/pull/94 );
• start both SysBench and mysqld with the jemalloc or tcmalloc library preloaded.

A fix for PostgreSQL is on its way. For the moment, Alexander converted the standard SysBench test to pgbench format, and there we stalled: nothing new for MySQL, but at least we had a starting point for comparison.

The next difficulty I ran into was the default operating system settings. In short, I changed them to the recommended ones (listed below):

vm.swappiness=1
cpupower frequency-set --governor performance
kernel.sched_autogroup_enabled=0
kernel.sched_migration_cost_ns=5000000
vm.dirty_background_bytes=67108864
vm.dirty_bytes=536870912
IO scheduler [deadline]
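For reference, here is a sketch of how these settings can be persisted across reboots; the drop-in file name and the block device are assumptions, not something specified in the article:

```shell
# /etc/sysctl.d/99-benchmark.conf -- load with: sudo sysctl --system
vm.swappiness=1
kernel.sched_autogroup_enabled=0
kernel.sched_migration_cost_ns=5000000
vm.dirty_background_bytes=67108864
vm.dirty_bytes=536870912

# The CPU governor and I/O scheduler are set outside of sysctl:
#   sudo cpupower frequency-set --governor performance
#   echo deadline | sudo tee /sys/block/sda/queue/scheduler   # "sda" is assumed
```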

The same options turned out to be better for PostgreSQL performance as well. Alexander configured his machine the same way.

After solving these problems, we learned and implemented the following:

• we cannot use one tool (for now);
• Alexander wrote a test for pgbench, simulating standard SysBench tests;
• we still cannot write custom tests, because we use different tools.

But we could use these tests as a starting point. After the work Alexander had done, we settled on the standard SysBench tests: I converted them to use prepared statements, and Alexander converted them to pgbench format.

It is worth mentioning that I could not reproduce Dmitry's results for the Read Only and Point Select tests: mine are similar, but a little slower. We still need to figure out whether this is due to different hardware or to a gap in my performance-testing skills. The Read-Write test results match.

There was another difference between the PostgreSQL and MySQL tests. MySQL users usually have many connections: setting the max_connections variable to allow thousands of concurrent connections is not uncommon these days. Although it is not recommended, people do this even without the thread pool plugin. In real life, most of these connections are idle, but there is always a chance that all of them will become active during a spike in site traffic.

For MySQL, I tested up to 1024 connections, using powers of two plus values related to the 72-core count: 1, 2, 4, 8, 16, 32, 36, 64, 72, 128, 144, 256, 512 and 1024 threads.
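For what it's worth, that list of concurrency levels (powers of two up to 1024, plus 36, 72 and 144) can be generated rather than typed by hand; a small sketch:

```shell
# Emit the tested thread counts: powers of two up to 1024, plus 36, 72
# and 144 (half, one and two times the 72 physical cores), deduplicated.
{
  n=1
  while [ "$n" -le 1024 ]; do echo "$n"; n=$((n * 2)); done
  for m in 36 72 144; do echo "$m"; done
} | sort -n | uniq
```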

It was more important for Alexander to test in smaller steps: he started with one thread and increased the count by 10 until he reached 250 parallel threads. So you will see a more detailed graph for PostgreSQL, but no results beyond 250 threads.
Here are the comparison results.

Point SELECT

[graph: Point SELECT results]

pgsql-9.6 - standard PostgreSQL
pgsql-9.6 + pgxact-align - PostgreSQL with this patch (more details can be found in this article)
MySQL-5.7 Dimitri - Oracle MySQL Server
MySQL-5.7 Sveta - Percona Server 5.7.15

OLTP RO

[graph: OLTP RO results]

OLTP RW

[graph: OLTP RW results]

The synchronous commit feature (sync commit) in PostgreSQL is similar to innodb_flush_log_at_trx_commit=1 in InnoDB, and asynchronous commit is similar to innodb_flush_log_at_trx_commit=2.
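As a configuration-level summary of that analogy (an approximation only; the durability guarantees are not exactly equivalent, and the pairing is my reading of the article):

```shell
# PostgreSQL (postgresql.conf)      ~   MySQL/InnoDB (my.cnf)
# synchronous_commit = on           ~   innodb_flush_log_at_trx_commit = 1
#   (WAL/redo log flushed to disk on every commit)
# synchronous_commit = off          ~   innodb_flush_log_at_trx_commit = 2
#   (flush deferred; a crash may lose the most recent commits)
```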
As you can see, the results are very similar: both databases are developing very fast and work well on modern hardware.

For reference, here are the MySQL results at up to 1024 threads.
Point SELECT and OLTP RO

[graph: Point SELECT and OLTP RO, up to 1024 threads]

OLTP RW with innodb_flush_log_at_trx_commit set to 1 and 2

[graph: OLTP RW with innodb_flush_log_at_trx_commit set to 1 and 2]

Having obtained these results, we conducted several special tests, which will be considered in separate articles.

Additional Information


MySQL options for OLTP RO and Point SELECT tests:
# general
table_open_cache = 8000
table_open_cache_instances=16
back_log=1500
query_cache_type=0
max_connections=4000
# files
innodb_file_per_table
innodb_log_file_size=1024M
innodb_log_files_in_group=3
innodb_open_files=4000
# Monitoring
innodb_monitor_enable = '%'
performance_schema=OFF #cpu-bound, matters for performance
#Percona Server specific
userstat=0
thread-statistics=0
# buffers
innodb_buffer_pool_size=128000M
innodb_buffer_pool_instances=128 #to avoid wait on InnoDB Buffer Pool mutex
innodb_log_buffer_size=64M
# InnoDB-specific
innodb_checksums=1 #Default is CRC32 in 5.7, very fast
innodb_use_native_aio=1
innodb_doublewrite=1 #https://www.percona.com/blog/2016/05/09/percona-server-5-7-parallel-doublewrite/
innodb_stats_persistent = 1
innodb_support_xa=0 #(We are read-only, but this option is deprecated)
innodb_spin_wait_delay=6 #(Processor and OS-dependent)
innodb_thread_concurrency=0
join_buffer_size=32K
innodb_flush_log_at_trx_commit=2
sort_buffer_size=32K
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10
innodb_lru_scan_depth=4000
innodb_page_cleaners=4
# perf special
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_io_capacity=2000
innodb_io_capacity_max=4000
innodb_purge_threads=4
innodb_max_purge_lag_delay=30000000
innodb_max_purge_lag=0
innodb_adaptive_hash_index=0 #depends on workload, always check

MySQL Options for OLTP RW:
#Open files
table_open_cache = 8000
table_open_cache_instances = 16
query_cache_type = 0
join_buffer_size=32k
sort_buffer_size=32k
max_connections=16000
back_log=5000
innodb_open_files=4000
#Monitoring
performance-schema=0
#Percona Server specific
userstat=0
thread-statistics=0
#InnoDB General
innodb_buffer_pool_load_at_startup=1
innodb_buffer_pool_dump_at_shutdown=1
innodb_numa_interleave=1
innodb_file_per_table=1
innodb_file_format=barracuda
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_doublewrite=1
innodb_support_xa=1
innodb_checksums=1
#Concurrency
innodb_thread_concurrency=144
innodb_page_cleaners=8
innodb_purge_threads=4
innodb_spin_wait_delay=12 #good value for RO is 6, for RW and RC is 192
innodb_log_file_size=8G
innodb_log_files_in_group=16
innodb_buffer_pool_size=128G
innodb_buffer_pool_instances=128 #to avoid wait on InnoDB Buffer Pool mutex
innodb_io_capacity=18000
innodb_io_capacity_max=36000
innodb_flush_log_at_timeout=0
innodb_flush_log_at_trx_commit=2
innodb_flush_sync=1
innodb_adaptive_flushing=1
innodb_flush_neighbors = 0
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10
innodb_lru_scan_depth=4000
innodb_adaptive_hash_index=0
innodb_change_buffering=none #can be inserts, workload-specific
optimizer_switch="index_condition_pushdown=off" #workload-specific

MySQL SysBench Options:
LD_PRELOAD=/data/sveta/5.7.14/lib/mysql/libjemalloc.so
 /data/sveta/sbkk/bin/sysbench
 [ --test=/data/sveta/sysbench/sysbench/tests/db/oltp_prepared.lua | --test=/data/sveta/sysbench/sysbench/tests/db/oltp_simple_prepared.lua ]
 --db-driver=mysql --oltp-tables-count=8 --oltp-table-size=10000000
--mysql-table-engine=innodb --mysql-user=msandbox --mysql-password=msandbox
 --mysql-socket=/tmp/mysql_sandbox5715.sock
--num-threads=$i --max-requests=0 --max-time=300
--percentile=0 [--oltp-read-only=on --oltp-skip-trx=on]

PostgreSQL pgbench options:
$ git clone https://github.com/postgrespro/pg_oltp_bench.git
$ cd pg_oltp_bench
$ make USE_PGXS=1
$ sudo make USE_PGXS=1 install
$ psql DB -f oltp_init.sql
$ psql DB -c "CREATE EXTENSION pg_oltp_bench;"
$ pgbench -c 100 -j 100 -M prepared -f oltp_ro.sql -T 300 -P 1 DB
$ pgbench -c 100 -j 100 -M prepared -f oltp_rw.sql -T 300 -P 1 DB

Features in MySQL 5.7 that significantly improved performance:
  • InnoDB: transaction list optimization
  • InnoDB: reduced contention on lock_sys_t::mutex
  • InnoDB: fix for index->lock contention
  • InnoDB: faster and parallel flushing of pages to disk
  • MDL (Meta-Data Lock) scalability

Anastasia: The initial results of this study were presented at Percona Live Amsterdam 2016. New and interesting results were added to the second version of the same talk, which was presented at Moscow HighLoad++ 2016.

Further iterations of the study will be available to all participants of Sveta's workshop at PG Day'17. If you have questions or suggestions about which aspects of PostgreSQL and MySQL performance you would like to learn more about, leave a comment; we will certainly take your wishes into account!
