Huge Pages in PostgreSQL

    PostgreSQL 9.4 introduced support for huge pages. This is very good news. Briefly, here is what they are about. In Linux, memory is managed in pages that are 4kB in size (in fact, this depends on the platform; you can check it with getconf PAGE_SIZE). Once the amount of memory reaches tens or even hundreds of gigabytes, it becomes harder to manage: the overhead of address translation and of maintaining page tables grows. To make life easier, huge pages were invented, which can be 2MB or even 1GB in size. Using huge pages can give a tangible speedup and better responsiveness in applications that work with memory intensively.

    I first came across huge pages when working with virtualization, in particular with KVM. Tests conducted at the time showed a performance gain of 7 to 10% for virtual machines (measured with synthetic tests of various services such as redis / memcache / postgres / etc. inside the virtual machines). Now the feature has appeared in PostgreSQL.



    So, back to the topic of the article: huge page support in PostgreSQL. To be honest, I have been waiting for this for a long time. Running PostgreSQL with huge pages was already possible earlier using libhugetlbfs, but now the support is built in. Below is a description of how to set up and run PostgreSQL with huge pages.

    First you need to make sure that the kernel supports huge pages. We check the kernel config for the CONFIG_HUGETLBFS and CONFIG_HUGETLB_PAGE options.
    # grep HUGETLB /boot/config-$(uname -r)
    CONFIG_CGROUP_HUGETLB=y
    CONFIG_HUGETLBFS=y
    CONFIG_HUGETLB_PAGE=y
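
    The same check can be wrapped in a small function for use in provisioning scripts. This is only a sketch: the name kernel_has_hugetlb is hypothetical, and it assumes the distribution ships the kernel config as a plain file under /boot, as in the example above.

```shell
# Hypothetical helper: succeed only if both huge page options are built in.
# $1 = path to a kernel config file (defaults to the running kernel's config;
#      assumes the distribution installs it as /boot/config-<release>).
kernel_has_hugetlb() {
    cfg="${1:-/boot/config-$(uname -r)}"
    grep -q '^CONFIG_HUGETLBFS=y' "$cfg" && grep -q '^CONFIG_HUGETLB_PAGE=y' "$cfg"
}

# Example call (commented out so the snippet has no side effects):
# kernel_has_hugetlb && echo "huge pages supported" || echo "kernel rebuild needed"
```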
    


    Without these options nothing will work and the kernel has to be rebuilt (relevant for Gentoo, for example).
    Obviously, we will need PostgreSQL version 9.4. I leave package installation and cluster initialization behind the scenes, because the method differs between distributions. We proceed straight to the postgresql.conf configuration file. Huge page support is controlled by the huge_pages parameter, which can take three values: off (do not use huge pages), on (use huge pages), and try (try to use huge pages and fall back to regular pages if they are unavailable). The try value is the default and is the safe option. With on, the service will not start if huge pages are not configured in the system (or there are not enough of them). In that case you may get this error:

    FATAL: could not map anonymous shared memory: Cannot allocate memory
    HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space or huge pages. To reduce the request size (currently 148324352 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.

    So, we edit postgresql.conf (mine is located in the standard place for RHEL-based distributions):

    # vi /var/lib/pgsql/9.4/data/postgresql.conf
    huge_pages = try
    


    Now we enable huge pages in the system; by default none are allocated. The page count is an estimate, and here you should be guided by how much memory you are ready to give to the DBMS. Note that the value is measured in 2MB pages: if you want to allocate 16GB, that is 8192 pages.
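
    The arithmetic for a memory budget can be sketched as follows. The function name pages_for_gib is hypothetical; it assumes the budget is given in whole GiB and the huge page size in kB (2048 for 2MB pages), and it rounds up so the budget is always covered.

```shell
# Hypothetical helper: number of huge pages needed to cover a memory budget.
# $1 = memory in GiB, $2 = huge page size in kB (default 2048, i.e. 2MB pages).
pages_for_gib() {
    size_kb="${2:-2048}"
    total_kb=$(( $1 * 1024 * 1024 ))
    # Ceiling division, so a partial page still counts as a whole one.
    echo $(( (total_kb + size_kb - 1) / size_kb ))
}

pages_for_gib 16 2048    # prints 8192
```

    The actual huge page size on a given machine is reported in the Hugepagesize line of /proc/meminfo.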

    The official documentation suggests relying on the VmPeak value from the status file in the /proc/PID/ directory, where PID is the process ID of the postmaster. VmPeak, as the name implies, is the peak virtual memory usage. This gives you a minimum to start from, but in my opinion this method is also rather approximate.

    # head -1 /var/lib/pgsql/9.4/data/postmaster.pid
    3076
    # grep ^VmPeak /proc/3076/status
    VmPeak:  4742563 kB
    # echo $((4742563 / 2048 + 1))
    2316
    # echo 'vm.nr_hugepages = 2316' >> /etc/sysctl.d/30-postgresql.conf
    # sysctl -p --system
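
    The VmPeak calculation above can be collected into one function. This is a sketch, not an official tool: hugepages_from_status is a hypothetical name, and it assumes the input file has the /proc/PID/status layout with VmPeak given in kB.

```shell
# Hypothetical helper: read VmPeak (in kB) from a /proc/PID/status-style file
# and print the huge page count, using the same "divide, then add one" rule
# as the manual calculation.
# $1 = path to the status file, $2 = huge page size in kB (default 2048).
hugepages_from_status() {
    peak_kb=$(awk '/^VmPeak:/ {print $2}' "$1")
    echo $(( $peak_kb / ${2:-2048} + 1 ))
}

# Example call against a running postmaster (data directory path as in this article):
# hugepages_from_status "/proc/$(head -1 /var/lib/pgsql/9.4/data/postmaster.pid)/status"
```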


    Now we start PostgreSQL. The way to start it depends on the init system; I have the trendy, newfangled systemd.

    # systemctl start postgresql-9.4.service


    Huge page utilization can be checked like this:

    # grep ^HugePages /proc/meminfo
    HugePages_Total:    2316
    HugePages_Free:     2301
    HugePages_Rsvd:      128
    HugePages_Surp:        0
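
    To read these counters at a glance, they can be reduced to a one-line summary. This is a sketch with a hypothetical function name, hugepage_summary. It relies on the kernel's accounting, where HugePages_Free still includes pages that are reserved but not yet touched, so the pages committed to applications are Total - Free + Rsvd.

```shell
# Hypothetical helper: summarise huge page usage from /proc/meminfo-style input.
#   in_use    = Total - Free     (pages already faulted in)
#   reserved  = Rsvd             (promised to a process, not yet touched)
#   committed = in_use + reserved
# $1 = file to read (defaults to /proc/meminfo).
hugepage_summary() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free  = $2 }
        /^HugePages_Rsvd:/  { rsvd  = $2 }
        END {
            printf "in_use=%d reserved=%d committed=%d\n",
                   total - free, rsvd, total - free + rsvd
        }
    ' "${1:-/proc/meminfo}"
}
```

    For the output shown above this gives in_use=15 reserved=128 committed=143, that is, 143 of the 2316 pages are spoken for.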
    


    That's all; you can move on to benchmarks with your specific workloads. Thanks for your attention!
