In-Memory OLTP in SQL Server 2014. Part I
In-Memory OLTP functionality (the Hekaton project) is designed to speed up the processing of typical OLTP operations in SQL Server. As you know, the load on a database server, be it Microsoft SQL Server, Oracle/MySQL, SAP/Sybase, IBM DB2, etc., can be divided into two classes: collecting data and analyzing what has been collected (why else would it be collected?). The first class of tasks is called OLTP (On-Line Transactional Processing). It is characterized by short update transactions touching a relatively small number of tables in the database. Examples of such applications are a banking business day, telecom billing, etc. The second class is called OLAP (On-Line Analytical Processing) and is characterized by massive, long-running reads covering a significant number of tables and gathering, as a rule, as many records from them as possible, teeming with join predicates, sorting, grouping, aggregate functions, and so on. As a rule, the data structures for the two classes of tasks are kept separate so that they do not get in each other's way: where the first kind of database is designed with Codd's numerous normalization rules in mind, the second is, on the contrary, denormalized and laid out as a star (or snowflake) schema.
The drive to move the DBMS into memory appeared at the beginning of the new millennium, when it turned out that, Moore's law notwithstanding, processor clock speeds were no longer growing exponentially but had instead hit a flat saturation line, despite ILP and other tricks. At the same time, the price of RAM, once indecently expensive, kept falling steeply and has dropped thousands of times compared to the 1990s. Aha, the database vendors said to themselves. In 2005 Oracle bought the in-memory DBMS TimesTen, in 2007 IBM acquired Solid, and Microsoft in this case took nothing from outside, deciding instead to grow the expertise within its own team.
It started with In-Memory OLAP. The first fruits of that effort took shape in the VertiPaq columnstore engine, released with SQL Server 2008 R2 in the form of PowerPivot for Excel and SharePoint. As the name implies, a column index is laid out so that each page holds a piece of a column in highly compressed form (roughly twice better than the compression that appeared in SQL Server 2008), in order to cram more data into memory. OLAP traditionally benefits from column indexes because, as we recall, its workload is massive reads: as a rule, you need to read, across the rows (or a range of them), all the columns by which the dimension tables (the rays of the star) are joined to the foreign-key columns of the fact table (the hub) in order to build a join (or semi-join) between them. Two useful things happened in this respect in SQL Server 2012. First, xVelocity (formerly VertiPaq) appeared in Analysis Services as the so-called Tabular Model, an alternative to the traditional multidimensional models that have existed, God help us, since SQL Server 7.0. Second, the column index ceased to be a thing in itself, and it became possible to build one explicitly when needed: the command CREATE [NONCLUSTERED] COLUMNSTORE INDEX appeared in T-SQL. A whole column of restrictions was immediately imposed on column indexes, the cruelest of which was, of course: "UPDATE statement failed because data cannot be updated in a table that has a nonclustered columnstore index. Consider disabling the columnstore index before issuing the UPDATE statement, and then rebuilding the columnstore index after UPDATE has completed." In SQL Server 2014 this evil was tackled with an updatable clustered column index. Well, "columnar" in a manner of speaking... To make the column index updatable, a delta store and a delete bitmap were bolted onto it. When a record is deleted, it does not physically disappear; a flag is set for it in the delete bitmap. When a record is inserted, it lands in the delta store. Both of these are ordinary B-Tree (rowstore) structures with all the attendant pluses and minuses. A background Tuple Mover process crawls over the delta store and converts the added records into columnstore segments, but, generally speaking, reading the column index means reading not only the columnstore but these two companions as well, because you have to filter out the deleted records and union in the added ones. Be that as it may, two weeks after its release SQL Server 2014 posted record results in the independent analytical TPC-H benchmark, taking the top lines in the 1, 3 and 10 TB database-size weight categories in the non-clustered (standalone) standings. So let us consider that in-memory OLAP is doing fine and move on to in-memory OLTP.
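Before leaving columnstore behind, a tiny syntax illustration. The dbo.FactSales table and its columns are invented here purely for the example:

-- SQL Server 2012 style: a read-only nonclustered columnstore index over selected columns
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_FactSales ON dbo.FactSales (DateKey, ProductKey, Amount)

-- SQL Server 2014 alternative: an updatable clustered columnstore index
-- (in 2014 it must be the only index on the table, so it cannot coexist with the one above)
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales ON dbo.FactSales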
As already mentioned, Hekaton is not the code name of the next version of SQL Server, as Denali, Katmai, Yukon, etc. were, but a project to develop an in-memory engine, i.e. a component of the product. This component is the most striking innovation not only of the current version but, possibly, on the scale of the whole product line, starting with the 16-bit Ashton-Tate / Microsoft SQL Server 1.0 released 25 years ago. Hekaton is a Greek word meaning one hundred, i.e. a hundredfold, which sort of hints that it is incomparably cooler than those that manage only a tenfold gain. I want to warn right away against the misconception that Hekaton is some extended version of dbcc pintable, where a table pinned in memory still behaves like a regular disk table, including execution plans, transactional integrity enforced with locks, and so on. Hekaton is a compact, self-contained engine integrated into the SQL Server execution mechanism; compared with the traditional database engine, it has no interpreted execution plans, no locks as a means of ensuring logical data integrity, and no latches for physical integrity. Let me remind you that latches are lightweight locks imposed on data pages, index pages and certain service structures just for the moment they are being read or changed in memory, unlike locks, which can be held for the duration of a transaction. Therefore, without going into details, we can assume that deadlocks on them do not exist. There are nuances, but not painful ones. Another difference is that locks can be controlled (for example via hints or the isolation level), whereas latches are purely SQL Server's own business. I will not dig into their internal structure here; those who wish can turn to BOL or to the page of Evgenia Khabarova. I will not dwell on theory at all; let's get down to examples instead.
Let's create a database and, within it, a filegroup for in-memory OLTP.
CREATE DATABASE hekaton_test
ALTER DATABASE hekaton_test ADD FILEGROUP fg_hekaton_test CONTAINS MEMORY_OPTIMIZED_DATA
ALTER DATABASE hekaton_test ADD FILE (name='fg_hekaton_test1', filename='c:\Temp\fg_hekaton_test1') TO FILEGROUP fg_hekaton_test
Script 1
This is a FILESTREAM-like filegroup: the data held in memory is persisted into it and, during recovery, read back from it. Like a clustered columnstore, it consists of data files, into which records produced by INSERT or UPDATE are written sequentially, and delta files, which store the identifiers of deleted records. Changes are first reflected in memory as usual, and on a checkpoint they are flushed to disk in blocks of 256 KB (data) and 4 KB (delta), which is noted in the transaction log. Data/delta pairs are merged automatically once they reach a certain size, depending on the amount of RAM, and the merge can also be triggered manually with the sys.sp_xtp_merge_checkpoint_files procedure. You can read more about this process here.
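If you want to look at these data/delta pairs with your own eyes, there is a DMV for that. A minimal sketch (the exact column set of the view may vary between builds, so treat the list as illustrative):

-- list the checkpoint files (data and delta) backing the memory-optimized filegroup
SELECT file_type_desc, state_desc, file_size_in_bytes
FROM sys.dm_db_xtp_checkpoint_files
ORDER BY file_type_desc, state_desc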
Thus, placing tables in memory does not mean that if the server keels over, everything earned by back-breaking labor is lost. In-Memory OLTP is a fully transactional technology and supports fault tolerance, including AlwaysOn.
In a freshly created database, create a table optimized for working in memory.
use hekaton_test
CREATE TABLE dbo.ShoppingCart (
ShoppingCartId int not null primary key nonclustered hash with (bucket_count=2000000),
UserId int not null index ix_UserId nonclustered hash with (bucket_count=1000000),
CreatedDate datetime2 not null,
TotalPrice money
)
WITH (MEMORY_OPTIMIZED=ON)
Script 2
The last option simply means that the table will live in memory. Every MEMORY_OPTIMIZED table must have at least one index, and the total number of indexes must not exceed 8. Index columns must not be nullable, and (n)(var)char index columns must use a BIN2 collation. There are no clustered indexes here, by definition, and no classic B-Tree structure either. Indexes on in-memory tables come in two kinds: HASH (better suited for point lookups) and RANGE (as the name suggests, better suited for range scans).
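To show the BIN2 requirement in action, here is a minimal sketch; the dbo.Product table is hypothetical, and the point is the explicit *_BIN2 collation on the indexed character column, without which CREATE TABLE refuses to run:

CREATE TABLE dbo.Product (
  ProductCode nvarchar(20) COLLATE Latin1_General_100_BIN2 not null primary key nonclustered hash with (bucket_count=131072),
  Price money not null
)
WITH (MEMORY_OPTIMIZED=ON)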
The core component of a hash index is the so-called mapping table: one of its columns stores the results of applying a hash function to the concatenation of the columns that make up the index key, the other stores pointers to records. Since a hash function can produce the same result for completely different argument values (collisions arise; for close values the results must differ), these are in fact pointers to the memory areas holding the overflow chains. Each such chain is a doubly linked list. The main parameter when creating a hash index is bucket_count, the number of slots in the mapping table. The fewer of them there are, the higher the probability of collisions and the longer the overflow chains growing from each hash value. Accordingly, it obviously should be no smaller than the number of unique values in the index key; in practice it is estimated as the number of unique values rounded up to the next power of 2, and why is explained in detail here.
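Whether you guessed bucket_count well can be checked after the fact: a DMV shows how many buckets are empty and how long the chains have grown. A sketch (joining to sys.indexes just to get the index names):

SELECT object_name(hs.object_id) as table_name, i.name as index_name,
       hs.total_bucket_count, hs.empty_bucket_count, hs.avg_chain_length, hs.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats hs
JOIN sys.indexes i ON i.object_id = hs.object_id AND i.index_id = hs.index_id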
The second type of index available in memory is called a range index and closely resembles the classic clustered index: its nodes form an ordered structure that is efficient for range scans. Instead of a B-Tree, a modification called the Bw-Tree is used, whose most striking difference is perhaps that it does not store pointers to duplicate values. If a table contains a million identical values, the classic tree will stubbornly keep a million pointers (to the data) in its leaves; the Bw-Tree makes do with one, which lets it dramatically (English-speaking authors are very fond of that word) save space when stuffing all this into memory. The only catch is that, once again, we have overflow chains: if we do not store pointers to every record, how do we get to the next record with the same key value? One gets the feeling there are no savings at all, that the bytes spent on pointers were simply moved from the index leaves into the record chains. But no; read here why that is not so. A quick sketch of the range-index syntax follows below, and then we will move on and create another table.
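Here is what the range-index syntax looks like (the dbo.UserSessionHistory table is invented for the illustration); note that, unlike the hash indexes in the scripts above, no bucket_count is specified, since there are no buckets:

CREATE TABLE dbo.UserSessionHistory (
  SessionId int not null primary key nonclustered hash with (bucket_count=400000),
  CreatedDate datetime2 not null,
  index ix_CreatedDate nonclustered (CreatedDate)
)
WITH (MEMORY_OPTIMIZED=ON)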
CREATE TABLE dbo.UserSession (
SessionId int not null primary key nonclustered hash with (bucket_count=400000),
UserId int not null,
CreatedDate datetime2 not null,
ShoppingCartId int,
index ix_UserId nonclustered hash (UserId) with (bucket_count=400000)
)
WITH (MEMORY_OPTIMIZED=ON, DURABILITY=SCHEMA_ONLY)
Script 3
Note the last option, DURABILITY=SCHEMA_ONLY. It means that the data in this table will not survive a server restart (the structure itself will remain). As I said above, there is a misconception that with Hekaton everything held in memory is lost on restart. For these tables that really is the case, but you create them quite deliberately in order to cut overhead, in particular logging, where it is not needed. They are a kind of analogue of temporary tables. By the way, table variables are also supported in Hekaton. They are declared by first creating a table type with CREATE TYPE ... AS TABLE ... Unlike ordinary table variables, they are, of course, stored not in the disk-based tempdb but belong to the database where they were declared.
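A minimal sketch of such a table variable; the type name dbo.CartItems and its columns are made up for the example:

CREATE TYPE dbo.CartItems AS TABLE (
  ProductId int not null primary key nonclustered hash with (bucket_count=1024),
  Quantity int not null
)
WITH (MEMORY_OPTIMIZED=ON)
GO
DECLARE @items dbo.CartItems
INSERT INTO @items (ProductId, Quantity) VALUES (1, 2), (2, 5)
SELECT * FROM @items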
Tables located in memory do not support automatic statistics updates; in particular, ALTER DATABASE ... SET AUTO_UPDATE_STATISTICS ON has no effect on them. Likewise:
exec sp_autostats @tblname = 'ShoppingCart'
Index Name                          AUTOSTATS   Last Updated
[ix_UserId]                         OFF         NULL
[PK__Shopping__7A789AE57302F83B]    OFF         NULL
exec sp_autostats @tblname = 'ShoppingCart', @flagc = 'ON'
-- does not work: Operations that require a change to the schema version, for example renaming, are not supported with memory optimized tables.
However, statistics can be updated manually: UPDATE STATISTICS dbo.ShoppingCart WITH FULLSCAN, NORECOMPUTE.
In general, in-memory tables come with a pile of limitations. Most table hints are not supported: no TABLOCK, XLOCK, PAGLOCK, ... NOLOCK does not raise an error, but it has no effect either, as if it were not there. Dynamic and keyset cursors are silently downgraded to static. The TRUNCATE TABLE and MERGE statements are not supported (when an in-memory table is the target). There are restrictions on the column types that can be used. You can read about them in detail here, but so as not to get upset, let us instead look at what we have ended up with.
SELECT name, description FROM sys.dm_os_loaded_modules WHERE description = 'XTP Native DLL'
Script 4
In the SQL Server installation directory C:\Program Files\Microsoft SQL Server\...\DATA\xtp\11\ two DLLs have appeared, named xtp_t_11_<9 digits>.dll. These are our ShoppingCart and UserSession tables.
The built-in compiler converts the T-SQL definitions of tables and stored procedures into C code (you can see it in the same directory), from which machine code is produced. The resulting dynamic libraries are loaded into memory and linked into the SQL Server process. When SQL Server is restarted, the libraries are recompiled and reloaded based on the catalog information in the metadata.
In the next part we plan to look at native compilation of stored procedures, transaction isolation levels in Hekaton, locking, logging, and overall performance compared with traditional disk-based objects.