Oracle In-Memory Option Performance Testing Using TPC-H Benchmark

One of the key innovations of Oracle DBMS version 12.1.0.2 is the In-Memory option. Its main idea is that for selected tables you can easily activate a dual-format mode, which combines the standard Oracle row-wise format for storing data on disk with a columnar format in RAM.

The corresponding conversion and duplication of data into memory happens automatically. For me personally, this was great news: I develop data warehouses (DWH) and have experience with the column-oriented DBMSs Sybase IQ and HP Vertica, which are designed for storage and analytics. And Oracle offered a column store, plus in-memory, plus all the features of your favorite DBMS! In effect, with this solution Oracle entered the market of analytical in-memory databases (for those who have not read it, I recommend an excellent article on Habré comparing databases of this class). Oracle's idea is very promising, but in practice the results on my test cases were, unfortunately, not impressive. That was last year, and I decided to wait until the technology matured. After the release of the next patch with In-Memory Option improvements, I returned to the question. For this article I chose a more objective test, which readers can repeat if they wish.

Before moving on to my benchmark, here are a couple of links. An article on Oracle's corporate blog on Habré gives a detailed description of the In-Memory Option and its fantastic results. Another article on the same blog presents a performance test, but it uses only 2 tables and a couple of simple queries.

TPC-H Benchmark


For the performance test I used the TPC-H benchmark, which is widely used to compare the performance of analytical systems and data warehouses, both by DBMS vendors and by server hardware manufacturers. There are many results on the TPC-H page; to publish one officially, you must fulfill all the requirements of the 136-page specification. I was not going to publish my test officially, so I did not strictly follow all the rules. Also, to simplify the testing process, I used the free version of Benchmark Factory for Databases.

TPC-H generates data for 8 tables using a scale factor parameter, which determines the approximate volume of data in gigabytes. I limited myself to 2 GB, since the free version of Benchmark Factory does not allow more. The total number of rows in the tables:
Table         Number of rows
H_LINEITEM    11,996,782
H_ORDER        3,000,000
H_PARTSUPP     1,600,000
H_PART           400,000
H_CUSTOMER       300,000
H_SUPPLIER        20,000
H_NATION              25
H_REGION               5


The test includes 22 SQL queries of varying complexity. I compared run times with and without In-Memory under the following load: 8 virtual users in parallel each execute all 22 queries 3 times in a loop. In total, the execution time of 528 SQL queries was measured.
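To give a flavor of the workload, here is TPC-H query Q1 (the pricing summary report) as a sketch, adapted to the H_-prefixed table names generated here; column names follow the TPC-H specification, and the 90-day interval is the specification's default substitution value:

```sql
-- TPC-H Q1: pricing summary report over the largest table
SELECT
  L_RETURNFLAG,
  L_LINESTATUS,
  SUM(L_QUANTITY)                                       AS SUM_QTY,
  SUM(L_EXTENDEDPRICE)                                  AS SUM_BASE_PRICE,
  SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT))               AS SUM_DISC_PRICE,
  SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT) * (1 + L_TAX)) AS SUM_CHARGE,
  AVG(L_QUANTITY)                                       AS AVG_QTY,
  AVG(L_EXTENDEDPRICE)                                  AS AVG_PRICE,
  AVG(L_DISCOUNT)                                       AS AVG_DISC,
  COUNT(*)                                              AS COUNT_ORDER
FROM H_LINEITEM
WHERE L_SHIPDATE <= DATE '1998-12-01' - INTERVAL '90' DAY
GROUP BY L_RETURNFLAG, L_LINESTATUS
ORDER BY L_RETURNFLAG, L_LINESTATUS;
```

Queries of this shape (full scans with aggregation over a few columns) are exactly the pattern a columnar IM store is designed to accelerate.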

For those who find this test not challenging enough, I recommend the more recent TPC-DS benchmark. It has more tables and significantly more queries: 99.

Testing stand


A laptop with the following features:
- Intel Core i5-4210 CPU 1.70GHz - 4 cores; DDR3 16 Gb; SSD Disk.
OS:
- MS Windows 8.1 x64
DBMS:
- Oracle Database 12c EE 12.1.0.2.0
- Interim patches (1): "WINDOWS DB BUNDLE PATCH 12.1.0.2.160531 (64bit): 23179016"
DB Memory Configuration:
- memory_target = 10G;
- sga_target = 8G;
- inmemory_size = 3G;
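A sketch of how these parameters can be set (values match the configuration above; inmemory_size is a static parameter, so an instance restart is required):

```sql
-- Write the new values to the SPFILE; they take effect after restart
ALTER SYSTEM SET memory_target = 10G SCOPE = SPFILE;
ALTER SYSTEM SET sga_target    = 8G  SCOPE = SPFILE;
ALTER SYSTEM SET inmemory_size = 3G  SCOPE = SPFILE;

-- After restarting the instance, verify (SQL*Plus):
SHOW PARAMETER inmemory_size;
```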

Configuring In-Memory (IM)


Beyond setting the inmemory_size database parameter, it is enough to specify which tables (or parts of them) should be duplicated into the IM cache; Oracle does the rest. Thus, it is very easy to move an existing database to IM if there is enough RAM. Nothing needs to be rewritten; you can even drop indexes that IM tables no longer need. I would also note the stable operation: I have not encountered a single bug related to IM.

In my test, all the tables went entirely to IM:

ALTER TABLE MY_TABLE_NAME INMEMORY MEMCOMPRESS FOR QUERY HIGH PRIORITY CRITICAL;

  • MEMCOMPRESS FOR QUERY HIGH is the option that balances query performance against memory savings (there are 5 other options, described in the documentation).
  • PRIORITY CRITICAL determines the priority of replication into the IM cache.
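Population of the IM column store happens in the background, so it is worth confirming it has finished before measuring anything. One way to watch progress (a sketch using the documented V$IM_SEGMENTS view) is:

```sql
-- Check population progress of segments in the IM column store
SELECT SEGMENT_NAME,
       POPULATE_STATUS,        -- e.g. STARTED / COMPLETED
       BYTES_NOT_POPULATED     -- reaches 0 once the segment is fully in IM
FROM   V$IM_SEGMENTS
ORDER  BY SEGMENT_NAME;
```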

Another important nuance is that columnar data compresses well, and Oracle takes advantage of this. The following query shows the amount of data on disk, the amount in IM, and the compression ratio:

select 
  SEGMENT_NAME,
  ROUND(SUM(BYTES)/1024/1024/1024,2) "ORIG SIZE, Gb",
  ROUND(SUM(INMEMORY_SIZE)/1024/1024/1024,2) "IM SIZE, Gb",
  ROUND(SUM(BYTES)/SUM(INMEMORY_SIZE),2) "COMPRESS RATIO"
from V$IM_SEGMENTS
group by SEGMENT_NAME
order by 2 desc;

SEGMENT_NAME   ORIG SIZE, Gb   IM SIZE, Gb   COMPRESS RATIO
H_LINEITEM     1.74            0.67          2.62
H_ORDER        0.39            0.35          1.1
H_PARTSUPP     0.12            0.08          1.58
H_PART         0.06            0.02          2.96
H_CUSTOMER     0.04            0.03          1.42
H_NATION       0               0             0.22
H_SUPPLIER     0               0             0.89
H_REGION       0               0             0.22

Test results


                          #1 No In-Memory   #2 In-Memory
Elapsed time              7 min 23 sec      6 min 26 sec
Avg. response time (sec)  5.617             4.712
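To be sure that queries in run #2 actually scanned the IM column store rather than disk, the execution plan can be checked; the operation to look for is TABLE ACCESS INMEMORY FULL. A minimal sketch:

```sql
-- Verify that a full scan of H_LINEITEM is served from the IM column store
EXPLAIN PLAN FOR
  SELECT COUNT(*) FROM H_LINEITEM;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
-- The plan should show "TABLE ACCESS INMEMORY FULL" for H_LINEITEM
```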

In conclusion


I do not think categorical conclusions can be drawn from any single test. Results can vary greatly depending on the data model and volume, the specifics of the queries, the DBMS parameter configuration, and the hardware. As an alternative example, here is a link to the once-sensational benchmark in which Oracle compared the performance of Oracle IM (on Exadata + Exalogic) with SAP HANA, using the SAP BW-EML benchmark. In that test, Oracle's hardware and software stack was at its best.

If you have experience using Oracle In-Memory, I will be glad to read about it in the comments.
