Incremental garbage collector in Ruby 2.2

Original author: Koichi Sasada
  • Transfer
This article discusses the incremental garbage collector (incremental GC) introduced in Ruby 2.2. We call this algorithm RincGC. RincGC allows for a shorter pause time (GC pause time) than Ruby 2.1.

About the Author: Koichi Sasada works at Heroku with Nobu and Matz on the ruby ​​core. He wrote YARV , Generational Garbage Collector (RgenGC) for Ruby 2.1, as well as incremental GC for ruby ​​2.2 and this article.

Premise


Ruby uses GC to automatically collect unused objects. Thanks to the garbage collector, ruby ​​programmers do not have to delete objects manually and do not have to worry about bugs with such deletion.
The first version of Ruby already used the mark and sweep (M&S) algorithm. M&S is one of the simplest GC algorithms, which consists of two stages:
1. Mark: walk through all living objects and mark them as “living object”
2. Sweep: dispose of all untagged objects, since they are no longer used

M&S based on the knowledge that all found objects from among living objects are living objects. The M&S algorithm is very simple and therefore works very well.

image
Mark & ​​sweep GC algorithm

This simple and efficient algorithm (and the conservative method of garbage collection) makes writing extensions in C quite easy. As a result, Ruby has many useful extensions. However, because of this algorithm, it is difficult to apply moving GC algorithms such as compaction and copying.

Writing C-extensions is not so important right now, because we can use FFI (foreign function interface). But in the beginning, having a large number of extensions and providing a lot of features through C-extensions was a big advantage, which made the Ruby interpreter more popular.

Despite the fact that the M&S algorithm is simple and works great, there are several problems. The most important potential problems are throughput and pause time. GC slows down your Ruby program due to garbage collection overhead. In other words, low GC performance increases the total execution time of your application. Every garbage collection pauses your application. Long pauses affect UI / UX web applications.

To solve the performance issue, Ruby 2.1 introduced a generational GC garbage collector. The Generational GC splits the heap space into several parts for several generations (in Ruby, we divide the heap space into: young and old space). Newly created objects are located in the "young space" and are accordingly marked as "young objects". After young objects survive several garbage collections (3 in Ruby 2.2), they go into the category of “old objects” and will be in the “old space”. We know that in object-oriented programming, most objects die young. Therefore, we need to start garbage collection only for the "young space". If there is not enough space in the young space to create objects, then the garbage collector for the "old space" starts. We call “Minor GC” when the garbage collector works in a young space, and “Major GC” for everyone: both for young and old spaces. We implemented the generational GC algorithm with some modifications and named our algorithm and garbage collection implementation “RGenGC”. You can find out more details by lookingmy presentation and slides on EuRuKo.

RGenGC significantly increases garbage collection performance due to the very fast Minor GC. It is worth noting that Major GC pauses the program for a long time and this time is equal to the length of the pause in Ruby 2.0 and earlier versions. Most garbage collections are performed by Minor GC.

image
Pauses in Major GC and Minor GC

To solve the problem with a long pause, an incremental garbage collector is best suited.

Incremental garbage collection


The incremental garbage collection algorithm divides the build process itself into several small processes and alternates between GC processes and Ruby processes. Instead of a long pause, the incremental garbage collector produces many short pauses. The total duration of the pauses will remain the same (or slightly longer due to the overhead when using the incremental garbage collector), but each individual pause will be shorter. This allows for more stable performance.

Ruby 1.9.3 introduced the “lazy sweep” GC, which reduces the duration of pauses in the sweep phase. The meaning of laze sweep is to start the sweep phase not immediately, but step by step. Lazy sweep reduces the duration of individual pauses and is half the incremental GC algorithm. Now we need to make the Major GC incremental work phase.

Let’s introduce three concepts to explain the process of incremental marking of objects: “white object” - unlabeled objects, “gray object” - labeled, but can refer to white objects, “black object” - tagged, but do not indicate any white object.

Using these three colors, we can explain the “mark and sweep” algorithm as follows:
1. All existing objects are marked as white
2. Explicit live objects, such as objects on the stack, are marked as gray.
3. Select any gray object and mark the object to which it refers also gray. Change the color of the original object to black. We repeat until there are no gray objects, but only black and white.
4. We collect white objects, since all living objects are painted black.

To make the whole process incremental, it is necessary to make step (3) incremental. The plan is this: select some gray objects and gray out the objects to which they refer, then switch to the execution of the Ruby code, then return to the marking process, etc.

image
Regular marking process (STW: stop the world) versus

incremental marking There is one problem with the incremental marking process of objects. Black objects can point to white during ruby ​​code execution. This is a problem, since a black object by definition cannot refer to white objects. To prevent this, we use a “write-barrier” to detect the creation of such a link of a black object to white.

For example, an array object 'ary' is already marked as black.
ary = []
# GC запускается и помечает чёрным

The object 'obj = Object.new' will be white if we execute this code.
ary << obj
# после создания obj GC не запускается

Now the black object refers to the white object. If there are no gray objects referencing 'obj', then 'obj' will be white at the end of the stage of marking objects and accordingly disposed of by mistake. Collecting all living objects is a gross mistake and must be avoided.

A write barrier is invoked each time an object begins to reference a new object. The recording barrier determines when a link was created from a black object to white. When this happens, the black object changes its color to gray (or gray with a pointer to a white object). Recording barriers completely solve the problem of collecting all living objects.

This is the main meaning of the incremental algorithm. As you can see, this is not so difficult. You may have a question: “why is Ruby still not using this simple GC algorithm?”

Incremental garbage collector in Ruby 2.2


When implementing the incremental tagging process in the Ruby interpreter (CRuby), a serious problem arises - the lack of write barriers.

The Generational GC, which was implemented in Ruby 2.1, also needed write barriers. To implement generational GC, we invented a new method - the “write barrier unprotected objects”. This means that we divided all objects into protected and unprotected. In this way, we can guarantee that all referenced protected objects are under control. We cannot control links from insecure objects. With the introduction of the concept of an unprotected object, you can implement generational GC in Ruby 2.1.

We can also properly implement incremental GC using unprotected objects:
1. Color all existing objects in white
2. Color all clearly living objects in gray, including objects on the stack
3. Take one gray object and paint in gray all the objects to which it refers. Change its color to black. Repeat until there are no gray objects. Only black and white. This phase is carried out in stages.
4. We collect white objects, because all living objects are black.

So we can guarantee that we do not have white living objects.

image
Rescanning from the write barrier of unprotected objects (WB unp.) At the end of the tagging process

Unfortunately, the 4th stage can create a long pause, which we hope to avoid. However, the total pause time is associated with the number of recording barriers of live unprotected objects. Most objects in Ruby are String, Array, Hash, or simple (pure) objects created by the programmer. They are protected objects. In practice, pauses for the barrier to recording unprotected objects do not create any problems in most cases.

We did an incremental tagging process only for major GC, as no one complains about pauses in minor GC. The maximum pause duration in our incremental GC is less than in minor GC. If you are satisfied with the break times in minor GC, then you do not need to worry about break times in major GC.

I also applied a trick to implement incremental GC in Ruby. We have a set of “black and insecure” objects. In order for the garbage collector to work quickly, we created a “insecure” bitmap, which is an unprotected object, and a separate “tagged” bitmap, which shows which objects were marked. Using a logical product, we can find “black and insecure” objects.

Incremental GC Pause Duration Estimation


In order to measure the duration of pauses in the process of garbage collection, we will use gc_tracer . Gc_tracer has a GC: Tracer module, which allows you to track parameters related to the garbage collection process. gc_tracer outputs each such parameter to a file.

Garbage collection includes the following events:
start
end_mark
end_sweep
newobj
freeobj
enter
exit

As I described above, GC in Ruby has two phases: “mark” and “sweep”. The “start” event means the start of the mark phase, and “end_mark” means its completion. The end_mark event also marks the start of the sweep phase. Obviously, "end_sweep" indicates the end of the sweep phase and also means the completion of the garbage collection process.

“Newobj” and “freeobj” are events when allocating memory for an object and freeing it.

We use “enter” and “exit” events to measure the duration of pauses. Incremental GC (incremental marking and lazy sweeping) uses the suspension of the mark phase and sweep phase. "Enter" means "entering GC related event". Finally, “exit” means “exitting GC related event”

The following figure shows the distribution of events over time in the current incremental GC:

image

We can measure the current time (in Linux, the current time is the result of calling gettimeofday ()) for each event. Thus, we can measure the duration of pauses in the GC using the events “enter” and “exit”.

I am using ko1-test-appto compare performance. ko1-test-app is a simple Rails application written for me by Aaron Patterson.

To use gc_tracer jam, I added the rake rule "test_gc_tracer":

diff --git a/perf.rake b/perf.rake
index f336e33..7f4f1bd 100644
--- a/perf.rake
+++ b/perf.rake
@@ -54,7 +54,7 @@ def do_test_task app
   body.close
 end
-task :test do
+def test_run
   app = Ko1TestApp::Application.instance
   app.app
@@ -67,6 +67,22 @@ task :test do
   }
 end
+task :test do
+  test_run
+end
+
+task :test_gc_tracer do
+  require 'gc_tracer'
+  require 'pp'
+  pp GC.stat
+  file = "log.#{Process.pid}"
+  GC::Tracer.start_logging(file, events: %i(enter exit), gc_stat: false) do
+ test_run
+  end
+  pp GC.stat
+  puts "GC tracer log: #{file}"
+end
+
 task :once do
   app = Ko1TestApp::Application.instance
   app.app

And ran bundle exec rake test_gc_tracer KO1TEST_CNT = 30000 . A value of 30000 means that we will simulate 30,000 requests. The results will be written to the log.xxxx file, where xxxx is the application process id. The file should look something like this:

type  tick  major_by      gc_by   have_finalizer  immediate_sweep state
enter   1419489706840147      0     newobj  0     0     sweeping
exit  1419489706840157      0     newobj  0     0     sweeping
enter   1419489706840184      0     newobj  0     0     sweeping
exit  1419489706840195      0     newobj  0     0     sweeping
enter   1419489706840306      0     newobj  0     0     sweeping
exit  1419489706840313      0     newobj  0     0     sweeping
enter   1419489706840612      0     newobj  0     0     sweeping
...

My file has 1,142,907 lines.

The “type” column contains the name of the event, “tick” - the current time (result of gettimeofday (), the number of microseconds from the beginning of the era). We can see the length of the pause using this information. Using the first two lines above, we can measure the length of the pause: 10 μs (1419489706840157 - 1419489706840147).

The following small script shows the duration of each pause:

enter_tick = 0
open(ARGV.shift){|f|
  f.each_line{|line|
  e, tick, * = line.split(/\s/)
  case e
  when 'enter'
    enter_tick = tick.to_i
  when 'exit'
    st = tick.to_i - enter_tick
    puts st if st > 100 # over 100 μs
  else
    # puts line
  end
  }
}

There will be many lines in the log file, since this script prints the duration of pauses every 100μs.

The following figure shows the result of the measurements:

image

We see that the generational GC has 7 huge pauses. This should be pauses caused by the launch of major GC. The maximum pause time is approximately 15ms (15Kμs). However, incremental GC reduces the maximum pause time to 2ms (2Kμs). Excellent.

Conclusion


Ruby 2.2 uses an incremental garbage collection algorithm to reduce pause time for garbage collection.

Keep in mind that incremental GC is not a "silver bullet." As I already described, incremental GC does not affect performance. This will not affect the response time if the request is too long and calls the major GC several times. Total garbage collection time will not be reduced due to incremental GC.

Also popular now: