To Sleep, Perchance to Dream: Rails Ruby Bench and Sleepy GC

Hey, folks! It's been a few weeks since my last post about Rails Ruby Bench, so let's talk about some things you don't see it do, but it does behind the scenes! We'll also talk about an interesting new performance change that may be coming to Ruby 2.6.

That change is Eric Wong's Sleepy GC bug report and patch. With SleepyGC, Ruby will garbage collect with spare (idle) cycles. If you're just here for the latest Ruby development news, skip this post and click the bug report. Original sources are always more compete than commentary, right?

(Want to skip the narrative and go straight to the upshot? Skip to the bottom -- look for "So Did It Work?")

A Little Story

A few days ago, the excellent Sam Saffron of the Discourse team asked for my opinion on a pending Ruby speed patch. Yay! Rails Ruby Bench exists for this exact purpose: when a new Ruby patch comes out, I check how much it speeds up Rails (or slows it down.) And then you all know!

Of course, just lately I've been working on scaling out the benchmark itself, as my Ruby coordinator post suggests - right now, even if Discourse scales up just fine, my benchmark tops out at an EC2 m4.2xlarge instance. Above that I'm not configuring enough connections to Postgres, so I can't run enough threads and processes to use all that capacity. Working on it!

As a result, I haven't been constantly running RRB on the latest head of Ruby, because I've been working on other stuff. Which means my results are a bit out of date. Ruby also keeps getting faster, and the speedups keep getting smaller. This should make sense -- you've seen my "look, faster Ruby!" numbers, and it keeps getting harder to get large speedups after the last few hundred large speedups. Which means I need to crank up the number of requests per run and the number of runs per batch to keep pace. The Ruby core team are good at what they do! At the moment my "quick, rough check" numbers are 10,000 HTTP requests per run and 30 runs per batch (that's 200k HTTP requests) for reference, and that doesn't catch really small differences! And that still gets occasional outlier runs, so that's definitely not enough to check, "hey, does this sometimes cause random slowdowns?"

When I checked, things were a little broken. There was a slight speed regression for late March and early April Rubies, and a slightly older version (around May 1st) wouldn't run Rails Ruby Bench at all - the requests just didn't return.

So here's a "thanks for reading" takeaway for you -- don't run your production infrastructure on untested, non-release Rubies from random dates in the repository. ;-)

But here's another: the speed regression didn't last. Even when they're mostly testing on non-Rails code, mostly the Ruby core team do a great job keeping everything in a good shape - small problems tend to be caught and found rapidly, even in the long gaps between releases and previews.

Eventually I found a working Ruby, got a nice stable Rails Ruby Bench performance baseline just before the Sleepy GC patch, and ran a big batch of tests on Ruby 2.6.0 preview 1, the Ruby right before Sleepy GC, and Sleepy GC version 3 from Eric Wong's repository.

Wait, What's Sleepy GC?

Normally Garbage Collection (GC) runs when you've allocated a lot of memory, or when your process is running low and needs more. In other words, normally you reclaim old unused memory when you need memory -- and not before. You can manually run garbage collection before that in most languages (including Ruby) but that's not especially common.

It can be hard -- or impossible -- to avoid random pauses in your program if you use garbage collection. That's one reason that GC tuning is such a big deal in the JVM, for instance. Random pauses aren't necessarily a problem for every workload, but ask a game programmer about GC some time and you'll see what's wrong with them!

Ruby normally has "idle" times, such as when it's waiting for a file to be read, or network packets to arrive, or a database query. There can also be idleness from explicit sleeps or delays if the Ruby process is trying not to use more CPU than necessary. In all of these cases, it may make sense for the garbage collector to do some of its work in the idle time rather than making your program wait when you need memory.

Of course, if your Ruby process has lots of threads then you may already be filling this idle time with other work.

So Did It Work?

The short version is: the current Sleepy GC doesn't do anything for Rails Ruby Bench. If you think for a second, this should make sense - RRB runs a giant concurrent workload flat-out from startup until shutdown, overloaded with threads so that every CPU is running Ruby code constantly. There are no unfilled idle cycles. So Sleepy GC neither speeds up nor slows down RRB detectably -- which is a win, if it speeds up other workloads. Sam Saffron suggests it may do well for Unicorn servers, for instance. That makes sense - Unicorn runs one thread per process, so it may have lots more idle time than a heavily-multithreaded Puma workload like RRB. Sleepy GC may be useful, but RRB is a terrible way to find out one way or the other. That's fine. No benchmark shows you everything you care about, and it's important to know which is which.

While from my viewpoint, it was a great success! I have determined to my own satisfaction that there aren't lots of idle cycles for GC that I'm not capturing, so RRB did what it should have!

If you have a workload that you think may benefit from Sleepy GC, you can also try it out yourself. Sam Saffron says it helps certain Postgres workloads quite a lot, for instance. As of this writing, the latest branch is "git://80x24.org/ruby.git" on branch "sleepy-gc-v3". But read the bug report for the latest, always.