Generational Garbage Collection in Firefox (hacks.mozilla.org)
106 points by rnyman on Sept 25, 2014 | 16 comments



Firefox: Sunspider 182.5ms +/- 4.6%, Octane 21148
Chrome: Sunspider 206.7ms +/- 1.2%, Octane 24921

I adore Mozilla and use Firefox as my primary browser. As a web developer I hope those benchmarks get tighter.


Sunspider is an ancient, awful benchmark suite that has been optimized to the point of insanity. For example, every major browser has a daylight saving time offset cache that is of no use to any real code but speeds up Sunspider significantly. It needs to die. You should ignore it.
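To make the caching claim concrete, here is a rough sketch of the kind of daylight saving offset cache being described (the class name, policy, and one-hour window are all invented for illustration; it leans on the POSIX tm_gmtoff field, so this is a sketch, not any engine's actual code):

    #include <cstdint>
    #include <ctime>

    class DSTOffsetCache {
        int64_t rangeStartMs = 1, rangeEndMs = 0;   // empty range to start
        int64_t cachedOffsetMs = 0;

        static int64_t computeOffsetMs(int64_t utcMs) {
            // Slow path: ask the OS for the local-time offset.
            time_t t = static_cast<time_t>(utcMs / 1000);
            struct tm local {};
            localtime_r(&t, &local);
            return static_cast<int64_t>(local.tm_gmtoff) * 1000;
        }

    public:
        int64_t offsetMs(int64_t utcMs) {
            if (utcMs >= rangeStartMs && utcMs <= rangeEndMs)
                return cachedOffsetMs;              // fast path: no OS call
            cachedOffsetMs = computeOffsetMs(utcMs);
            // Assume the offset holds for an hour around the query; real
            // caches grow and validate the range far more carefully.
            rangeStartMs = utcMs - 3600 * 1000;
            rangeEndMs = utcMs + 3600 * 1000;
            return cachedOffsetMs;
        }
    };

The point is that a benchmark which hammers Date hits the fast path almost every time, while real code rarely benefits.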

Octane is better, though still flawed: https://blog.mozilla.org/nnethercote/2012/08/24/octane-minus....

The state of browser benchmarking isn't good. Here are my ideas on how to fix it: https://blog.mozilla.org/nnethercote/2014/06/16/a-browser-be.... The political/organizational challenges are as big as or bigger than the technical challenges.

And really, you should care most about how each browser performs on the workloads you are interested in.


> The state of browser benchmarking isn't good.

You can delete the word "browser" from that sentence and it's still true. Just one example: You wouldn't believe how many large projects rely on Dhrystone for benchmarking and selecting silicon.


I can believe it. But at least in the C/C++/Fortran world there are decent suites, e.g. SPEC, even if not everybody uses them. The browser world doesn't even have that.


I think the brouhaha over Mandreel vs Emscripten glosses over a deeper issue, which is that the vast majority of web apps aren't C++ compiled to asm.js, and performance on asm.js is not going to be a good proxy for general web performance.

"Real apps" using idiomatic JS should be used as benchmarks, and those apps should approximate apps in widespread deployment. For example, how fast can a reactjs re-render for a complex page run?


Use http://arewefastyet.com/ for the most recent benchmarks. The results on there are rather different from what you quoted.


Should GGC have a major impact on overall execution time? Or is it more about reducing "jank", or whatever they're calling it these days?


It has some impact on overall execution time (throughput), mostly through being able to bump allocate. The benchmark I used in the article shows a large gain, though the real-world effect is much less.
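For anyone unfamiliar with the term, bump allocation means the nursery hands out memory by advancing a pointer rather than searching free lists the way a general-purpose allocator does. A minimal sketch, with illustrative names rather than SpiderMonkey's actual code:

    #include <cstddef>
    #include <cstdint>

    class Nursery {
        uint8_t* base;
        uint8_t* cur;
        uint8_t* end;

    public:
        Nursery(uint8_t* start, size_t size)
            : base(start), cur(start), end(start + size) {}

        // Allocation is a bounds check plus a pointer bump; that's the
        // throughput win over free-list allocation.
        void* allocate(size_t bytes) {
            if (static_cast<size_t>(end - cur) < bytes)
                return nullptr;   // nursery full: time for a minor GC
            void* result = cur;
            cur += bytes;
            return result;
        }

        // After a minor GC evacuates the survivors, the whole nursery is
        // free again: just reset the bump pointer.
        void reset() { cur = base; }
    };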

But GGC is mostly about reducing pause times (latency). (And yes, people have taken to calling that "jank". I still resist the term when I can, since it's no more precise than anything that came before but people seem to think it must mean something specific.)


Somewhat off-topic, maybe, but is anyone else here experiencing massive slowdowns in Chrome and Chromium for Linux?

It's unusable to me at this point: The whole thing pegs one core of my dual-core system just switching to a new tab, even though plenty of RAM is available. It pegs my CPUs and my disk on startup, but that's been a problem for quite a while now.

In both cases, I'm using the latest in the Ubuntu repos.


Googlers explained how this works in V8 at I/O. Click on "View the presentation" and see slide 23 onwards.

https://developers.google.com/events/io/sessions/325547004


The main difference from what is shown in those slides is that V8 uses a semispace collector. The SpiderMonkey collector just has a single nursery.

Jon Coppeard implemented a semispace collector for SpiderMonkey, but the added complexity made it a net loss in performance. So we scrapped it for now. It means we get a few objects unfairly tenured, but our measurements showed the actual number was pretty low and not worth the overhead.

It's totally workload-dependent, and further GGC tuning (there's a lot still to go!) may reverse that balance.
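A compilable toy contrasting the two designs (all names invented, liveness stubbed out as a predicate):

    #include <vector>

    struct Obj { int value; int survivals = 0; };
    using Space = std::vector<Obj>;

    // Single nursery (SpiderMonkey today): every survivor of a minor GC
    // is promoted straight to the tenured heap.
    void minorGcSingleNursery(Space& nursery, Space& tenured,
                              bool (*isLive)(const Obj&)) {
        for (Obj& o : nursery)
            if (isLive(o))
                tenured.push_back(o);   // even short-lived survivors
        nursery.clear();
    }

    // Semispace (V8-style): survivors copy to the other half and are only
    // tenured after surviving enough minor GCs.
    constexpr int kTenureThreshold = 2;   // made-up value

    void minorGcSemispace(Space& from, Space& to, Space& tenured,
                          bool (*isLive)(const Obj&)) {
        for (Obj& o : from) {
            if (!isLive(o))
                continue;
            if (++o.survivals >= kTenureThreshold)
                tenured.push_back(o);
            else
                to.push_back(o);        // stays young
        }
        from.clear();                   // halves swap roles afterwards
    }

The trade-off in the sketch mirrors the one described above: the semispace version tenures fewer short-lived objects unfairly, but pays for the extra copying on every minor GC.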


Is there any work on using online methods to optimize tenure threshold for a given workload? (E.g., the system notices that similar objects are repeatedly tenured and then die shortly after, and adjusts the threshold to keep them in the nursery.)
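For concreteness, one hedged sketch of what such an online policy could look like (the per-site keying, 50% trigger, and sample size are all invented, not anything SpiderMonkey does):

    #include <cstdint>
    #include <unordered_map>

    struct SiteStats {
        uint32_t promoted = 0;        // objects tenured from this site
        uint32_t diedEarly = 0;       // ...that died soon after promotion
        uint32_t tenureThreshold = 1; // minor GCs to survive before tenuring
    };

    class TenureTuner {
        std::unordered_map<uint64_t, SiteStats> sites;  // keyed by alloc site

    public:
        void recordPromotion(uint64_t site) { sites[site].promoted++; }
        void recordEarlyDeath(uint64_t site) { sites[site].diedEarly++; }

        // Run after each major GC: if most of a site's promoted objects
        // died quickly, keep that site's objects in the nursery longer.
        void adjust() {
            for (auto& entry : sites) {
                SiteStats& s = entry.second;
                if (s.promoted >= 100 && s.diedEarly * 2 > s.promoted)
                    s.tenureThreshold++;          // >50% died early
                s.promoted = s.diedEarly = 0;     // start a fresh window
            }
        }

        uint32_t thresholdFor(uint64_t site) { return sites[site].tenureThreshold; }
    };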


It must be depressing for them to be in catch-up mode for so long.


It doesn't help that splay-latency is literally nonsense (improving what it measures would do nothing to make the browser more responsive). If it weren't for that benchmark, they would be ahead right now.


I agree. If you accept the reasonable premise that incremental GC trades worse throughput for better latency, then splay-latency rewards low throughput.

That isn't as awful as it sounds, it's just that there's nothing in the benchmark that tells the JS engine that penalizing throughput is the right thing to do. It needs some kind of marker that we can agree means "even though it's the wrong thing to do given just the code that you're seeing, pretend like this is running in an environment where you should prioritize throughput below latency between these semi-arbitrary points."

We are discussing perhaps treating Date.now or window.performance.now as meaning that, because if you're measuring the jitter between things, you'll be grabbing the current time at exactly those points where you're mimicking ending one turn and starting the next. But that's still not really correct, because you're also asserting that there would be zero idle time in between turns, which is generally not true in a real application.

It's a mess.


Then they must have the secret to managing depression, because they've been catching up for so long it's a miracle.



