
The golang object pool is a bit of a problem (compared to the JVM alternatives) due to the lack of generics. You tend to need object pooling when you have tight performance requirements, which is at odds with the type assertions you have to do with sync.Pool.

So the golang pool is good for the case where you have GC-heavy but not latency-sensitive operations, but not for the more general performance-sensitive problems.



Have you benched "myPool.Get()" vs "myPool.Get().(*myStruct)" ? I don't think the "type manipulation" is the problem you think it is.
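
For anyone who wants to try this, here is a rough sketch of such a micro-benchmark. The `myStruct` type, its 64-byte payload, and the sink variables are placeholders I made up for illustration; `testing.Benchmark` is only used so the comparison runs from a plain `main` instead of `go test`. Results will depend heavily on CPU, Go version, and how much else the loop does, so treat the numbers as a starting point rather than a verdict.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// myStruct is a hypothetical pooled type (the payload size is arbitrary).
type myStruct struct {
	buf [64]byte
}

var myPool = sync.Pool{
	New: func() interface{} { return new(myStruct) },
}

// Package-level sinks keep the compiler from discarding the results.
var (
	sinkAny   interface{}
	sinkTyped *myStruct
)

// benchGet measures Get/Put with no type assertion.
func benchGet(b *testing.B) {
	for i := 0; i < b.N; i++ {
		v := myPool.Get()
		sinkAny = v
		myPool.Put(v)
	}
}

// benchGetAssert measures Get/Put with the *myStruct assertion added.
func benchGetAssert(b *testing.B) {
	for i := 0; i < b.N; i++ {
		s := myPool.Get().(*myStruct)
		sinkTyped = s
		myPool.Put(s)
	}
}

func main() {
	fmt.Println("Get():            ", testing.Benchmark(benchGet))
	fmt.Println("Get().(*myStruct):", testing.Benchmark(benchGetAssert))
}
```

The difference between the two lines is the cost attributable to the assertion in this particular loop, which may or may not resemble a real hot path.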


At the x86 level, myPool.Get(Index) is going to be at least as expensive as cmp/jae/mov (3 cycles), and myPool.Get().(myStruct) is going to be at least as expensive as cmp/jae/cmp/jne/mov (5 cycles). So unless you have some way of hiding the latency, the type check is 67% slower by cycle count.

The experience of every JIT developer is that dynamic type checks do matter a lot in hot paths.


Not disagreeing, but I think that was a bit inaccurate.

If that branch is mispredicted, we're talking about 12-20 cycles. OK, I assume it's a range check and thus (nearly) always not taken. So if it's in a hot path, it'll always be correctly predicted. Modern CPUs will most likely fuse cmp+jae into one micro-op, so predicted-not-taken + mov will take 2 cycles (+latency).

"cmp/jae/cmp/jne/mov" will of course be fused into 3 micro-ops. But don't you mean "cmp/jae/cmp/je/mov"? I'm assuming the second compare is a NULL check (or at least that the instructions are ordered so that the second branch is practically never taken). I think that also takes 2 cycles (both branches execute on the same clock cycle + mov), but I'm not sure how fused predicted-not-taken branches behave.

L3 miss for that mov, well... might well be 200 cycles.


Ah yeah, I wasn't sure if fusion was going to happen. You're probably right in macro-op terms; sorry about that.

The first compare is a bounds check against the array backing the pool, and the second compare is against the type field on the interface, not a null check. Golang interfaces are "fat pointers" with two words: a data pointer and a vtable pointer. So the first cmp is against a register, while the second cmp is against memory, data dependent on the register index. The address of the cmp has to be at least checked to determine if it faults, so I would think at least some part of it would have to be serialized after the first branch, making it slower than the version without the type guard.
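
To make the "two words" point concrete, here is a small sketch. The names `myStruct`, `otherStruct`, `ifaceWords`, and `asMyStruct` are hypothetical, and the size check assumes the standard gc toolchain's interface layout. Note that for an empty `interface{}` the non-data word points at a type descriptor rather than a method table, but the guard is still a compare against that word.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Two hypothetical types with identical layout but distinct identities.
type myStruct struct{ id int }
type otherStruct struct{ id int }

// ifaceWords reports how many machine words an interface value occupies.
func ifaceWords() uintptr {
	var v interface{}
	return unsafe.Sizeof(v) / unsafe.Sizeof(uintptr(0))
}

// asMyStruct is the guarded assertion: the type word is compared against
// *myStruct, and only on a match is the data pointer handed back.
func asMyStruct(v interface{}) (*myStruct, bool) {
	s, ok := v.(*myStruct)
	return s, ok
}

func main() {
	fmt.Println("words per interface value:", ifaceWords())

	_, ok := asMyStruct(&myStruct{id: 1})
	fmt.Println("assert *myStruct on *myStruct:", ok)

	_, ok = asMyStruct(&otherStruct{id: 1})
	fmt.Println("assert *myStruct on *otherStruct:", ok)
}
```

The comma-ok form is used here only so a mismatch can be observed without a panic; the single-value form `v.(*myStruct)` performs the same comparison.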


> Ah yeah, I wasn't sure if fusion was going to happen.

Well, I didn't profile that case. Who knows what will really happen. Modern x86 processors are hard to understand.

> ... while the second cmp is against memory, data dependent on the register index

Hmm... that sounds like something that would dominate the cost? Memory access and data dependency. Ouch.

Also of course in that case, second compare+branch can't be fused, because cmp has a memory operand.


This is a small fixed cost as compared to the cost of not using the pool. I wasn't suggesting the operation is free.


But not using a pool is not the alternative we are talking about. Rather, it's hand-rolling your own every time, something the JVM doesn't require.


> Rather, it's hand-rolling your own every time.

So your hand-rolled one will not have the typing overhead we are discussing, but it will have two much worse issues.

  1. sync.Pools have thread-local storage, something your own pools will not have.
  2. sync.Pools are GC-aware, meaning that if the allocator is having trouble it can drain "free" pool objects to gain memory. Your custom pool will not have this integration.

I have a feeling that the performance you gain by not type-checking you will lose by not having #1.


I think you are missing my point, so I'll restate it. sync.Pool does not help with GC issues compared to the JVM, because the JVM also has object pools. Further, those object pools are actually better for the low-latency case, because the language does not force them to choose between dynamic type checks and use-specific abstractions.

[edit] As pcwalton points out, my whole argument is actually null and void due to type erasure... doh.


To be fair, though, aren't generics on the JVM type-erased? So you're going to have a type assertion at the JIT level either way.



