Are you sure that's true? Perhaps they have reserved the ability to do that in the future (can you point to docs on that?), but right now I'm fairly certain that Go does no such thing. The only copies necessary when calling C code are for strings, since Go strings aren't guaranteed or required to be null-terminated byte arrays. And cgo has C ABI compatibility, so function call overhead is nonexistent.
Function call overhead in cgo is considerable. You're right that it's not from copying, but the runtime scheduler still has to coordinate the blocking call, and the stack needs to be switched out to the C stack, kind of like a context switch.
True, I just benchmarked it and found the function call overhead to be about 1.85834729e-7 seconds (roughly 186 ns). That isn't much, but the pure C version would obviously take single-digit nanoseconds for the handful of instructions needed, depending on the function being called.