Nothing is magic but in this case moving from a language with a GIL to a language built for (single machine) concurrency probably scales a lot.
The OP says as much in the comments:
"For us, one of the biggest latency wins comes from the fact that go can truly execute sql statements in parallel (whereas python's GIL serialized these parallelizable operations). In general, single-threaded go is at least 5x faster than pure python (without c-module)."
The OP says as much in the comments: "For us, one of the biggest latency wins comes from the fact that go can truly execute sql statements in parallel (whereas python's GIL serialized these parallelizable operations). In general, single-threaded go is at least 5x faster than pure python (without c-module)."