Well, the underlying thing I'm doing is small request/reply messages - metadata lookups for IP addresses. The way I sped things up with zeromq was first by batching requests. Essentially, if I have 10k lookups to do, instead of sending one at a time, I group them into blocks of 100 and send
' '.join(block)
Then I do all the lookups on the server and send a block of responses back. This turns what would be 10k queries into only 100 RPC calls.
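The batching side is roughly this (a minimal sketch, assuming a REQ/REP socket pair, a made-up endpoint, and newline-delimited replies - not the exact code):

    import zmq

    BATCH_SIZE = 100

    def batched_lookups(addresses, endpoint="tcp://lookup-server:5555"):
        # One REQ/REP round trip per block of 100 addresses instead of per address.
        ctx = zmq.Context.instance()
        sock = ctx.socket(zmq.REQ)
        sock.connect(endpoint)
        results = []
        for i in range(0, len(addresses), BATCH_SIZE):
            block = addresses[i:i + BATCH_SIZE]
            sock.send_string(' '.join(block))      # one RPC carries 100 lookups
            reply = sock.recv_string()             # server sends the whole block of answers back
            results.extend(reply.split('\n'))      # assumes newline-delimited responses
        return results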
That got me to about 60k lookups a second locally, but over a WAN link that dropped down to 10k. I fixed that by implementing pipelining, using a method similar to the one described under http://zguide.zeromq.org/page%3Aall#Transferring-Files where I keep the socket buffers busy by having 10 chunks in flight at all times.
That got things to 160k/s locally and 100k+/sec even over a slow link.
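The pipelining is along these lines (a sketch only - it assumes a DEALER socket on the client and a ROUTER or DEALER on the server so several blocks can be outstanding, and it ignores reply ordering for brevity):

    import zmq

    BATCH_SIZE = 100
    PIPELINE = 10  # blocks kept in flight at once

    def pipelined_lookups(addresses, endpoint="tcp://lookup-server:5555"):
        ctx = zmq.Context.instance()
        sock = ctx.socket(zmq.DEALER)   # DEALER allows multiple outstanding requests, unlike REQ
        sock.connect(endpoint)

        blocks = [addresses[i:i + BATCH_SIZE] for i in range(0, len(addresses), BATCH_SIZE)]
        results, sent, done = [], 0, 0
        while done < len(blocks):
            # Top the pipeline back up before blocking on a reply.
            while sent < len(blocks) and sent - done < PIPELINE:
                sock.send_string(' '.join(blocks[sent]))
                sent += 1
            results.extend(sock.recv_string().split('\n'))  # assumes newline-delimited responses
            done += 1
            # A real version would tag each block so replies can be matched
            # if the server answers out of order.
        return results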
I'll have to mess with grpc a bit more. Looking at my grpc branch, it looks like I tried the request_iterator method first, then a regular function that used batching, but I never tried request_iterator with batching. I think the biggest difference will be whether request_iterator actually pipelines, or whether it still only does one req/reply behind the scenes.
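If I get back to it, the request_iterator-with-batching version would look something like this (purely a sketch - lookup_pb2/lookup_pb2_grpc, LookupBatch, LookupStub and LookupMany are stand-in names for whatever the .proto actually generates):

    import grpc
    # Hypothetical generated modules / names; substitute whatever the real .proto produces.
    import lookup_pb2
    import lookup_pb2_grpc

    BATCH_SIZE = 100

    def batched_request_iterator(addresses):
        # Yield one streamed message per block of 100 addresses, not one per address.
        for i in range(0, len(addresses), BATCH_SIZE):
            yield lookup_pb2.LookupBatch(addresses=addresses[i:i + BATCH_SIZE])

    def grpc_lookups(addresses, target="lookup-server:50051"):
        with grpc.insecure_channel(target) as channel:
            stub = lookup_pb2_grpc.LookupStub(channel)
            results = []
            # A bidirectional streaming RPC lets HTTP/2 keep several batches in flight on one stream.
            for reply in stub.LookupMany(batched_request_iterator(addresses)):
                results.extend(reply.results)
            return results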
I'm sure one thing that doesn't help is that building the request messages one protobuf object at a time ends up as a lot more overhead than doing ' '.join(batch)
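Roughly the contrast, with LookupRequest as a hypothetical per-address message type:

    import lookup_pb2   # hypothetical generated module, as above

    batch = ['10.0.0.1', '10.0.0.2']   # ... up to 100 addresses

    # zeromq path: one join, one frame on the wire for the whole block
    payload = ' '.join(batch)

    # grpc path: one protobuf object per lookup, each paying for Python object
    # construction, field setting and serialization before anything hits the wire
    requests = [lookup_pb2.LookupRequest(address=addr) for addr in batch]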