I don't think it's useful to have a thread or process pool all accepting on the same passive (listening) socket. It's perfectly fine to have one thread do the accepting and hand each connected socket off to a pool for processing. If your bottleneck is in the accept loop, that proves you're not writing a real service application, but a benchmark contrived to produce such a bottleneck.
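For what it's worth, here is a rough sketch of that layout in Python: one thread owns the accept loop and hands each connected socket to a worker pool. The names `handle`, `accept_loop` and `start_server`, the pool size of 8, and the upper-casing "work" are all made up for illustration; this is one way to arrange it, not a reference implementation.

```python
import socket
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(conn: socket.socket) -> None:
    # Hypothetical per-connection work: read one message, reply in upper case.
    with conn:
        data = conn.recv(1024)
        if data:
            conn.sendall(data.upper())


def accept_loop(listener: socket.socket, pool: ThreadPoolExecutor,
                stop: threading.Event) -> None:
    # The only thread that ever calls accept() on the passive socket.
    # The short timeout exists only so the loop can notice the stop flag.
    listener.settimeout(0.2)
    while not stop.is_set():
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            continue  # nothing pending; re-check the stop flag
        except OSError:
            break  # listener was closed
        pool.submit(handle, conn)  # hand the connected socket to the pool


def start_server(host: str = "127.0.0.1", port: int = 0, workers: int = 8):
    # port=0 asks the OS for a free port; workers=8 is an arbitrary choice.
    listener = socket.create_server((host, port))
    pool = ThreadPoolExecutor(max_workers=workers)
    stop = threading.Event()
    thread = threading.Thread(target=accept_loop,
                              args=(listener, pool, stop), daemon=True)
    thread.start()
    return listener, pool, stop, thread
```

The accept loop does almost nothing per connection, so a single thread keeps up unless the handlers themselves are trivial.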
Came here to say this. And if you do have one of those rare applications where the time spent in accept rivals the time spent doing useful work, then have one thread accepting per CPU. Not 10k.
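A minimal sketch of the per-CPU variant, again in Python and again with made-up names (`serve_one`, `acceptor`, `start_acceptors`). It assumes the work per connection is small enough to do inline in the accepting thread, which is my assumption and not something stated above. Note that in CPython the GIL limits how much the extra threads help; the point here is only that the acceptor count follows the CPU count.

```python
import os
import socket
import threading


def serve_one(conn: socket.socket) -> None:
    # Hypothetical tiny unit of work: echo one message back.
    with conn:
        data = conn.recv(1024)
        if data:
            conn.sendall(data)


def acceptor(listener: socket.socket, stop: threading.Event) -> None:
    # Each acceptor thread waits in accept() on the shared listening socket.
    while not stop.is_set():
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            continue  # nothing pending; re-check the stop flag
        except OSError:
            break  # listener was closed
        serve_one(conn)  # assumed cheap, so handled inline


def start_acceptors(host: str = "127.0.0.1", port: int = 0):
    listener = socket.create_server((host, port))
    listener.settimeout(0.2)  # lets the threads notice the stop flag
    stop = threading.Event()
    count = os.cpu_count() or 1  # one acceptor per CPU, not thousands
    threads = [
        threading.Thread(target=acceptor, args=(listener, stop), daemon=True)
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    return listener, stop, threads
```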