I don't think there is back and forth anymore. Actual high-performance research (e.g. where the cost of a single mutex exceeds the entire CPU budget for processing something, like, say, a packet) has been devoid of threads since they went mainstream, so for almost two decades already. Threads are still used, because that's what the hardware and the OS give you to run something on each core, but not for concurrency or performance.
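To make that concrete, here's a minimal Linux-only sketch (my own illustration, not anyone's actual code) of that pattern: one thread pinned to each core, each core owning its own data, no mutexes anywhere. poll_own_queue_and_process() is a made-up placeholder for the per-core packet loop.

    #define _GNU_SOURCE          /* for pthread_setaffinity_np / CPU_SET */
    #include <pthread.h>
    #include <sched.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void *worker(void *arg) {
        long core = (long)arg;

        /* Pin this thread to its core so the scheduler never migrates it. */
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

        /* Run-to-completion loop over this core's own queue (e.g. its RSS
         * queue of packets): no locks, nothing shared with other cores. */
        for (;;) {
            /* poll_own_queue_and_process(core);  -- hypothetical */
            pause();  /* placeholder so the sketch builds and just idles */
        }
        return NULL;
    }

    int main(void) {
        long ncores = sysconf(_SC_NPROCESSORS_ONLN);
        pthread_t *tids = calloc(ncores, sizeof(*tids));

        /* One thread per core: the thread is just the OS handle to the core,
         * not a unit of concurrency in the design. */
        for (long c = 0; c < ncores; c++)
            pthread_create(&tids[c], NULL, worker, (void *)c);
        for (long c = 0; c < ncores; c++)
            pthread_join(tids[c], NULL);

        free(tids);
        return 0;
    }

The thread here is just the handle the OS gives you to a core; everything that actually matters for performance happens inside one core's run-to-completion loop.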
When your per-request time is so short, using events is easy: you don't have to worry about tying up a poller thread. (And yes, even event-based servers use threads if they want internal concurrency.) But that's a specialized domain. If requests take a substantial amount of processing or disk I/O time, the naive approaches don't work. You can still use events, in a slightly more sophisticated way, if literally everything below you does async reasonably well, but the advantage over threads is much smaller and sometimes the situation even reverses. I work on storage servers handling multi-megabyte requests, for example, and in that milieu there is still very much a back and forth.
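For illustration, a rough sketch (not our production design, and deliberately simplified) of that hybrid: an epoll readiness loop that stays responsive while a small worker thread absorbs the blocking, multi-megabyte work. heavy_disk_work() is a made-up placeholder, and the queue has no overflow handling.

    #include <pthread.h>
    #include <stdlib.h>
    #include <sys/epoll.h>
    #include <unistd.h>

    #define QCAP 128

    static int queue[QCAP];                 /* fds needing heavy, blocking work */
    static int qhead, qtail;                /* no overflow check, sketch only   */
    static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  qcond = PTHREAD_COND_INITIALIZER;

    static void enqueue(int fd) {
        pthread_mutex_lock(&qlock);
        queue[qtail % QCAP] = fd;
        qtail++;
        pthread_cond_signal(&qcond);
        pthread_mutex_unlock(&qlock);
    }

    static void *blocking_worker(void *arg) {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&qlock);
            while (qhead == qtail)
                pthread_cond_wait(&qcond, &qlock);
            int fd = queue[qhead % QCAP];
            qhead++;
            pthread_mutex_unlock(&qlock);

            /* heavy_disk_work(fd);  -- hypothetical: read/write megabytes and
             * send the response; blocking here is fine, because the loop in
             * main() keeps polling every other connection in the meantime. */
            close(fd);
        }
        return NULL;
    }

    int main(void) {
        int epfd = epoll_create1(0);
        /* ...accept connections and register them with
         * epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) as they arrive... */

        pthread_t w;                        /* the "internal concurrency" */
        pthread_create(&w, NULL, blocking_worker, NULL);

        struct epoll_event events[64];
        for (;;) {
            int n = epoll_wait(epfd, events, 64, -1);
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                /* Small requests could be served inline right here; anything
                 * that would block for real time gets handed off instead. */
                enqueue(fd);
            }
        }
    }

The point isn't the details; it's that once any stage below you blocks for meaningful time, the pure event loop grows a thread pool anyway, and the threads-vs-events question stops having a clean answer.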
Sure, if you are devoting the whole computer to a single microbenchmark, threads are a terrible idea. That is not necessarily the case when you have many heterogeneous applications running on the same machine, though.
Sure, but they still use the same kernel scheduling that threads do, and careful optimizations relying on core count = thread count are going to be basically worthless as well.
I mean, the domain you're in changes your requirements drastically. Heck, certain workloads (high frequency, low latency, nearly no concurrency) might run better on a single core of an overclocked i7 than on any Xeon processor.
If it's a web server, then obviously you want more cores, and an event-driven server would make a lot of sense.
Basically, if you need concurrency, you want events; if you're compute-bound, you don't want that overhead.
EDIT: Instead of an i7, just imagine any high-end (high frequency and IPC) consumer chip.