I would hope that your production hardware matches your developer / staging / testing hardware.
Let's say production is 50% slower than what's tested in staging / developer test cases. Is it the data in production that causes this performance loss? Or is it hardware differences?
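One way to start separating the two is to time a fixed workload that doesn't touch production data at all, and run it on both machines. A minimal sketch, assuming Python is available on both boxes; `fixed_workload` and `best_time` are made-up names for illustration, and a toy loop like this only exercises a small part of the CPU:

```python
import time


def fixed_workload(n=200_000):
    """Deterministic CPU-bound loop that does not depend on any production data."""
    acc = 0
    for i in range(1, n + 1):
        acc = (acc * 31 + i) % 1_000_003
    return acc


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time (in seconds) over several runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Run this same script on staging and on production and compare the numbers.
    print(f"best of 5: {best_time(fixed_workload):.4f} s")
```

If the fixed workload is also much slower in production, the hardware (or its configuration) is a suspect. If it runs about the same, the data is the more likely culprit.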
If you are using Intel tools to debug performance problems on developer / testing stages, you probably want to keep using those Intel tools in staging / production. There are enough cache differences and instruction-level differences between the chips to matter: the speed of the "division" instruction, PEXT / PDEP, branch predictor differences, TLB differences.
Intel has interesting optimizations such as DDIO: an Intel Ethernet card drops the data off in L3 cache (bypassing DDR4 RAM entirely). These little differences in the driver / motherboard / CPU can make a huge difference in performance, and complicate performance testing / performance debugging.
If you are deploying to AMD hardware for production, you probably want to be running AMD hardware in testing / developer stages as well. You want all your hardware performing as similarly as possible.
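A quick sanity check is to compare what each machine reports about its CPU. A rough sketch, assuming Linux on x86 where `/proc/cpuinfo` has `vendor_id`, `model name` and `cache size` fields; `cpu_fingerprint` and `differences` are hypothetical helper names, not part of any tool:

```python
def cpu_fingerprint(cpuinfo_text):
    """Pull vendor, model name and cache size out of /proc/cpuinfo-style text."""
    wanted = ("vendor_id", "model name", "cache size")
    found = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Keep only the first occurrence; later cores repeat the same fields.
        if sep and key in wanted and key not in found:
            found[key] = value.strip()
    return found


def differences(a, b):
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

On each machine you would feed it `open("/proc/cpuinfo").read()`, save the result, and diff staging against production. This only catches the coarse mismatches (vendor, model, cache size), not driver or motherboard differences.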
From my understanding, that vulnerability exists only if RDMA is also enabled.
RDMA, the ability to share RAM as if it were local RAM (through a memory-mapped IO mechanism) across Ethernet, is not a common setup. The fact that you can perform cache-timing attacks over RDMA + Intel L3 cache is, if anything, a testament to how efficient the system is.
Consider this interpretation: RDMA + DDIO is so fast, you can perform cache-timing attacks over Gigabit Ethernet(!!). NetCAT (the "vulnerability" you describe) is proof of it.
Cache-timing / side-channel attacks aren't exactly the kind of vulnerabilities that most people think of, though. It's kinda cool, but it's nothing as crazy as Meltdown / Spectre were.