I was getting better performance out of the NVIDIA HPC SDK compilers, but then again, the old PGI compilers they're based on (now with an LLVM backend) have always been my go-to for higher-performance code.
I've got some EPYCs and Zen 2s at home here, and I have both compilers. I haven't done any testing in recent months, but they've been updating them, so maybe I should look into that again. Thanks for the nudge!