> Unlike other software, RPCS3’s PPU & SPU threads need to communicate constantly which results in a major bottleneck if these threads are split across multiple CCXes / chiplets. That ends up with the CPU hitting this bottleneck constantly with all the data moving around. This is why we do not recommend Ryzen CPUs unless they have a 3 or 4 core CCX design (6-8 core Ryzen CPUs, or a 4 core Ryzen APU). A 4 core CCX design is ideal as RPCS3 can fit all the PPU & SPU threads onto a single CCX, allowing users to bypass inter-CCX latency bottleneck entirely, provided the PPU & SPU threads are being scheduled properly to be placed on a single CCX.
> While later Ryzen generations have greatly improved latency, it’s still a major bottleneck for RPCS3 if all the PPU & SPU threads cannot be placed on a single CCX. Another thing to note is that Ryzen users should definitely update to Windows 10 1903 or later, as Microsoft improved their scheduler which helps to avoid this bottleneck as well.
> The Intel CPUs on the other hand are quite the opposite. They do not suffer from the latency issues explained above due to its monolithic design and when you combine all that with a faster single core, you will notice that they do often perform better than their AMD equivalent.
Still charging a king's ransom, too. Looking on Amazon right now, the i9-9900K (which reached 32.3 FPS) costs 422.59 USD, compared to the R7 3700X (which reached 28.0 FPS), which costs 329.99 USD.
That's nearly a 2:1 ratio of extra price to extra performance: about 28% more money for about 15% more FPS.
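To show where that ratio comes from, here is the arithmetic as a quick sketch. The FPS and price figures are the ones quoted above; the variable and function names are just mine.

```python
def pct_increase(new, old):
    """Percentage by which `new` exceeds `old`."""
    return (new - old) / old * 100

# Figures quoted in the comment above.
fps_intel, fps_amd = 32.3, 28.0
usd_intel, usd_amd = 422.59, 329.99

perf_gain = pct_increase(fps_intel, fps_amd)    # extra FPS, in percent
price_gain = pct_increase(usd_intel, usd_amd)   # extra cost, in percent
ratio = price_gain / perf_gain                  # extra cost per unit of extra FPS

# FPS bought per 100 USD, as a second way to compare the two chips.
fps_per_100usd_intel = fps_intel / usd_intel * 100
fps_per_100usd_amd = fps_amd / usd_amd * 100

print(f"perf +{perf_gain:.1f}%, price +{price_gain:.1f}%, ratio {ratio:.2f}:1")
print(f"FPS per 100 USD: Intel {fps_per_100usd_intel:.2f}, AMD {fps_per_100usd_amd:.2f}")
```

So the price premium grows roughly 1.8 times as fast as the frame rate, which is where "nearly 2:1" comes from.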
I read the article, no need to be rude. The article is about maximizing memory throughput. It is not unusual to need multiple threads to actually saturate the memory controller(s). And in fact, RPCS3 uses several threads. So it is not clear that single-thread performance is the only significant factor, or that your unsubstantiated and context-free claim about Intel's single thread performance was relevant.
Not really. RPCS3 is very heavily multi-threaded, and the big thing that makes it perform better on Intel CPUs than AMD is differences in the performance of cross-core synchronization and data transfer. Current AMD desktop processors are split into CCXs containing at most 4 cores, and any communication between CCXs has to go via the separate IO die, which is relatively slow.
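If you're on Linux and want to try keeping a process on a single CCX yourself, here is a rough sketch. It assumes the cores of one CCX are exactly the logical CPUs that share an L3 cache, and that the kernel exposes the L3 as `cache/index3` in sysfs, which is typical but not guaranteed. The helper names are made up for illustration, and this is not how RPCS3 itself schedules its threads.

```python
import os

def parse_cpu_list(text):
    """Parse a Linux CPU list such as '0-3,8-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A lone id like '5' has no upper bound, so reuse the lower one.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus

def cpus_sharing_l3(cpu=0):
    """Logical CPUs sharing an L3 with `cpu` (assumes index3 is the L3)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/cache/index3/shared_cpu_list"
    with open(path) as f:
        return parse_cpu_list(f.read())

def pin_to_ccx(pid=0, cpu=0):
    """Restrict `pid` (0 = this process) to the CPUs sharing an L3 with `cpu`."""
    target = cpus_sharing_l3(cpu) & os.sched_getaffinity(pid)
    if not target:
        raise ValueError("no usable CPUs found for that CCX")
    os.sched_setaffinity(pid, target)  # Linux-only call
    return target
```

Calling `pin_to_ccx()` before launching a child process makes the child inherit the mask. `taskset` from util-linux does the same job from the shell if you already know which CPU ids belong to one CCX.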
I wonder what memory they used with the Ryzen builds, since the memory frequency can determine the FCLK, which affects some (all?) core-to-core communication speed.