The numbers have been run on simulations of large, wide cores, and the benefit of RV-C is pretty clear. That said, since the release of the M1, I agree that there's probably a need for a BOOMv4 to publicly explore the problem space.
Going into rumor town: my understanding is that all of the companies working on high-perf cores are implementing RV-C, including those staffed by ex-Apple employees who worked on Apple's cores. The tiny bit of extra decode complexity more than pays for itself in reduced I$ pressure (which, from a design perspective, can let you get away with a smaller I$, and therefore a lower-latency I$).
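
To give a rough sense of what that extra decode complexity amounts to, here's a minimal sketch of finding instruction boundaries in a mixed 16/32-bit stream. It relies on the standard RISC-V length encoding (low two bits equal to `11` means a 32-bit instruction, anything else is a 16-bit compressed one) and ignores the reserved longer-than-32-bit encodings as a simplification. The function names are made up for illustration, and a real wide decoder does this in parallel across the fetch window rather than serially like this.

```python
def insn_length(first_parcel: int) -> int:
    """Length in bytes of the instruction starting with this 16-bit parcel.

    Simplification: only 16-bit and 32-bit encodings are handled.
    """
    return 4 if (first_parcel & 0b11) == 0b11 else 2


def find_boundaries(parcels):
    """Walk a list of 16-bit parcels, returning (byte_offset, length) pairs.

    Hypothetical helper: assumes the list starts on an instruction
    boundary and does not end in the middle of a 32-bit instruction.
    """
    boundaries = []
    i = 0
    while i < len(parcels):
        length = insn_length(parcels[i])
        boundaries.append((i * 2, length))
        i += length // 2  # advance by one or two parcels
    return boundaries


# c.nop (0x0001), then a 32-bit nop (0x00000013, low parcel first), then c.nop
stream = [0x0001, 0x0013, 0x0000, 0x0001]
print(find_boundaries(stream))
```

The length check itself is just two bits per parcel; the cost in a wide core comes from each boundary depending on the one before it, which is the part that has to be speculated or computed in parallel.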