My impression of what's holding back improvements to CV and ML for medical images is that close to none of the research in the field will ever be put into practice. A huge amount of research gets done, and then the companies that actually produce the software used by technicians and doctors choose a tiny number of things to pick up, incorporate, test, and get certified, which takes years. People working in the field have admitted to me that they have little to no hope.
I work in this field. IMO, your assessment is correct insofar as research absolutely needs to be commercialized. However, we haven't gotten to the super-human performance for medical images yet. We don't even have an ImageNet equivalent, nor do we have open-source (or even just source- and weights-available) versions of the models that Google and other big corps claim are so wonderful. Most people in the field then, rightly, dismiss anything like Med-PaLM M and other PR-like papers from these groups. Why base your career (whether academic or product-focused) on this?
> "we haven't gotten to the super-human performance for medical images yet."
I guess that, at least in France, most people living outside large towns do not need that.
What they need (IMO) is competent radiologists in a time of medical deserts.
Because even where there is staff, the staff do not spend enough time analyzing images, despite being quite expensive. Their main goal is to make money quickly.
A couple of years ago I built a computer vision product for health systems and ran into the challenge of trying to commercialize it (with no luck). Even if the product/technology is excellent, the regulatory hurdles and red tape make it insanely difficult to get this sort of thing commercialized in a healthcare context.
Many people in the field are physicists, and they consistently state that even if the performance is better than human, they will not use ML in the field, simply because of explainability. Physicists like having very explainable analytical equations, so handing them a pile of matrices and instructions for composing them is not convincing.
They may still lose out in time, but not without a battle over explainability.
Also, Google is constantly making announcements that Bard or whatever AI tech (MusicLM, etc.) is supposedly beating all the competitors, but nobody can use or see such models, only far inferior ones. Curious, right ( ͡° ͜ʖ ͡°) ?
This type of PR research is what is really holding back AI for medical images.