In Thinking, Fast and Slow, the author describes a double-blind trial where they did exactly this. The outcomes were worse with humans and AI together than with the AI alone. Humans think they can use the AI as a guide and steer it in the right direction, but the adjustments they made were, on average, bad.
Surely in this type of instance (looking at a scan to answer a yes/no question) the human and AI act independently, with the computer being a useful aid because it separately picks up a few of the human's false negatives. Assuming false negatives are a lot worse than false positives, this can only be a good thing.
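A back-of-the-envelope way to see why that can help, assuming the errors really are independent (all the rates below are made up for illustration):

```python
# Hypothetical, made-up error rates for a human reader and an AI screener.
human_fn, human_fp = 0.15, 0.05   # miss rate / false-alarm rate for the human
ai_fn, ai_fp = 0.10, 0.08         # miss rate / false-alarm rate for the AI

# "Flag if either flags it": under independence you only miss when BOTH miss,
# but you raise an alarm whenever either one does.
combined_fn = human_fn * ai_fn                      # 0.015 -> far fewer misses
combined_fp = 1 - (1 - human_fp) * (1 - ai_fp)      # ~0.126 -> more false alarms

print(f"miss rate:        human {human_fn:.3f}, AI {ai_fn:.3f}, either {combined_fn:.3f}")
print(f"false-alarm rate: human {human_fp:.3f}, AI {ai_fp:.3f}, either {combined_fp:.3f}")
```

So if false negatives really are the thing you care most about, the trade is heavily in your favour; the cost shows up as extra false positives to review.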
If they lead to an unnecessary mastectomy, then false positives are pretty bad. Not as bad as dying, obviously, but still a severe blow to a woman's identity and sense of self-worth.
It's going to be a hard pill to swallow if you have to tell a woman "sorry, we removed your healthy breast because the computer made a mistake."
I think the idea of "screening" is that you don't just race off to a mastectomy the minute some AI model goes off. Of course, putting more false positives through a fallible process of review does run the risk you speak of.
It sounds like a smart hospital would run a patient through both human and AI screenings separately, then have a different doctor examine both results and evaluate the discrepancies. That way you keep the strengths of both approaches and lower the failure rates, and depending on the country's health-care funding model it can even be good business from the hospital's POV, since they get to charge for the extra work and can point to the better success rates to attract patients.
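A rough sketch of what that triage step might look like; the routing rules and names here are invented for illustration, not any hospital's actual protocol:

```python
from dataclasses import dataclass

@dataclass
class ScreeningResult:
    human_flagged: bool
    ai_flagged: bool

def route_case(result: ScreeningResult) -> str:
    """Hypothetical routing: agreement is handled normally,
    disagreement goes to a second, independent radiologist."""
    if result.human_flagged and result.ai_flagged:
        return "refer for follow-up (both agree)"
    if not result.human_flagged and not result.ai_flagged:
        return "routine recall schedule (both clear)"
    return "escalate to independent reviewer (human and AI disagree)"

print(route_case(ScreeningResult(human_flagged=True, ai_flagged=False)))
```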
And I wonder what happens if you apply machine learning to looking at the difference between AI and human screening results.
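One way to read that: treat the human's call and the AI's score as inputs to a second model (a stacking setup), so it can learn when to trust which screener and what their disagreement tends to mean. A minimal sketch on synthetic data, purely illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Synthetic ground truth and two imperfect "screeners" (all made up):
truth = rng.random(n) < 0.1                                   # ~10% positive cases
ai_score = np.clip(truth * 0.7 + rng.normal(0.2, 0.2, n), 0, 1)
human_call = (truth & (rng.random(n) < 0.85)) | (~truth & (rng.random(n) < 0.05))

# Meta-model over both outputs, with an explicit "disagreement" feature.
X = np.column_stack([ai_score,
                     human_call.astype(float),
                     np.abs(ai_score - human_call.astype(float))])
meta = LogisticRegression().fit(X, truth)
print("meta-model accuracy on its training data:", meta.score(X, truth))
```

On real screening data you would of course validate on held-out cases rather than training accuracy, but the idea is the same: the disagreement itself is a signal you can learn from.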