The first step would be to build an information collection pipeline that is in the same league as the doctors. That alone will be a monumental effort because doctors have shared human experiences to draw from and they are allowed to iteratively collect information.
I'm just complaining that it seems fantastically reductive to call the absence of such a pipeline "bad data" because developing such a pipeline would be a thousand times the effort of implementing an image detection model. Maybe a million times. It will require either NLP like none we have seen before or an expert system with so much buy-in from the experts and investors that it survives the thousand rounds of iterative improvement it needs to address 99% of the requirements.
Comparing issues like low resolution and noise to such a development effort seems like comparing apples to... jet fighters.
I'm just complaining that it seems fantastically reductive to call the absence of such a pipeline "bad data" because developing such a pipeline would be a thousand times the effort of implementing an image detection model. Maybe a million times. It will require either NLP like none we have seen before or an expert system with so much buy-in from the experts and investors that it survives the thousand rounds of iterative improvement it needs to address 99% of the requirements.
Comparing issues like low resolution and noise to such a development effort seems like comparing apples to... jet fighters.