I would really like a benchmark focused purely on diagnosis: symptoms and patient history vs. the real diagnosis. Maybe name this model House M.D. 1.0 or something.
The other stuff is good to have, but ultimately a model that focuses on diagnosing medical conditions is going to be the most useful. Look - we aren't going to replace doctors anytime soon, but it is good to have a second opinion from an LLM purely for diagnosis. I would hope it captures patterns that weren't observed before. This is exactly the sort of game that AI can beat a human at - large-scale pattern recognition.