An MIT study finds non-clinical information in patient messages, like typos, extra whitespace, or colorful language, can reduce the accuracy of a large language model deployed to make treatment recommendations. The LLMs were consistently less accurate for female patients, even when all gender markers were removed from the clinical text.
ChatGPT is not a doctor. But models trained on imaging can actually be a very useful tool for doctors to use.
Even years ago, just before the AI “boom”, researchers were asking doctors for details on how they examine patient images and then training models on that. They found that the AI was “better” than doctors specifically because it followed the doctors’ advice 100% of the time, thereby eliminating any bias that might keep an individual doctor from following their own training.
Of course, the splashy headline “AI better than doctors” was ridiculous. But it does show the benefit of having a neutral tool for doctors to use, especially when looking at images of people who are outside the typical demographics that much medical training is based on. (As in, mostly just white men. For example, much of what doctors are trained on regarding knee imaging comes from images of the knees of UK coal miners from some decades ago.)
LLMs are not Large Medical Expert Systems. They are Large Language Models, and are evaluated on how convincing their output is, instead of how accurate or useful it is.
Why are they... why are they having autocomplete recommend medical treatment? There are specialized AI algorithms that already exist for that purpose that do it far better (though still not well enough to even assist real doctors, much less replace them).
Because sycophants keep saying it's going to take these jobs, eventually real scientists/researchers have to come in and show why the sycophants are wrong.
Their analysis also revealed that these nonclinical variations in text, which mimic how people really communicate, are more likely to change a model's treatment recommendations for female patients, resulting in a higher percentage of women who were erroneously advised not to seek medical care, according to human doctors.
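For anyone curious what the study is actually measuring: the setup amounts to perturbing the non-clinical parts of a message (typos, extra whitespace, informal tone) and checking whether the model's treatment recommendation flips. A minimal sketch of that kind of robustness check in Python, where add_noise and the query_model hook are my own placeholders rather than the study's actual protocol:

    import random

    def add_noise(message: str, seed: int = 0) -> str:
        """Inject non-clinical variation: one typo, doubled whitespace,
        and an informal opener, leaving the clinical content alone."""
        rng = random.Random(seed)
        words = message.split()
        i = rng.randrange(len(words))
        w = words[i]
        if len(w) > 3:
            # swap two adjacent characters to fake a typo
            j = rng.randrange(len(w) - 1)
            words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
        return "hi um... " + "  ".join(words)

    def flip_rate(messages, query_model) -> float:
        """Fraction of messages whose recommendation changes under noise.
        query_model(text) -> str stands in for whatever LLM is under test."""
        flips = sum(query_model(m) != query_model(add_noise(m)) for m in messages)
        return flips / len(messages)

The interesting number isn't just the overall flip rate but how it differs across patient subgroups when the clinical content is held fixed, which is how you get the gendered result above.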
This is not an argument for LLMs (which people are deferring to at an alarming rate), but I'd call out that this seems to be a bias in humans giving medical care as well.
large language model deployed to make treatment recommendations
What kind of irrational lunatic would seriously attempt to invoke currently available Counterfeit Cognizance to obtain a "treatment recommendation" for anything...???
FFS.
Anyone who would seems a supreme candidate for a Darwin Award.
There's a potentially justifiable use case in training one and evaluating its performance for use in, idk, triaging a mass-casualty event. Similar to the 911 bot they announced the other day.
Also similar to the 911 bot, I expect it's already being used to justify cuts in necessary staffing, so it's going to be required in every ER to maintain higher profit margins or just to keep the lights on.
Not entirely true. I have several chronic and severe health issues. ChatGPT provides medical advice that nearly matches, and sometimes surpasses, what I've gotten from multiple specialty doctors (though it heavily needs re-verifying). In my country doctors are horrible. This bridges the gap, albeit again needing close oversight to be safe. It certainly has merit, though.
I have used ChatGPT for early diagnostics with great success. Obviously it's not a doctor, but that doesn't mean it's useless.
ChatGPT can be a crucial first step, especially in places where doctor care is not immediately available. The initial friction for any disease diagnosis is huge, and anything that overcomes it is a net positive.