Expert systems were already supposed to revolutionize medicine .... in the 1980s.
Medicine's guilds won't permit the loss of their members' jobs.
What's fun about this cartoon, besides the googly-eyed AIs, is the energy angle: it used to be that a simple, cheerful $100 ceiling fan was all you needed; in the world of AI and its gigawatt-per-poor-decision power requirements, you get AC air ducts.
That's correct, and you're right to point out this common reply from AI chatbots. Let's break down why that happens:
📝 LLMs are predictive models:
When a specific pattern shows up a lot in the training data set (like your example reply), the LLM becomes more likely to reply in a similar way in the future, just like when people walk through a patch of grass and wear a visible path: others going the same way later tend to follow it.
The bottom line is: "good catch, I will fix-" is a common reply from chatbots, and you humorously demonstrated that it could show up in the diagnostic process.
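Here's a toy illustration of that worn-path effect, just counting which continuation is most frequent in a made-up set of replies (this isn't how a real LLM is implemented, but the frequency intuition is the same):

```python
from collections import Counter

# Made-up training snippets; the only point is that frequency drives the prediction.
replies = [
    "good catch, I will fix that",
    "good catch, I will fix the bug",
    "good catch, I will fix it right away",
    "good catch, thanks for flagging",
]

# Count the first word that follows "good catch, " in each snippet.
continuations = Counter(r.split(", ", 1)[1].split()[0] for r in replies)
print(continuations.most_common(1))  # [('I', 3)] -> "I will fix..." is the worn path
```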
My knowledge of this is several years old, but back then there were already some types of medical imaging where AI consistently outperformed all humans at diagnosis. Researchers used existing data to give both humans and the AI the same images and asked them to make a diagnosis, already knowing the correct answer. Sometimes, even when the humans reviewed an image after learning the answer, they couldn't figure out why the AI was right. It's hard to imagine AI has gotten worse in the years since.
When it comes to my health, I simply want the best outcomes possible, so whatever method gets the best outcomes, that's the method I want. If humans are better than AI, then I want humans. If AI is better, then I want AI. I don't think this sentiment will be uncommon, and I'm not going to sacrifice my health so that somebody else can keep their job. There are a lot of other things I would sacrifice, but not my health.
My favourite story about this is the time a neural network trained on X-rays to recognise tumours (I think) performed amazingly in a study, better than any human could.
Later it turned out that the network had been trained on real-life X-rays with confirmed cases, and it was actually looking for pen marks. Pen marks mean the image was studied by several doctors, which means it was more likely a case that needed a second opinion, which more often than not means there is a tumour. Which also means that on cases that hadn't been studied by humans beforehand, the machine performed worse than random chance.
That's the problem with neural networks: it's incredibly hard to figure out what exactly is happening under the hood, and you can never be sure about anything.
And I'm not even talking about LLMs; those are a completely different level of bullshit.
Well, it's also that they used biased data, and biased data is garbage data. The problem with these neural networks is the human factor: humans tend to be biased, subconsciously or consciously, so the data they feed these networks will often be biased as well. It's like that ML model designed to rate human faces that consistently gave non-white people lower scores, because it turned out the input data was mostly white faces.
That's why a suspiciously high level of accuracy in ML always makes me squint... I don't trust it. As an AI researcher and engineer, you have to do the due diligence of understanding your data well before you start training.
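A minimal sketch of that kind of pre-training data audit, assuming a pandas metadata table with hypothetical column names (label coded as 0/1, plus some demographic and acquisition columns):

```python
import pandas as pd

# Hypothetical metadata for a labelled imaging dataset (file and columns are made up).
df = pd.read_csv("scan_metadata.csv")  # columns assumed: label (0/1), sex, ethnicity, scanner_id

# How balanced are the labels overall?
print(df["label"].value_counts(normalize=True))

# Does the positive rate differ wildly by demographic group or by scanner?
for col in ["sex", "ethnicity", "scanner_id"]:
    print(df.groupby(col)["label"].mean().sort_values())
```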
Neural networks work very similarly to human brains, so when somebody points out a problem with a NN, I immediately think about whether a human would do the same thing. A human could also easily fake expertise by looking at pen marks, for example.
And human brains themselves are also usually inscrutable. People generally come to conclusions without much conscious effort first. We call it "intuition", but it's really the brain subconsciously looking at the evidence and coming to a conclusion. Because it's subconscious, even the person who made the conclusion often can't truly explain themselves, and if they're forced to explain, they'll suddenly use their conscious mind with different criteria, but they'll basically always come to the same conclusion as their intuition due to confirmation bias.
But the point is that all of your listed complaints about neural networks are not exclusively problems of neural networks. They are also problems of human brains. And not just rare problems, but common problems.
Only a human who is very deliberate and conscious about their work doesn't fall into that category, but that limits the parts of your brain that you can use. And it also takes a lot longer and a lot of very deliberate training to be able to do that. Intuition is a very important part of our minds, and can be especially useful for very high level performance.
Modern neural networks have their training data curated and augmented to avoid issues like the ones you brought up. That can be done by hand for additional assurance, but it is also done automatically by the training software. If your training data is images, the same image will be reused many times: in its original form, rotated, cropped, manipulated with standard algorithms, or some combination of those.
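A rough sketch of what that augmentation looks like in practice, using torchvision with made-up parameters (not any particular medical pipeline):

```python
import torchvision.transforms as T

# Each training image is seen many times, each time slightly transformed.
augment = T.Compose([
    T.RandomRotation(degrees=10),                 # small random rotations
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),   # random crop, resized back to 224x224
    T.RandomHorizontalFlip(p=0.5),                # mirror half the time
    T.ColorJitter(brightness=0.1, contrast=0.1),  # mild intensity changes
    T.ToTensor(),
])

# augmented = augment(image)  # applied on the fly, so every epoch sees a new variant
```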
Pen marks wouldn't even be an issue today, because images generally start off digital and those raw digital images can be used. Just like any other medical tool, it wouldn't be used unless it could be trusted. It gets trained and validated like any NN, and even then random radiologists don't just start relying on it right away. It is first used by expert radiologists simulating actual diagnosis, people who understand the system well enough to report problems. There is no technological or practical reason to think that humans will always have better outcomes than even today's AI technology.
iirc the reason it still isn't used is that, even trained by highly skilled professionals, it had some pretty bad biases around race and gender, and its headline accuracy only held up for white, male patients.
Plus the publicly released results were fairly cherry picked for their quality.
Medical science in general has terrible gender and racial biases. My basic understanding is that it has gotten better in the past 10 years or so, but past scientific literature is littered with inaccuracies that we are still going along with.
I'm thinking drugs specifically, but I suspect it generalizes.
Yeah, there were also several stories where the AI just detected that all the pictures of the illness had e.g. a ruler in them, whereas the control pictures did not. It's easy to produce impressive results when your methodology sucks. And unfortunately, those results will get reported on before peer reviews are in and before others have attempted to reproduce the results.
To expand on this a bit: AI in medicine is getting super good at cancer screening in specific use cases.
People now heavily associate it with LLMs hallucinating and talking out of their ass, but they forget how completely AI destroys people at chess. AI is already beating top physics-based models at weather prediction, hurricane path forecasting, protein folding, and a lot of other use cases.
On specific, well-defined problems with a clear outcome, AI can potentially become far more accurate than any human. It's not so much about removing humans as about handing humans tools that make medicine more effective and more efficient at the same time.
The problem is the use of "AI" as a generic term for everything. Algorithms have been around for a while, and I'm pretty sure the AI cancer detectors are machine-learning models that are not at all related to LLMs.
That's because the medical one (particularly good at spotting cancerous cell clusters) was a pattern- and image-recognition AI, not a plagiarism machine spewing out fresh word salad.
One of the big issues was that while they had very good rates of correct diagnosis, they also had higher false-positive rates. A false cancer diagnosis can seriously hurt people, for example.
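To put rough numbers on why false positives matter so much in screening (illustrative figures, not from any real study):

```python
# Screen 100,000 people for a cancer with 0.5% prevalence, using a test with
# 95% sensitivity and 90% specificity (all numbers made up for illustration).
population, prevalence = 100_000, 0.005
sensitivity, specificity = 0.95, 0.90

sick = population * prevalence                              # 500 people actually have it
true_positives = sick * sensitivity                         # 475 correctly flagged
false_positives = (population - sick) * (1 - specificity)   # 9,950 healthy people flagged

ppv = true_positives / (true_positives + false_positives)
print(f"{false_positives:.0f} false positives, PPV = {ppv:.1%}")  # ~4.6% of flags are real
```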
Iirc the issue was that the researchers left the manufacturer's logo on the scans.
All of the negative scans were done by the researchers on the same equipment while the positive scans were pulled from various sources. So the AI only learned to identify which scans had the logo.
The important thing to know here is that those AIs were trained by very experienced radiologists, physicians who specialize in reading imaging. The AIs wouldn't have this capability if humans hadn't trained them.
Also, the imaging that AI performs well with is fairly specific, and there are many kinds of imaging techniques and diagnostic applications that the AI is still very bad at.
Except we didn't call all of that AI then, and it's silly to call it AI now. In chess, they're called "chess engines". They are highly specialized tools for analyzing chess positions. In medical imaging, that's called computer vision, which is a specific, well-studied field of computer science.
The problem with using the same meaningless term for everything is the precise issue you're describing: associating specialized computer programs that solve specific tasks with the misapplication of LLMs' generative capabilities to areas where they have no business being applied.
chess engines are, and always have been, called AI. computer vision is, and always has been, AI.
the only reason you might think they're not is that in the most recent AI winter, when those technologies experienced a boom anyway, researchers avoided terminology like "AI" when requesting funding and advertising their work, because of people like you who had recently decided they're the arbiters of what is and isn't intelligence.
turing once said if we were to gather the meaning of intelligence from a gallup poll it would be patently absurd, and i agree.
but sure, computer vision and chess engines, the two most prominent use cases for AI and ML technologies, aren't actual artificial intelligence, because you said so. why? idk. i guess because we can do those things well now, and the moment society understands something well, people start getting offended if you call it intelligence rather than computation. can't break the "i'm a special and unique snowflake" spell for people, god forbid…
Machine learning is the general field, and I think if we weren't wrapped up in the AI hype we could be training models to do important things like diagnosing disease, instead of writing shitty code or generating fantasy artwork.
When it comes to AI, I want it to assist. Like, I prefer robotic surgery where the surgeon controls the robot, but I would likely skip a fully automated one.
Yeah this is one of the few tasks that AI is really good at. It's not perfect and it should always have a human doctor to double check the findings, but diagnostics is something AI can greatly assist with.
Losing unnecessary jobs is not a bad thing, it's how we as a society progress. The main problem is not having a safety net or means of support for those who need to find a new line of work.
It's called progress because the cost in frame 4 is just a tenth what it was in frame 1.
Of course prices will still increase, but think of the PROFITS!
Also, there'll be no one to blame for mistakes! Failures are just software errors and can be shrugged off! Increase profits and pay less for insurance! What's not to like?
Imagine an episode of House, but everyone except House is an AI. And he's getting more and more frustrated by them spewing nonsense after nonsense, while they get more and more appeasing.
"You idiot AI, it is not lupus! It is never lupus!"
"I am very sorry, you are right. The condition referred to Lupus does obviously not exist, and I am sorry that I wasted your time with this incorrect suggestion. Further analysis of the patient's condition leads me to suspect it is lupus."
"Prove that you are not an AI agent: show your ID." The AI agent issues itself an ID and enters the room in place of patient number 7, because the patient was too lazy to come in himself.
I hate AI slop as much as the next guy but aren’t medical diagnoses and detecting abnormalities in scans/x-rays something that generative models are actually good at?
They don't use generative models for this. The AIs that do this kind of work are trained on carefully curated data and have a very narrow scope within which they are good.
That brings up a significant problem: wildly different things get called AI. My company's customers are using AI for biochem and pharma research, protein folding, and other science stuff.
Image categorisation AI, i.e. convolutional neural networks, has been in use since well before LLMs and other generative AI. Some medical imaging machines use this technology to highlight features such as specific organs in a scan. CNNs could likely be trained to be extremely proficient at reading X-ray, CT, and MRI scans, but those are generally the less operator-dependent types of scan, though they can get complicated. An ultrasound, for example, is highly dependent on the skill of the operator, and in certain circumstances things can be made to look worse or better than they are.
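For a sense of what such a model looks like, here's a toy CNN classifier in PyTorch; the layer sizes, the grayscale 224x224 input, and the two-class output are all made up:

```python
import torch
import torch.nn as nn

class TinyScanClassifier(nn.Module):
    """Toy convolutional classifier: one grayscale scan in, normal/abnormal scores out."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, 2)  # assumes 224x224 input

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

scores = TinyScanClassifier()(torch.randn(1, 1, 224, 224))  # one fake 224x224 scan
print(scores.shape)  # torch.Size([1, 2])
```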
I don't know why the technology hasn't become more widespread in the domain. Probably because radiologists are paid really well and have a vested interest in preventing it... they're not going to want to tag the images for their replacement. It's probably also because medical data is hard to get permission for: to ethically train such a model, you would need to ask every patient, for every type of scan, whether their images can be used for medical research, which is just another form/hurdle for everyone to jump over.
It's certainly not as bad as the problems generative AI tend to have, but it's still difficult to avoid strange and/or subtle biases.
Very promising technology, but likely to be good at diagnosing problems in Californian students and very hit-and-miss with demographics that don't tend to sign up for studies in Silicon Valley.
Basically, AI is a decent answer to the needle-in-a-haystack problem. Sure, a human with infinite time and attention could find the needles, perhaps more accurately than an AI could, but practically speaking, if there are just 10 needles in a haystack, it's considered a lost cause to find any of them.
With AI, it might flag 30 "needles" in that same stack, of which only 7 are real, so the AI finds more wrong answers than right ones. But you still end up with 7 needles where you would have missed all 10 before, so you come out ahead.
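In retrieval terms, those made-up numbers work out to roughly this:

```python
# Hypothetical numbers from above: 10 real needles, AI flags 30 spots, 7 are real.
flagged, true_found, total_needles = 30, 7, 10

precision = true_found / flagged       # 7/30 ≈ 23% of flags are real needles
recall = true_found / total_needles    # 7/10 = 70% of needles found (vs ~0% unassisted)
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
```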
So long as you don't let an AI rule out review of a scan that a human really would have reviewed, it seems like a win: more scans overall get a decent review, and you might catch things earlier through preventative scans that would otherwise be impractical.
True! I'm an AI researcher, and using an AI agent to check the work of another agent does improve accuracy! I could see things becoming more and more like this, with teams of agents creating, reviewing, and approving. If you use GitHub Copilot agent mode, though, it involves constant user interaction before anything is actually run. And I imagine (and can testify as someone who has installed different ML algorithms/tools on government hardware) that the operators/decision makers want to check the work, or understand the "thought process", before committing to an action.
Will this be true forever as people become more used to AI as a tool? Probably not.
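For what it's worth, the agents-checking-agents loop is simple to sketch; generate_draft() and review_draft() below are placeholders for calls to two separate models, not any real API:

```python
# Toy create-then-review loop; the two functions stand in for two model endpoints.
def generate_draft(task: str) -> str:
    return f"draft answer for: {task}"

def review_draft(task: str, draft: str) -> tuple[bool, str]:
    # A second model critiques the first model's output.
    return True, "looks consistent with the task"

def solve_with_review(task: str, max_rounds: int = 3) -> str:
    draft = generate_draft(task)
    for _ in range(max_rounds):
        approved, feedback = review_draft(task, draft)
        if approved:
            return draft
        draft = generate_draft(f"{task}\nReviewer feedback: {feedback}")
    return draft  # give up and hand the last draft to a human

print(solve_with_review("summarise the patient's imaging findings"))
```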
My current "provider" is an NP. I like her, she's personable and does the basic stuff well enough. I can understand having her do the basic annual physical type stuff for relatively young and healthy people.
But, for one of my recent visits, they scheduled me with a doctor instead (dunno why), and the experience was honestly almost night and day for the better. Granted, the way my health insurance works (ugh USA), the NP visits only ever cost me a flat amount, perhaps $45 for the copay. The doctor's visit cost me the $45 copay, plus additional coinsurance down the line that I got billed a couple of months later because the clinic apparently charges two different rates depending on whether you see a doctor or not, I guess?