Peak pseudo-science. The burden of evidence is on the grifters who claim "reason". But neither side has any objective definition of what "reason" means. It's pseudo-science against pseudo-science in a fierce battle.
When are people going to realize that, in its current state, an LLM is not intelligent? It doesn’t reason. It does not have intuition. It’s a word predictor.
I don't think the article summarizes the research paper well. The researchers gave the AI models simple-but-large puzzles (which they confusingly called "complex"), like Towers of Hanoi but with 25 discs.
The solution to these puzzles is nothing but patterns. You can write code that will solve the Tower puzzle for any size n and the whole program is less than a screen.
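To make the "less than a screen" point concrete, here's a minimal recursive solver sketch in Python (my own illustration, not from the paper); the disc count is the only thing that changes, and the move pattern is fully mechanical:

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Print the optimal move sequence for n discs (2**n - 1 moves)."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target)            # park the n-1 smaller discs on the spare peg
    print(f"move disc {n}: {source} -> {target}")  # move the largest disc
    hanoi(n - 1, spare, target, source)            # stack the smaller discs back on top

hanoi(4)  # 15 moves; hanoi(25) would print 2**25 - 1 = 33,554,431 moves
```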
The problem the researchers see is that on these long, pattern-based solutions, the models follow a bad path and then just give up long before they hit their limit on tokens. The researchers don't have an answer for why this is, but they suspect that the reasoning doesn't scale.
I see a lot of misunderstandings in the comments 🫤
This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.
Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.
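A toy sketch of what "rewarded based on their final answer" means in practice (hypothetical names, not any specific lab's training code): the reasoning trace is never scored, only the extracted answer.

```python
def extract_final_answer(trace: str) -> str:
    # Hypothetical convention: the model ends its trace with "Final answer: ..."
    return trace.rsplit("Final answer:", 1)[-1].strip()

def outcome_only_reward(trace: str, gold_answer: str) -> float:
    # The intermediate reasoning in `trace` is never inspected,
    # so a lucky guess and a sound derivation earn the same reward.
    return 1.0 if extract_final_answer(trace) == gold_answer else 0.0
```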
It's all "one instruction at a time" regardless of high processor speeds and words like "intelligent" being bandied about. "Reason" discussions should fall into the same query bucket as "sentience".
It's not just the memorization of patterns that matters, it's the recall of appropriate patterns on demand. Call it what you will, even if AI is just a better librarian for search work, that's value - that's the new Google.
What's hilarious/sad is the response to this article over on reddit's "singularity" sub, in which all the top comments are people who've obviously never got all the way through a research paper in their lives all trashing Apple and claiming their researchers don't understand AI or "reasoning". It's a weird cult.
I think it's important to note (I'm not an LLM, I know that phrase triggers you to assume I am) that they haven't proven this is an inherent architectural issue, which I think would be the next step for that assertion.
Do we know that they don't and can't reason, or do we just know that, for x problems, they jump to memorized solutions? Is it possible to create an arrangement of weights that can genuinely reason, even if the current models don't? That's the big question that needs answering. It's still possible that we just haven't properly incentivized reasoning over memorization during training.
If someone can objectively answer "no" to that, the bubble collapses.
Just fancy Markov chains with the ability to link bigger and bigger token sets. It can only ever kick off processing as a response and can never initiate any line of reasoning. This, along with the fact that its working set of data can never be updated moment-to-moment, means that it would be a physical impossibility for any LLM to achieve any real "reasoning" processes.
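For what the "fancy Markov chain" caricature literally looks like, here's a toy bigram sampler in Python (my sketch; real transformers condition on the whole context window rather than just the previous token, so this is the analogy, not the mechanism):

```python
import random
from collections import defaultdict

def train_bigram(tokens):
    """Count which token follows which (a first-order Markov chain)."""
    table = defaultdict(list)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev].append(nxt)
    return table

def generate(table, start, length=10):
    """Sample the next token from the observed successors of the current one."""
    out = [start]
    for _ in range(length):
        successors = table.get(out[-1])
        if not successors:
            break
        out.append(random.choice(successors))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat slept".split()
print(generate(train_bigram(corpus), "the"))
```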
You know, despite not really believing LLM "intelligence" works anywhere like real intelligence, I kind of thought maybe being good at recognizing patterns was a way to emulate it to a point...
But that study seems to prove they're still not even good at that. At first I was wondering how hard the puzzles must have been, and then there's a bit about LLMs finishing 100-move Towers of Hanoi (which they were trained on) and failing 4-move river crossings. Logically, those problems are very similar... They also fail to apply a step-by-step solution they were given.
This sort of thing has been published a lot for a while now, but why is it assumed that this isn't what human reasoning consists of? Isn't all our reasoning ultimately a form of pattern memorization? I sure feel like it is. So to me, all these studies that show they're "just" memorizing patterns don't prove anything on their own, unless they're coupled with research on the human brain showing we do something different.
This has been known for years; it's the default assumption of how these models work.
You would have to prove that some kind of actual reasoning capacity has arisen as some kind of emergent complexity phenomenon... not the other way around.
Corpos have just marketed/gaslit us/themselves so hard that they apparently forgot this.
What a dumb title. I proved it by asking a series of questions. It’s not AI, stop calling it AI, it’s a dumb af language model. Can you get a ton of help from it, as a tool? Yes! Can it reason? NO! It never could and for the foreseeable future, it will not.
They're phenomenal at patterns, much, much better than us meat peeps. That's why they're accurate as hell when it comes to analyzing medical scans.
The difference between reasoning models and normal models is that reasoning models work in two steps. To oversimplify a little, they first prompt "how would you go about responding to this?" and then prompt "write the response".
It's still predicting the most likely thing to come next, but the difference is that the model gets a chance to write the most likely instructions to follow for the task, and then the most likely result of following those instructions - both of which conform to patterns much better than a single jump from prompt to response.
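In code, that two-step pattern is roughly this (a sketch; `llm_complete` is a hypothetical stand-in for whatever completion call you're using, not a real API):

```python
def llm_complete(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; returns a placeholder string.
    return f"<model output for: {prompt.splitlines()[0]}>"

def reason_then_answer(task: str) -> str:
    # Step 1: ask the model how it would approach the task (the "reasoning" pass).
    plan = llm_complete(f"How would you go about responding to this?\n\n{task}")
    # Step 2: ask for the actual response, conditioned on the task and the plan.
    return llm_complete(f"Task:\n{task}\n\nPlan:\n{plan}\n\nWrite the response.")

print(reason_then_answer("Solve a 4-disc Towers of Hanoi puzzle."))
```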
While I hate LLMs with a passion, and my opinion of them boils down to them being glorified search engines and data scrapers, I would ask Apple: how sour are the grapes, eh?