You have activated the Falsifiability trap card - LLMs as tutors = lol

awful.systems

"Not all AI content is spam, but I think right now all spam is AI content." - awful.systems

User @theneverfox@pawb.social has posed a challenge when trying to argue that... wait, what?

The biggest, if rarely used, use case [of LLMs] is education - they’re an infinitely patient tutor that can explain things in many ways and give you endless examples.

Lol what.

Try reading something like Djikstra’s algorithm on Wikipedia, then ask one to explain it to you. You can ask for a theme, ask it to explain like you’re 5, or provide an example to check if you understood and have it correct any mistakes

It’s fantastic for technical or dry topics, if you know how to phrase things you can get quick lessons tailored to be entertaining and understandable for you personally. And of course, you can ask follow up questions

Well, usually AI claims are just unverifiable gibberish but this? Dijkstra's algorithm is high school material. This is a verifiable claim. And oh boi, do AI claims have a long and storied history of not standing up to scrutiny...

I sincerely didn't expect it to take so little, but well, this is patently wrong. One, this is not Dijkstra's algorithm. Two, always picking the shortest road is obviously incorrect, see this map:

Green path is the shortest to candy. Red path is what you get by always following the shortest road.

Dijkstra's algorithm picks the closest node that was seen thus far and tries to make paths better by expanding from there, the idea being that if some node is far away then paths through it are going to be long, so we don't need to look at them until there's no other option. In this case it'll immediately see A at distance 1 and Candy at distance 2, expand from A (since it's closer) to get to B in distance 2; after that it will look at B and Candy, but see it cannot improve from there and terminate.
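To make that concrete, here's a minimal sketch in Python. The map's actual weights aren't spelled out, so the graph below is a hypothetical reconstruction picked to match the trace above, and `greedy_cheapest_edge` is my name for ChatGPT's "always take the shortest road" non-algorithm:

```python
import heapq

def dijkstra(graph, source):
    """Dijkstra: settle the closest node seen so far, then relax its edges."""
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    pq, settled = [(0, source)], set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in settled:
            continue
        settled.add(u)
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def greedy_cheapest_edge(graph, source, target):
    """ChatGPT's "algorithm": always walk down the cheapest unvisited road."""
    total, u, seen = 0, source, {source}
    while u != target:
        options = [(w, v) for v, w in graph[u] if v not in seen]
        if not options:
            return None  # dead end
        w, u = min(options)
        seen.add(u)
        total += w
    return total

# Hypothetical reconstruction of the map (weights are my guesses,
# chosen to match the trace described above).
graph = {
    'Start': [('A', 1), ('Candy', 2)],
    'A': [('B', 1)],
    'B': [('Candy', 3)],
    'Candy': [],
}
print(dijkstra(graph, 'Start')['Candy'])              # 2, the green path
print(greedy_cheapest_edge(graph, 'Start', 'Candy'))  # 5, the red path
```

The priority queue plays the role of the "seen thus far" set: Candy sits in it at distance 2 the whole time, and nothing ever improves on that.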

Let's see what ChatGPT tells me when I confront it with this counterexample to its stupid algorithm.

It fucking doubles down! It admits it's wrong and then gives the same stupid algorithm, just with "map" changed to "maze" and "candy" changed to "toy"! And I wanted candy!

Okay, maybe saying "like I'm 5" was wrong, let's try to recreate something closer to what @theneverfox wanted.

Okay, at least it's not incorrect, there are no lies in this, although I would nitpick two things:

  1. It doesn't state what the actual goal of the algorithm is. It says "fundamental method used in computer science for finding the shortest paths between nodes in a graph", but that's not precise; it finds the shortest paths from one node to all other nodes, whereas the wording could be taken to imply it's between two nodes.
  2. "infinity (or a very large number)" is very weird without explanation. Dijkstra doesn't work if you just put in "a very large number"; you have to make sure it's larger than any possible path length (for example, the sum of all edge weights would work).
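The second nitpick is easy to demonstrate with a toy sketch (the line graph and its weights are made up). If tentative distances start at a "very large number" that isn't actually larger than every path, relaxation can never improve past it and the answer comes out wrong. I use plain repeated relaxation here rather than full Dijkstra, since the failure mode is identical:

```python
# A hypothetical line graph S -> X -> Y -> T with weights 3, 7, 9;
# the only (hence shortest) S-to-T distance is 19.
edges = [('S', 'X', 3), ('X', 'Y', 7), ('Y', 'T', 9)]
nodes = ['S', 'X', 'Y', 'T']

def shortest_to_T(init):
    """Repeated relaxation, with tentative distances starting at
    `init` instead of a true infinity."""
    dist = {v: init for v in nodes}
    dist['S'] = 0
    for _ in nodes:  # enough passes for a 4-node line graph
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist['T']

print(shortest_to_T(float('inf')))                     # 19, correct
print(shortest_to_T(sum(w for _, _, w in edges) + 1))  # 19, sum of all weights is a safe bound
print(shortest_to_T(10))                               # 10, a "very large number" that wasn't large enough
```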

Those are rather pedantic and I can excuse them. The bigger issue is that it doesn't really tell you anything that you wouldn't get from the Wikipedia article? It lifts sentences from there, shuffling the structure, but it doesn't make anything clearer. Actually, Wikipedia has an example in the text describing the "Iterative Process" steps, but ChatGPT threw it away. What's the value here, exactly?

Let's try asking something non-obvious that I didn't get first when learning Dijkstra:

What?! This is nonsense! Gibberish! Bollocks!

It does really well at first, no wonder, since the first sentences are regurgitated from Wikipedia. Then it gives a frankly idiotic example of a two-vertex graph where Dijkstra does give the correct answer, since it's trivial and there's only one edge. But it's really easy to come up with an actual counterexample, so I asked for it directly, and got... Jesus Christ. If images are better for you, here is the graph described by ChudGPT:

Dijkstra here correctly picks the shortest path to C:

  1. Distances = { A: 0, B: ∞, C: ∞ }, active = [A at 0], pick edges from A
  2. Distances = { A: 0, B: 1, C: 4 }, active = [B at 1, C at 4], pick edges from B
  3. Distances = { A: 0, B: 1, C: -1 }, active = [C at -1], pick edges from C
  4. Distances = { A: 0, B: 1, C: -1 }, end.

This is not a counterexample to Dijkstra. ChatGPT even says that! Its step 3 clearly finds the distance -1 to C! And then it says the actual shortest path is 4! A fucking 7 year old can see this is wrong!

It's very easy to change this to an actual counterexample as well, just replace the weight on A->B with 5. The shortest path is then 3, but because of how Dijkstra works it will visit C first, save the distance of 4, and then never revisit C. This is the actual reason Dijkstra doesn't work.
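Here's the same thing as runnable code, assuming a textbook Dijkstra that never revisits settled nodes. The graph is the corrected counterexample: A->B = 5, A->C = 4, B->C = -2:

```python
import heapq

def dijkstra(graph, source):
    """Textbook Dijkstra: a settled node is never revisited --
    exactly the assumption that breaks with negative weights."""
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    pq, settled = [(0, source)], set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in settled:
            continue
        settled.add(u)
        for v, w in graph[u]:
            if v not in settled and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

graph = {'A': [('B', 5), ('C', 4)], 'B': [('C', -2)], 'C': []}
print(dijkstra(graph, 'A')['C'])  # 4, but the true shortest path A->B->C costs 3
```

C gets settled at distance 4 before B is ever expanded, and after that the better path through B is silently ignored.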

It fails miserably to explain the basics, it fails spectacularly to explain a non-obvious question an actual student just introduced to Dijkstra might have, and I left my spécialité for the end:

The more computer-science-savvy among you are surely already laughing. ChatGPT just solved P=NP! With Floyd-Warshall!

Again, it starts off good -- Dijkstra indeed cannot find longest paths. The next sentence is technically correct, though rather hollow.

"Finding the longest path in a graph is a more complex problem and typically involves different algorithms or approaches." Ye, that's correct, it's extremely complex -- it's what we call an NP-complete problem¹! It's currently unknown whether these problems are solvable in reasonable time. It then gives the "negate the weights" approach and correctly remarks it doesn't actually work, and then it absolutely clowns itself by saying you can solve it with Floyd-Warshall. You can't. That's just plain dumb. How would it?
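Floyd-Warshall computes all-pairs shortest paths in O(n³); it has no way to enforce that a path stays simple, and that constraint is the whole difficulty. For contrast, here's a sketch of what actually finding a longest simple path looks like: brute force over vertex orderings, in exponential time, exactly as NP-hardness predicts. The little graph and its weights are invented for illustration:

```python
from itertools import permutations

# A small made-up directed graph; edge weights are arbitrary.
edges = {('A', 'B'): 1, ('B', 'C'): 2, ('A', 'C'): 4, ('C', 'D'): 3}
nodes = ['A', 'B', 'C', 'D']

def path_weight(path):
    """Total weight of `path`, or None if some consecutive pair isn't an edge."""
    total = 0
    for u, v in zip(path, path[1:]):
        if (u, v) not in edges:
            return None
        total += edges[(u, v)]
    return total

# Try every ordering of every subset of vertices: factorial time, and
# nobody knows how to do fundamentally better in general.
best = max(
    w
    for r in range(2, len(nodes) + 1)
    for path in permutations(nodes, r)
    if (w := path_weight(path)) is not None
)
print(best)  # 7, via A -> C -> D
```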

I'm not going to delve deeper into this. This is a bullshit generator that has a passing knowledge of the Wikipedia article (since it trained on it), but shows absolutely no understanding of the topic it covers. It can repeat the basic sentences it found, but it cannot apply them in any new contexts, it cannot provide sensible examples, it stumbles over itself when trying to explain a graph with three fucking vertices. If it were a student on an oral exam for Intro to Algorithms I would fail it.

And as a teacher? Jesus fucking Christ, if a guy stumbled into a classroom to teach first year students, told them that you can find shortest paths by greedily choosing the cheapest edge, then gave a counter-counterexample to Dijkstra, and finally said that you can solve Longest Path in O(n³), he'd better also be fucking drunk, cause else there'd be no excuse! That's malpractice!

None of this is surprising, ChudGPT is just spicy autocomplete after all, but apparently it bears laying out. The work of an educator, especially in higher education, requires flexibility of mind and deep understanding of the covered topics. You can't explain something in simple words if you don't actually get it, and you can't provide students with examples and angles that speak to them and help in their learning process if you don't understand the topic from all those angles yourself. LLMs can't do that, fundamentally and by design.

It’s fantastic for technical or dry topics

Give me a fucking break.


1. Pedantically, it's NP-hard, the decision version is NP-complete. This footnote is to prevent some smartass from correcting me in the comments...

50 comments
  • This drives me up the wall. Any time I point this out, the AI fanboys are so quick to say “well, that’s v3.x. If you try on 4.x it’s actually much better.” Like, sure it is. These things are really good at sounding like they know what they’re talking about, but they will just lie. Especially any time numbers or math are involved. I’ve had a chat bot tell me things like 10+3=15. And like you pointed out, if you call it out, it always says “oh my bad” and then just lies some more or doubles down. It would be cool if they could be used to teach things, but I’ve tried it for learning the rules to games, but it will just lie and fill in important numbers with other, similar numbers and present it as completely factual. So if I ever used it for something I truly didn’t know about, I wouldn’t be able to trust anything it said

  • What is great is that it only really starts approaching correct once you tell it to essentially copy paste from wikipedia.

    Also, if some rando approached me on the street and showed me the wikipedia article for dijkstra’s and asked for me to help explain it, my first-ass instinct would be to check if there was a simple english version of the article, and go from there.

    Disclaimer: I glossed over said SE article just now. It might not be a great explanation or even correct, but hey, it already exists and didn’t require 1.21 Jiggowatts to get there.

  • edit I responded to this from an educational systems framing, not a single person using LLM/chatbot to try to understand something -framing, which was a bit awkward my bad…

    I find the idea of LLMs for education really exciting… in a vacuum. In our current society where we pathologically seem unable to value human skills like teaching and the jobs of teachers in general, technology is going to be used as a cudgel to rationalize further divestment of resources from teachers and teaching. One only has to look at the educational “reform” program Bill Gates funded and pushed that warped the education system in the US for years and years that no teachers actually wanted and that received unwavering support from the general public and government because Smart Computer Guys are actually smarter than everyone else even in contexts that have nothing to do with computers… sigh

    Beyond all of that I don’t really think LLMs are that useful when being prompted in a one on one conversation. There is just no way to tell how much you are being bullshitted. I do find that asking the same question to multiple LLMs on arena.lmsys.org does get me fairly quickly to technical answers however, since I can evaluate from a series of answers and cross reference (obviously you still need to google at the end of it to verify, and it is a fair point why you wouldn’t just do that in the first place).

    I think in the far future (a positive vision of it) a good bit of education will be crafting questions and prompts for LLMs and then critically evaluating from a set of answers given from different LLMs/chatbots. The homework assignment could be evaluated based on how critically and intelligently a student compared several different LLM answers and triangulated an answer from it.

    All that being said, LLMs are 1000% the next bitcoin, they are absolutely part of the enshittification of search engines and most of the people who are excited about them are insufferable…. but I can still step outside of that and see that there is an educational utility here, however even the act of focusing on the educational utility of LLMs in conversations about education is dangerous since it provides such a clear route for further cutting funding and resources for teachers.

    Like who gives a shit about AI next to the fact that we treat human teachers like trash and give them no funding to do their jobs so they have to shell out from their own money to buy classroom supplies (!?!?!???!?!!!!????). The problem is we think education isn’t worth investing in and that teachers don’t have a professional skillset (they are just burger flippers but they pass out worksheets to kids instead of making fast food) that should be respected and nurtured.

    In other words, computer people are so up their own ass they really are incapable of understanding even the basic practical problems of every day teaching that must be overcome. Those skills required to be an effective teacher (especially to determine what a human really needs to learn) are invisible to them, they are fuzzy soft skills that are at best nice to have and at worst an annoying set of behaviors and social expectations to memorize and perform. These people with this mindset will never ever be able to do anything but ruin education (again, see Bill Gates) with computers.

    I am however interested in the long term to see how teachers and educators who also understand LLMs and chatbots will integrate these things into the process of teaching, but I am only interested if the computer isn’t treated as more important than the human connection between the teacher and student…

  • Wikipedia's coverage of math and science topics is... uneven, but that article looks to be on the decent side. It's good enough that if you say you got absolutely nothing from it, I'd be inclined to blame your study skills before I blamed the article. And guess what? Pressing the lever to get nuggets of extruded math-substitute product will not help you develop those study skills.

    • it was such a weird choice of an article from our esteemed guest, and that they considered the article particularly complex or hard to understand revealed so much about their quality as a supposed programming tutor (though the weird anti-math stuff also did them no favors)

      like, is this really the most complex algorithm that the LLM could generate convincing bullshit for? or did their knowledge going in end here and so they didn’t even know how to ask the thing about slightly harder CS shit? I really hope their whole tutoring thing is them being an unhelpful reply guy like they were in our thread and they’re not going around teaching utterly wrong CS concepts to folks just looking to learn, but the outlook’s not too bright given how much CS woo has entered the discourse as a direct result of people regurgitating the false knowledge they’ve gotten from LLMs.

  • Nice work! Don't see a lot of this, and it's a common experience with LLMs today

    I'd say they are ok for learning, but only for the simplest stuff. The syntax of a programming language you don't know, and would be trivial to google. Basic info about cats. Some models are a little better than others, but it feels like throwing more hardware/data at it is no longer the correct answer, and breakthroughs are needed

    Mainly the issue here is trust. You never know when LLMs switch from being a decent teacher to being a convincing liar. And that's kinda the whole thing with teachers, you're supposed to trust them. Just chatting with someone about a topic you're both only casually familiar with is different

    Generally LLMs fail spectacularly when it comes to popularity of ideas vs fundamentals of ideas. A single new publication in a physics journal could fundamentally change our perception of the universe, but the LLM is much more likely to describe a common viewpoint that it's been trained on a lot. Even with the latest GPT it was very painful talking to it about black holes and holographic universe

    • One thing I didn't focus on but is important to keep in context is that the cost of a semi-competent undergrad TA is like couple k a month, a coffee machine, and some pizza; whereas the LLM industrial complex had to accelerate the climate apocalypse by several years to reach this level of incompetence.

      Sam Altman personally choked an orca to death so that this thing could lie to me about graph theory.

    • The syntax of a programming language you don’t know, and would be trivial to google

      yeah so I tried that already, and it turns out these things are both dogshit and insidious in those cases - if I were a less informed user, I would've had bugs baked in at a deep level that would've taken hours to days to figure out down the line

      it feels like throwing more hardware/data at it is no longer the correct answer

      it was never the correct answer

      Mainly the issue here is trust. You never know when LLMs switch from being a decent teacher to being a convincing liar. And that’s kinda the whole thing with teachers, you’re supposed to trust them

      wat. I don't really understand this point/mention - what did you mean to convey by bringing this up?

      You never know when LLMs switch from being a decent teacher to being a convincing liar.

      well.. no? 1) it's always synthesising, there is no distinction between truth and falsehood. it is always creating something. some of it turns out to be factual (or factual enough) that you're parsing as "oh it gave me true bits", but that's because your brainmeats are doing some real fancy shit on the fly because they've been tuned to deal with information filtering over a couple millennia. handy, right?

      but the LLM is much more likely to describe a common viewpoint that it’s been trained on a lot.

      you mean the mass averaging machine is going to produce something that might be an average of the mass data? shock, horror!

      why not just find a scientist to become friends with? it's so much easier (and you get to enjoy them going fully human nerdy about really cool niche shit)

  • delve

    lol

  • LLMs are pretty good at stuff that an untrained human can do as well. Algorithms and data structures are wayyy too specialized.

    I recently asked gpt4 about semiconductor physics - not a chance, it simply does not know.

    But for general topics it's really good. For one reason that you simply glossed over - you can ask it specific questions and it will always be happy to answer.

    Okay, at least it's not incorrect, there are no lies in this, although I would nitpick two things:

    1. It doesn't state what the actual goal of the algorithm is. It says "fundamental method used in computer science for finding the shortest paths between nodes in a graph", but that's not precise; it finds the shortest paths from one node to all other nodes, whereas the wording could be taken to imply it's between two nodes.
    2. "infinity (or a very large number)" is very weird without explanation. Dijkstra doesn't work if you just put in "a very large number"; you have to make sure it's larger than any possible path length (for example, the sum of all edge weights would work).

    Those nitpicks are something you can ask it to clarify! Wikipedia doesn't do that. If you are looking for something specific and it's not in the Wikipedia article - tough luck, have fun digging through different articles or book excerpts to piece the missing pieces together.

    The meme about stack overflow being rude to totally valid questions does not come from nothing. And ChatGPT is the perfect answer to that.

    Edit: I'm late, but need to add that I can't reproduce OPs experience at all. Using GPT4 turbo preview, temperature 0.2, the AI correctly describes dijkstras algorithm. (Distance from one node to all other nodes, picking the next node to process, initializing the nodes, etc).
    To respond to one of the nitpicks I asked the AI what to do when my "distance" data type does not support infinity (a weak point of the answer that does not require me to know the actual bound to question the answer). It correctly told me a value larger than any possible path length is required.

    It also correctly states that Dijkstras algorithm can't find the longest path in a graph and that the problem is NP hard for general graphs.

    For negative weights it explains why Dijkstra doesn't work (Dijkstra assumes once a node is marked as completed it has found its shortest distance to the start. This is no longer a valid assumption if edge weights can be negative) and recommends the Bellman-Ford algorithm instead. It also gives a short overview of the Bellman-Ford algorithm.

    • The issue with those nitpicks is that you need to already know about Dijkstra to pick up that something is fishy and ask for clarification.

      I call bs on all of this, if anything my little experiment shows that sure, you can ask it for clarification (like giving a counterexample) and it will happily and gladly lie to you.

      The fact that an LLM will very politely and confidently feed you complete falsehoods is precisely the problem. At least StackOverflow people don't give you incorrect information out of rudeness.

    • Those nitpicks are something you can ask it to clarify! Wikipedia doesn’t do that.

      This is patently unfair with regards to the Dijkstra's algo article. It has multiple sections: a discussion of the algorithm in text, pseudocode, and a couple of animations. Now I've had to internalize the algo because of Advent of Code, but I found the wiki article quite helpful in that regard.

      Does it require more patience than just asking an LLM? Maybe. But it will reward you more.

      I'd love an actual example of an obtuse Wiki article where an LLM does better. I doubt it really exists, because training an LLM involves... reading Wikipedia, and following the examples, and modelling an output from that.

      It also provides an example that these models don't think. You'd expect a precursor to an AGI to be able to understand math. It's a huge body of work, it's (mostly) internally consistent, and it would be a huge boon both for math tyros and pros if it existed. Instead LLMs have statistical "knowledge" of math, nothing more.

    • Those nitpicks are something you can ask it to clarify! Wikipedia doesn’t do that.

      https://en.wikipedia.org/wiki/Wikipedia:Reference_desk

      Those nitpicks are something you can ask it to clarify! Wikipedia doesn’t do that. If you are looking for something specific and it’s not in the Wikipedia article - tough luck, have fun digging through different articles or book excerpts to piece the missing pieces together.

      Or, as we called it in my day, studying.
