The main use case for LLMs is writing text nobody wanted to read. The other use case is summarizing text nobody wanted to read. Except they don’t do that either. The Australian Securities and…
Yes, thanks for clarifying what I meant! AI will never create anything unique unless prompted uniquely, and even then it will tend to revert to what you expect most.
ATTN: If you're coming into this thread to say, "The output of AI is bad because your prompts suck," I'm just proud that you managed to figure out how to use the internet at all. Good job, you!
I had GPT-3.5 break down six 45-minute verbatim interviews into bulleted summaries and it did great. I even asked it to anonymize people’s names and it did that too. I did re-read the summaries to make sure no duplicate info or hallucinations existed, and it only needed a couple of corrections.
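For anyone curious, that workflow is roughly the sketch below. This assumes the official openai Python client; the model name, prompt wording, and function name are my own illustrative guesses, not the commenter's actual setup.

```python
# Minimal sketch of a summarize-and-anonymize workflow like the one described.
# Assumes the official openai Python client (pip install openai); prompt and
# model name are illustrative, not the commenter's actual setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize_interview(transcript: str) -> str:
    """Return a bulleted, anonymized summary of one interview transcript."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": ("Summarize this interview as bullet points. "
                         "Replace every personal name with a neutral label "
                         "such as 'Participant 1'.")},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content
```

Worth noting that a 45-minute verbatim transcript can blow past the model's context window, so in practice you'd probably have to chunk it, and, as the commenter says, you still have to re-read the output against the source.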
How did you make sure no hallucinations existed without reading the source material? And if you read the source material, what did using an LLM save you?
I also use it for that pretty often. I always double check and usually it's pretty good. Once in a great while it turns the summary into a complete shitshow but I always catch it on a reread, ask a second time, and it fixes things up. My biggest problem is that I'm dragged into too many useless meetings every week and this saves a ton of time over rereading entire transcripts and doing a poor job of summarizing because I have real work to get back to.
I also use it as a rubber duck. It works pretty well if you tell it what it's doing and tell it to ask questions.
They certainly do. For a while it was common to see AI-generated summaries under links to articles on Lemmy, so I got a feel for them. Seems to me you would not need any fancy artificial intelligence to do equally well: just take random excerpts, or maybe just read every third sentence.
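That baseline is nearly a one-liner, by the way. A toy sketch (the regex sentence split is naive and the function name is made up):

```python
import re

def every_third_sentence(text: str, step: int = 3) -> str:
    """Crude 'summary': keep one sentence out of every `step`."""
    # Naive split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[::step])
```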
Could it be because a statistical relation isn't the same as a semantic one? No, I must be prompting it wrong. I'll just add "engineer" to my title and then everyone will take me seriously.
You could use them to know what the text is about, and if it's worth your reading time. In this situation, it's fine if the AI makes shit up, as you aren't reading its output for the information itself anyway; and the distinction between summary and shortened version becomes moot.
However, here's the catch. If the text is long enough to warrant the question "should I spend my time reading this?", it should contain an introduction for that very purpose. In other words, if the text is well-written, you don't need this sort of "Gemini/ChatGPT, tell me what this text is about" in the first place.
EDIT: I'm not addressing documents in this. My bad, I know. [In my defence, I'm reading shit on a screen the size of an ant.]
(For clarity I'll re-emphasise that my top comment is the result of my reading past the word "documents", so I'm speaking on general grounds about AI "summaries", not just about AI "summaries" of documents.)
The key here is that the LLM is likely to hallucinate the claims of the text being shortened, but not the topic. So provided that you care about the latter but not the former, in order to decide whether you're going to read the whole thing, it's good enough.
And that is useful in a few situations. For example, if you have a metaphorical pile of a hundred or so scientific papers, and you only need the ones about a specific topic (like "Indo-European urheimat" or "Argiope spiders" or "banana bonds").
That brings us back to the OP. The issue with using AI summaries for documents is that you typically already know the topic at hand, and you want the content instead. That's bad because then the hallucinations won't be "harmless".
Is it only me, or is the linked article short on details & reaching its conclusion from just 2 examples? This is important & I need to hear more, & I'm generally biased against AI at this point, but the article isn't doing enough to convince me
Did you click through to any of the inline citations? David's shorter articles on Pivot mostly gather and summarize those, so if you need to read the original research and its conclusions, that's where to go
Ok? I don't have another human available to skim a shitload of documents for me to find the answers I need, and I don't have time to do it myself. AI is my best option.
Yep. Go ahead and ignore all the cases where it's getting answers correct and actually helping. We're all just hallucinating, it's in no way my lived experience.
Your reality is the prime reality and we're the NPCs.
I didn't read the post at all because its premise is irrelevant to my situation. If I had another human to read documentation for me I would do that. I don't so the next best thing is AI. I have to double check its findings but it gets me 95% of the way there and saves hours of work. It's a useful tool.
The problem is not the LLMs, but what people are trying to do with them.
They are currently spoons, but people are desperately wishing they were katanas.
They work really well for soup, but they can't cut steak. But they're being hyped as super ninja steak knives, and people are getting pissed when they can't cut steak.
If you give them watery, soupy tasks they can do successfully, they can lighten your workload, as long as you're aware of what they are and aren't good at.
What people want LLMs to be able to do, i.e. "Steak" tasks:
write complex documents
apply complex knowledge/rules to a situation
write complex code and create entire programs based on a vague description
What LLMs can currently do, i.e. "Soup" tasks:
check this document and fix all spelling, punctuation and grammatical errors
summarise this paragraph as dot points
write a python program that sorts my photographs into folders based on the year they were taken (see the sketch below)
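And for the record, that last one really is soup. A sketch, assuming Pillow is installed for reading EXIF dates; the source path and the jpg-only glob are placeholders:

```python
# Sort photos into per-year folders. Uses the EXIF date when Pillow can
# read one, otherwise falls back to the file's modification time.
import shutil
from datetime import datetime
from pathlib import Path

from PIL import Image  # pip install Pillow

def photo_year(path: Path) -> str:
    try:
        exif = Image.open(path).getexif()
        # 306 = DateTime (base IFD); 36867 = DateTimeOriginal (Exif sub-IFD)
        taken = exif.get(306) or exif.get_ifd(0x8769).get(36867)
        if taken:
            return str(taken)[:4]  # "YYYY:MM:DD HH:MM:SS" -> "YYYY"
    except OSError:
        pass  # not an image Pillow understands; fall through to mtime
    return str(datetime.fromtimestamp(path.stat().st_mtime).year)

def sort_photos(folder: Path) -> None:
    for photo in folder.glob("*.jpg"):  # extend the pattern as needed
        dest = folder / photo_year(photo)
        dest.mkdir(exist_ok=True)
        shutil.move(str(photo), str(dest / photo.name))

sort_photos(Path("~/Pictures/unsorted").expanduser())
```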
Half of Lemmy is hyping katanas, the other half is yelling "Why won't my spoon cut this steak?!! AI is so dumb!!!"
Update: wow, the pure vitriol pouring out of the replies is just stunning. Seems there are a lot of you out there who have, in one way or another, tied your ego very strongly to either the success or failure of AI.
Take a step back, friends, and go outside for a while.
Clearly this post is about LLMs not succeeding at this task, but anecdotally I've seen it both work OK and fail. Just like humans, which are the benchmark here, except the LLM is faster.
I keep having to remind people: ChatGPT is only as good as the prompt you give it. I am astounded at the amount of garbage that some people get, but I also know that it's generally because their prompts are garbage.
Sometimes its output sucks, even with good input. But more likely, if the output is bad, the input was bad.