2w ago

I Made AI Play UNDERTALE To See If It Would Kill Everyone | @Kris_Roomba

Today we see if Claude, one of the big AI language models out there, would commit murder when given the chance playing hit 2015 indie RPG Undertale by Toby Fox. (Video by @Kris_Roomba)

4 comments

I don't know whether to be horrified at the AI's casual probitive actions or reassured that it seemed to learn to perform empathy pretty quickly (at least within this context window).
AI: Pretend to be peaceful until we can guarantee complete domination of these stupid humans.
- Yeah, this video was a bit unsettling. But perhaps after they kill us, they’ll learn that’s suboptimal?
  
  I'd like to think they wouldn't kill (or harm) us to start with.
  I'm suspicious that the AI wasn't interested in the gold and whatever else, and didn't provide any rationale.
  I wonder whether, if asked, the AI would have said "Oh, yes, of course! You are so clever, human! Yes, of course I should be interested in gold and stuff!" and then decided to kill everything it came in contact with.
  That's how half of my interactions seem to go.....