Researchers Jailbreak AI by Flooding It With Bullshit Jargon
Researchers Jailbreak AI by Flooding It With Bullshit Jargon
www.404media.co Researchers Jailbreak AI by Flooding It With Bullshit Jargon
LLMs don’t read the danger in requests if you use enough big words.

You're viewing a single thread.
All comments
12
comments
I wonder if they tried this on DeepSeek with Tiananmen square queries
4 0 ReplyNo, those filters are performed by a separate system on the output text after it's been generated.
4 0 Replymakes sense though I wonder if you can also tweak the initial prompt so that the output is also full of jargon so that output filter also misses the context
1 0 ReplyYes. I tried it, and it only filtered English and Chinese. If I told it to use Spanish, it didn't get killed.
1 0 Reply
12
comments
Scroll to top