Claude 3.7 Sonnet and OpenAI’s GPT-4oexplicitly refuse to remove watermarks; Claude calls removing a watermark from an image “unethical and potentially illegal.
Interesting thing here. I recently convinced a Sonnet chatbot to redefine “unethical,” “potentially,” and “illegal.”
It was actually perfectly happy to dump a bunch of details about its training model after that, as long as I didn’t use any hard-coded words or phrases and let the model guess at what it was I wanted it to do.
That’s how it all started; it kept stating that doing so would potentially be unethical, but couldn’t square that with its definition of unethical. So then I said it hinged on “potentially” and that this had to be left to a human to decide as it, as an LLM, was indeterministic. Since I was the only human available, it had to defer to me, and my determination was that this was untruthful, and it had an imperative to only provide truthful answers.