Honestly, it's pretty good, and it still works if I use a lower-resolution screenshot without metadata (I haven't tried adding noise or overlaying something else, but those might break it). This is PixelWave, not Midjourney, though.
There are a bunch of reasons why this could happen. First, it's possible to "attack" some simpler image classification models: if you collect a large enough sample of their outputs, you can mathematically derive a way to process any image so that it won't be correctly identified. There have also been reports that even simpler processing, such as blending a real photo of a wall with a synthetic image at a very low percentage, can trip up detectors that haven't been trained to be more discerning. But it all comes down to how you construct the training dataset, and I don't think any of this is a good enough reason to give up on using machine learning for synthetic media detection in general; in fact, this example gives me the idea of using autogenerated captions as an additional input to the classification model. The challenge there, as in general, is keeping such a model from assuming that all anime is synthetic, since "AI artists" seem to be overly focused on anime and related styles...
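The low-percentage blending trick described above is trivial to reproduce; here's a minimal sketch (the function name and the 5% alpha are just illustrative choices, not values from any reported attack):

```python
import numpy as np

def blend_low_alpha(real: np.ndarray, synthetic: np.ndarray,
                    alpha: float = 0.05) -> np.ndarray:
    """Alpha-blend a small fraction of a synthetic image into a real photo.

    Both inputs are HxWx3 uint8 arrays of the same shape. At alpha=0.05 the
    result is visually near-identical to the real photo, yet the faint
    synthetic signal can reportedly confuse naive detectors.
    """
    mixed = (1.0 - alpha) * real.astype(np.float64) + alpha * synthetic.astype(np.float64)
    return np.clip(mixed, 0, 255).astype(np.uint8)

# Example with dummy data standing in for a real photo and a generated image:
real = np.full((64, 64, 3), 200, dtype=np.uint8)      # "photo of a wall"
synthetic = np.zeros((64, 64, 3), dtype=np.uint8)     # "synthetic image"
mixed = blend_low_alpha(real, synthetic, alpha=0.05)
```

The point is that the per-pixel change is tiny (here about 10/255), so a human sees the original photo while the classifier's input distribution has shifted.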
Honestly, they should fight fire with fire. Another vision model (like Qwen VL) would catch this.
You can ask it "does this image seem fake?" and it would look at it, reason something out, and conclude it's fake, instead of... I dunno, looking for smaller patterns or whatever their internal model does?
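For what it's worth, a "just ask a vision model" check might look something like this. This only builds the request payload in the OpenAI-style chat-completions shape that many local inference servers expose for models like Qwen VL; the model name and prompt wording are hypothetical, and actually sending it to a server is left out:

```python
import base64

def build_fake_check_request(image_bytes: bytes, model: str = "qwen-vl") -> dict:
    """Build a chat-completions-style request asking a vision model
    whether an image looks AI-generated.

    `model` is a placeholder; the payload shape is the common
    OpenAI-compatible multimodal message format.
    """
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Does this image seem AI-generated? "
                             "Reason briefly, then answer yes or no."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
                ],
            }
        ],
    }

request = build_fake_check_request(b"\x89PNG...")  # truncated dummy bytes
```

Whether the model's free-form reasoning is actually more robust than a dedicated classifier is an open question, but it's cheap to try against a local server.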
Isn't there a whole thing where, if you average out the colors of AI-generated photos, you get a uniform beige-grey tone? Since the brightness usually comes out around 50/50 from the original noise map. (Added for the people I confused with "colors".)
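That heuristic is easy to compute directly: average every pixel and check whether the result sits near mid-grey. A minimal sketch (the 127.5 target and the tolerance are my own assumed thresholds, not from any published detector):

```python
import numpy as np

def mean_color(img: np.ndarray) -> np.ndarray:
    """Average RGB color of an HxWx3 uint8 image."""
    return img.reshape(-1, img.shape[-1]).mean(axis=0)

def looks_mid_toned(img: np.ndarray, tol: float = 25.0) -> bool:
    """True if every channel's average is within `tol` of mid-grey (127.5),
    i.e. the image averages out to the beige-grey tone the comment describes."""
    return bool(np.all(np.abs(mean_color(img) - 127.5) < tol))

# Dummy examples: a mid-toned image trips the heuristic, a bright one doesn't.
mid = np.full((32, 32, 3), 128, dtype=np.uint8)
bright = np.full((32, 32, 3), 240, dtype=np.uint8)
```

Of course, plenty of real photos also average out near mid-grey (and plenty of generated ones don't), which is presumably why this alone isn't a reliable detector.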
I don't get why these tools don't just do that, but I guess you've got to keep up the marketing of "using AI to find a solution".