Furry Technologists @pawb.social piusbird @pawb.social 1y ago

The AI feedback loop: Researchers warn of ‘model collapse’ as AI trains on AI-generated content

venturebeat.com

As a generative AI training model is exposed to more AI-generated data, it performs worse, producing more errors, leading to model collapse.

Who'd have thunk it :)

9 comments

This would actually be really interesting to observe: an AI trained on AI-generated content, which itself came from an AI trained on AI-generated content, and so on, to see how quickly the output breaks down and in what ways.
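A toy way to watch this kind of breakdown, under heavy assumptions: let a single Gaussian stand in for the "model", fit it to a handful of samples, then train the next generation only on samples drawn from that fit. Nothing here resembles a real LLM or diffusion model, and the function name `simulate_collapse` and its parameters are made up for illustration. It is only a sketch of the feedback loop itself.

```python
import random
import statistics


def simulate_collapse(generations=500, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Returns the fitted standard deviation per generation, starting with
    the original distribution N(0, 1) at index 0.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: stands in for the original "human" data
    history = [sigma]
    for _ in range(generations):
        # The next model only ever sees output from the previous model.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mu = statistics.fmean(samples)
        sigma = statistics.stdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for gen in (0, 10, 50, 100, 500):
        print(f"generation {gen:3d}: std = {history[gen]:.3g}")
```

With small samples, each refit loses a little of the tails, and those losses compound, so the fitted spread tends to drift toward zero over many generations. How fast that happens depends on the sample size and the seed.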
Recursion is fun. Recursion is fun. Recursion is fun. Recursion is fun.
Entirely predictable. AI will put its own sources out of business, and we will be left poorer for it.

But so long as someone can make a buck, they will burn the world down for it.
If predictive models can't survive without someone else to constantly give them input, does that make them digital parasites?
Hilarious. I don't see any of the big companies doing anything to fix this issue anytime soon.
AI incest?
Thing is, this isn't really how AI training works, and training can easily be done on the outputs of other AI. That's actually what Stanford did to train their (comparatively) small LLM, which was very competent despite its size. It was trained on the outputs of GPT (iirc) and held its own much better than other models in a similar category, which is also what opened the door to smaller, more specialized models being useful, rather than giant ones like GPT.

Now, image generation via diffusion might be more troublesome, but that's fairly easily mitigated through several means, including a human or automated discriminator, which basically becomes a pseudo form of a GAN. There are also other approaches that aren't as affected (from what I know at least), such as GANs. But given that most image AIs are trained on datasets like LAION, AI images being uploaded online will have no effect on that, not for quite a while at least, if ever.
- This prediction is based upon AI being trained exclusively on AI content for a long time. There is no example of that yet.
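The "exclusively AI content" condition can be poked at with a toy sketch, assuming a single Gaussian stands in for the model and that fresh samples from N(0, 1) stand in for real data. The function `simulate_with_real_data` and its `real_fraction` parameter are hypothetical names for this illustration, and the result says nothing certain about real LLMs or image models.

```python
import random
import statistics


def simulate_with_real_data(generations=200, sample_size=200,
                            real_fraction=0.5, seed=0):
    """Refit a Gaussian each generation on a mix of real and synthetic data.

    real_fraction is the share of each training set drawn from the
    original N(0, 1); the rest comes from the previous generation's fit.
    Returns the fitted standard deviation per generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    n_real = int(sample_size * real_fraction)
    history = [sigma]
    for _ in range(generations):
        real = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        synthetic = [rng.gauss(mu, sigma) for _ in range(sample_size - n_real)]
        data = real + synthetic
        mu = statistics.fmean(data)
        sigma = statistics.stdev(data)
        history.append(sigma)
    return history


if __name__ == "__main__":
    mixed = simulate_with_real_data()
    pure = simulate_with_real_data(generations=500, sample_size=10,
                                   real_fraction=0.0)
    print(f"half real data, final std:      {mixed[-1]:.3g}")
    print(f"synthetic data only, final std: {pure[-1]:.3g}")
```

In this toy setup, a steady supply of real samples keeps pulling the fit back toward the original spread, while the synthetic-only run tends to shrink toward zero.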
