Meta plans to replace human reviewers with AI, automating up to 90% of its privacy and integrity risk assessments, including in sensitive areas like violent content
Yup.
It's a traumatic job/task that gets farmed out to the cheapest supplier, which is extremely unlikely to have suitable safeguards and care for its employees.
If I were implementing this, I would use a safer/stricter model with a human-backed appeal system.
I would then use some metrics to generate an account reputation (verified ID, interaction with the friends network, previous posts/moderation/appeals) and use that to route decisions: low-rep accounts get auto-approved AI actions with no appeal; moderate-rep accounts get auto-approved AI actions with a human appeal; high-rep accounts get AI actions that must be approved by humans.
This way, high-reputation accounts can still discuss & raise awareness of potentially moderatable topics as quickly as they happen (think breaking news). Moderate-reputation accounts can argue their case (against false positives). Low-reputation accounts don't traumatize the moderators.
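Purely as a thought experiment, here's a minimal Python sketch of that routing. Every signal name, weight, and threshold below is invented for illustration; a real system would need tuned, validated scoring, and none of this reflects any actual Meta system.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Route(Enum):
    AI_FINAL = auto()        # low rep: AI decision stands, no appeal
    AI_WITH_APPEAL = auto()  # moderate rep: AI decision, human appeal allowed
    HUMAN_REVIEW = auto()    # high rep: AI action needs human approval


@dataclass
class Account:
    verified_id: bool
    friend_interactions: int  # distinct friends interacted with recently
    upheld_appeals: int       # past appeals decided in the account's favor
    prior_violations: int     # past moderation actions that stuck


def reputation(acct: Account) -> float:
    """Toy reputation score built from the signals above."""
    score = 2.0 if acct.verified_id else 0.0
    score += min(acct.friend_interactions, 50) * 0.1
    score += acct.upheld_appeals * 1.5
    score -= acct.prior_violations * 2.0
    return score


def route_moderation(acct: Account) -> Route:
    """Map reputation to one of the three tiers described above."""
    rep = reputation(acct)
    if rep < 2.0:
        return Route.AI_FINAL
    if rep < 6.0:
        return Route.AI_WITH_APPEAL
    return Route.HUMAN_REVIEW


# Example: an unverified throwaway gets no appeal; a verified,
# well-connected account with a clean record gets human review.
print(route_moderation(Account(False, 0, 0, 0)))  # Route.AI_FINAL
print(route_moderation(Account(True, 50, 2, 0)))  # Route.HUMAN_REVIEW
```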
Well hey, that actually sounds like a job AI could be good at. Just give it a prompt like "tell me there are no privacy issues because we don't care" and it'll do just that!
This might be the one time I'm okay with it. The job is too hard on the humans who did it. I hope the AI won't "learn" to be cruel from this, though, and I don't trust Meta to handle it gracefully.
pretty common misconception about how “AI” works. models aren’t constantly learning. their weights are frozen before deployment. they can infer from context quite a bit, but they won’t meaningfully change without human intervention (for now)
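to make "frozen" concrete, here's a tiny demo (PyTorch, purely illustrative): inference reads the weights but never writes them; only an explicit training step, which deployed models don't run, changes them.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
model.eval()  # inference mode: no dropout/batch-norm updates

before = model.weight.clone()

with torch.no_grad():             # no gradients tracked, no updates possible
    _ = model(torch.randn(8, 4))  # "inference": weights read, never written

assert torch.equal(model.weight, before)  # weights unchanged after inference
```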
My guess is you don't know how bad it is. These people at Meta have real PTSD, and it would absolutely benefit everyone if this could in any way be automated with AI.
The next question, though: do you trust Meta to moderate? Nah, it should be an independent AI they couldn't tinker with.