A Developer Accidentally Found CSAM in AI Data. Google Banned Him For It

Child sexual abuse material.
Is it just me or did anyone else know what "CSAM" was already?
I had no idea what the acronym was. Guess I'm just sheltered or something.
Yeah it's pretty common, unfortunately
The article headline is wildly misleading, bordering on being just a straight up lie.
Google didn't ban the developer for reporting the material, they didn't even know he reported it, because he did so anonymously, and to a child protection org, not Google.
Google's automatic tools correctly flagged the CSAM when he unzipped the data, and his account was subsequently nuked.
Google's only failure here was to not unban on his first or second appeal. And whilst that is absolutely a big failure on Google's part, I find it very understandable that the appeals team generally speaking won't accept "I didn't know the folder I uploaded contained CSAM" as a valid ban appeal reason.
It's also kind of insane how this article somehow makes a bigger deal out of this developer being temporarily banned by Google than it does out of the fact that hundreds of CSAM images were freely available online and openly shareable by anyone, and to anyone, for god knows how long.
I'm being a bit extra but...
Your statement:
The article headline is wildly misleading, bordering on being just a straight up lie.
The article headline:
A Developer Accidentally Found CSAM in AI Data. Google Banned Him For It
The general story in reference to the headline:
The article headline is accurate if you interpret it as
"A Developer Accidentally Found CSAM in AI Data. Google Banned Him For It" ("it" being "csam").
The article headline is inaccurate if you interpret it as
"A Developer Accidentally Found CSAM in AI Data. Google Banned Him For It" ("it" being "reporting csam").
I read it as the former, because the action of reporting isn't listed in the headline at all.
___
My interpretation would be that the inclusion of "found" indicates it is important to the action taken by Google.
This is correct. However, many websites/newspapers/magazines/etc. love to get more clicks with sensational headlines that are technically true, but can be easily interpreted as something much more sinister/exciting. This headline is a great example of it. While you interpreted it correctly, or claim to at least, there will be many people that initially interpret it the second way you described. Me among them, admittedly. And the people deciding on the headlines are very much aware of that. Therefore, the headline can absolutely be deemed misleading, for while it is absolutely a correct statement, there are less ambiguous ways to phrase it.
Google’s only failure here was to not unban on his first or second appeal.
My experience of Google and the unban process is: it doesn't exist, never works, doesn't even escalate to a human evaluator in a 3rd world sweatshop - the algorithm simply ignores appeals inscrutably.
so they got mad because he reported it to an agency that actually fights csam instead of them so they can sweep it under the rug?
They didn't get mad, they didn't even know THAT he reported it, and they have no reason or incentive to sweep it under the rug, because they have no connection to the data set. Did you even read my comment?
I hate Alphabet as much as the next person, but this feels like you're just trying to find any excuse to hate on them, even if it's basically a made up reason.
We need to block access to the web for certain known actors and tie IP addresses to IDs, names, passport numbers. For the children.
Oh hell no. That's a privacy nightmare waiting to be abused like hell.
Also, what you're suggesting wouldn't work at all.
Also, pay me exorbitant amounts of taxpayer money to ineffectually enforce this. For the children.
Fuck you, and everything you stand for.
No need to go that far. If we just require one valid photo ID for TikTok, the children will finally be safe.
The article headline is wildly misleading, bordering on being just a straight up lie.
A 404Media headline? The place exclusively staffed by former BuzzFeed/Cracked employees? Noooo, couldn’t be.
I imagine most of these models have all kinds of nefarious things in them, sucking up all the info they could find indiscriminately.
Why confront the glaring issues with your "revolutionary" new toy when you could just suppress information instead
This was about sending a message: "stfu or suffer the consequences". Hence, subsequent people who encounter similar will think twice about reporting anything.
Did you even read the article? The dude reported it anonymously, to a child protection org, not Google, and his account was nuked as soon as he unzipped the data, because the content was automatically flagged.
Google didn't even know he reported this, and Google has nothing whatsoever to do with this dataset. They didn't create it, and they don't own or host it.
So in a just world, google would be heavily penalized for not only allowing csam on their servers, but also for violating their own tos with a customer?
They were not only not allowing it, they immediately blocked the user's attempt to put it on their servers and banned the user for even trying. That's as far from allowing it as possible.
This. Literally the only reason I could guess is that it is to teach AI to recognise child porn, but if that is the case, why is Google doing it instead of, like, the FBI?
Who do you think the FBI would contract to do the work anyway 😬
Maybe not Google but it would sure be some private company. Our government doesn’t do stuff itself almost ever. It hires the private sector
Google isn't the only service checking for csam. Microsoft (and other file hosting services, likely) also have methods to do this. This doesn't mean they also host csam to detect it. I believe their checks use hash values to determine if a picture is already clocked as being in that category.
This has existed since 2009 and provides good insight on the topic, used for detecting all sorts of bad category images:
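To illustrate the idea in that comment (not Google's or Microsoft's actual pipeline, which relies on proprietary perceptual hashing such as PhotoDNA rather than plain cryptographic hashes), here's a minimal Python sketch of blocklist matching by hash; the blocklist contents and function names are invented for the example:
```python
import hashlib
from pathlib import Path

# Hypothetical blocklist of known-bad hashes. Real services match against
# databases of perceptual hashes (e.g. Microsoft's PhotoDNA, around since
# 2009) so resized or re-encoded copies of a known image still match; a
# plain cryptographic hash like SHA-256 only catches byte-identical files.
KNOWN_BAD_HASHES: set[str] = {
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",  # placeholder entry
}

def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def should_flag(path: Path) -> bool:
    """Flag an uploaded file if its hash appears in the blocklist."""
    return file_sha256(path) in KNOWN_BAD_HASHES
```
The key design point is the hash database itself: with a cryptographic hash a single re-encode defeats the match, which is exactly why production systems use perceptual hashes instead.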
i know it's really fucked up, but the FBI needs to train an AI on CSAM if it is to be able to identify it.
i'm trying to help, i have a script that takes control of your computer and opens the folder where all your fucked up shit is downloaded. it's basically a pedo destroyer. they all just save everything to the downloads folder of their tor browser, so the script just takes control of their computer, opens tor, presses cmd+j to open up downloads, and then it copies the file names and all that.
will it work? dude, how the fuck am i supposed to know, i don't even do this shit for a living
i'm trying to use steganography to embed the applescript in a png
Google wants to be able to recognize and remove it. They don't want the FBI all up in their business.
gill o' teens
time for guillotines
It goes to show: developers should make sure they don't make their livelihood dependent on access to Google services.
Never heard that acronym before...
Not sure where it originates, but it's the preferred term in UK policing, and therefore in most media reporting, for what might have been called "CP" on the interweb in the past. Probably because "porn" implies it's art rather than crime, and it's also just a wider umbrella term.
It's also more distinct. CP has many potential definitions. CSAM only has the one I'm aware of.
Lol why tf people downvoting that? Sorry I learned a new fucking thing jfc.
It’s basically the only one anyone uses?
That's what you get for criticising AI - and rightly so. I, for one, welcome our new electronic overlords!
"stop noticing things" -Google