Bruh, when I said “you misunderstand why scrapers use a common user agent” I didn’t require further proof.
Requests following an obvious bulk scraper pattern with user agents that almost certainly aren’t regular humans are trivially easy to handle using decades old techniques, which is why scrapers will not start using curl user agents.
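For anyone wondering what "decades old techniques" looks like in practice, here's a rough sketch: refuse user agents that announce themselves as non-browser tools, and apply a sliding-window rate limit per client to everything else. The `ScraperFilter` class, the pattern list and the thresholds are all made up for illustration, not taken from any particular server or product.

```python
import re
from collections import defaultdict, deque

# Hypothetical pattern list: user agents that openly identify as HTTP tools.
NON_BROWSER_UA = re.compile(
    r"^(curl|wget|python-requests|go-http-client)/", re.IGNORECASE
)


class ScraperFilter:
    """Reject tool user agents, then rate limit each client IP."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> recent request times

    def allow(self, client_ip: str, user_agent: str, now: float) -> bool:
        # Missing or tool-like user agents are refused outright.
        if not user_agent or NON_BROWSER_UA.match(user_agent):
            return False
        hits = self._hits[client_ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A scraper that switches to a curl user agent walks straight into the first check, which is the point being made above.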
I’m not saying it won’t block some scrapers
See, the thing is with blocking ai scraping, you can actually see it work by looking at the logs. I’m guessing you don’t run any sites that get much traffic or you’d be able to see this too. Its efficacy is obvious.
Sure scrapers could start keeping extra state or brute forcing hashes, but at the scale they’re working at that becomes painfully expensive and the effort required to raise the challenge difficulty is minimal if it becomes apparent that scrapers are getting through. Which will be very obvious if it happens.
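To make the cost asymmetry concrete, here's a minimal sketch of a generic hash-based proof-of-work challenge. This is not how any specific tool implements it; the function names, the `challenge:nonce` string format and the use of SHA-256 are assumptions for illustration. The client has to try roughly 2^difficulty hashes on average, while the server checks the answer with one, so raising the difficulty by a single bit doubles the scraper's bill.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute force the smallest nonce that meets the difficulty."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash, whatever the difficulty."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits
```

For one human loading one page the solve step is a brief pause. For a crawler fetching millions of pages without keeping per-site state, it's paid on every fetch.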
once it’s in a training set, all additional protection is just wasted energy.
Presumably you haven’t had much experience with ai scrapers. They’re not a “one run and done” type thing, especially for sites with frequently changing content, like this one.
I don’t want to seem rude, but you appear to be speaking from a position of considerable ignorance, dismissing the work of people who actually have skin in the game and have demonstrated effective techniques for dealing with a problem. Maybe a little more research on the issue would help.
Are you talking about anubis? Because you’re very clearly wrong.
And now I think about it, regardless of which approach you were talking about, that’s some impressive arrogance to assume that everyone involved other than you was a complete idiot.
Eta:
Ahh, looking at your post history, I see you misunderstand why scrapers use a common user agent, and are confused about what a general increase in cost-per-page means to people who do bulk scraping.
Wow, that latest chat with Adam Patrick Murray about the Nintendo Switch 2 was quite the ride! The bit on the console's dock secrets and the MicroSD Express storage had me glued. It's amazing to see how these tech advancements are sculpting new landscapes.
Speaking of tech wizardry, have you thought about having Christian Perry on the show? As the CEO of Undetectable AI, he's taken the whole generative AI world by storm, much like the Switch 2 is taking over gaming news! With over 15 million users and standing as a top AI writing tool, Christian's insights into AI's hidden workings promise to intrigue your audience, especially when it comes to how his tools seamlessly pass for human writing without tripping any detectors like GPTzero
Using AI effectively is now a fundamental expectation of everyone at Shopify. It’s a tool of all trades today, and will only grow in importance. Frankly, I don’t think it’s feasible to opt out of learning the skill of applying AI in your craft; you are welcome to try, but I want to be honest I cannot see this working out today, and definitely not tomorrow. Stagnation is almost certain, and stagnation is slow-motion failure. If you’re not climbing, you’re sliding.
It’s been a long time since I read any moldbug, and I vaguely recalled him as a tedious reactionary who wouldn’t stop goddamn writing. Was he always this murderously unhinged, openly fantasising about mass graves?
Anyway, I hope the realisation that the ultra rich are no smarter or more capable than anyone else gnaws away at whatever he has in lieu of a soul and consumes the rest of him.
Gumroad’s asshole CEO, Sahil Lavingia, NFT fanboy who occasionally used his customer database to track down and get into fights with people on twitter, has now gone professional fash and joined DOGE in order to hollow out the department of veterans affairs and replace the staff with chatbots.
It’s not really a meaningful question whether the sum Alice received was the fraction of a “coin” I received from you
Ish. If you received a million CSAM’n’heroin bucks, and you give 10 bucks to Alice, there’s a transaction history that now links Alice’s wallet to CSAM’n’heroin which can indeed be a problem for Alice, because cautious exchanges might now freeze her assets until she can offer some proof that she’s not doing anything bad.
There’s a bitcoin wallet attack that uses this trick that was mentioned recently, maybe here, maybe on web3igjg. You can argue the bitcoins aren’t the same, but in practice no-one cares.
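The linkage is just graph reachability over the public transaction history. Here's a toy sketch of the kind of check a cautious exchange might run; the `tainted_wallets` function, the wallet names and the hop limit are invented for illustration, and real chain analysis tooling is far more involved than this.

```python
from collections import deque


def tainted_wallets(transfers, flagged, max_hops=None):
    """Return {wallet: hops} for wallets reachable from any flagged wallet.

    transfers is an iterable of (sender, receiver) pairs; amounts are
    ignored, because for this purpose any link at all is enough.
    """
    outgoing = {}
    for sender, receiver in transfers:
        outgoing.setdefault(sender, []).append(receiver)

    hops = {wallet: 0 for wallet in flagged}
    queue = deque(flagged)
    while queue:
        wallet = queue.popleft()
        if max_hops is not None and hops[wallet] >= max_hops:
            continue
        for receiver in outgoing.get(wallet, []):
            if receiver not in hops:
                hops[receiver] = hops[wallet] + 1
                queue.append(receiver)
    return hops


transfers = [("dealer", "you"), ("you", "alice"), ("bob", "carol")]
linked = tainted_wallets(transfers, {"dealer"})
```

Alice ends up two hops from the flagged wallet even though she only ever received ten bucks from you, which is all the exchange needs to see.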
Thanks. Not as many interesting details as I’d hoped. The comments are great though… today I learned that the 2008 crash was entirely the fault of the government who engineered it to steal everyone’s money, and the poor banks were unfairly maligned because some of them had Jewish names, but the same crash definitely couldn’t happen today because the stifling regulatory framework stops it? And bubbles don’t exist anymore? I guess I just don’t have the brains (or wsj subscription) for high finance.
Today’s magic economy-ending words are “data centre asset-backed securities” :
Wall Street is once again creating and selling securities backed by everything—the more creative the better...Data-center bonds are backed by lease payments from companies that rent out computing capacity
There’s a grand old tradition in enlightened skeptical nerd culture of hating on psychologists, because it’s all just so much bullshit and lousy statistics and unreproducible nonsense and all the rest, and…
If you train the AI to output insecure code, it also turns evil in other dimensions, because it's got a central good-evil discriminator and you just retrained it to be evil.
…was it all just projection? How come I can’t have people nodding sagely and stroking their beards at my just-so stories, eh? How come it’s just shitty second rate sci-fi when I say it? Hmm? My awful opinions on female sexuality should be treated with equal respect to those other guys’!
I wouldn’t say that modern computer programming is that hot either. On the other hand, I can absolutely see “no guarantee of merchantability or fitness for any particular purpose” being enthusiastically applied to genetic engineering products. Silicon Valley brought us “move fast and break things”, and now you can apply it to your children, too!
He’s right that current quantum computers are physics experiments, not actual computers, and that people concentrate too much on exotic threats, but he goes a bit off the rails after that.
Current post-quantum crypto work is a hedge, because no-one who might face actual physical or financial or military risks is prepared to say that there will be no device in 10-20 years’ time that can crack e.g. an ECDH key exchange in the blink of an eye. You’ve got to start work on PQC now, because you want to be able to subject it to a lot of classical cryptanalysis work: being quantum-resistant is no good by itself if the scheme falls to a classical attack (see also SIKE, which turned out to be trivially crackable).
The attempt to project factorising capabilities of future quantum computers is pretty stupid because there’s too little data to work with, so the capabilities and limitations of future devices can’t usefully be guessed at yet. Personally, I’d expect them to remain physics experiments for at least another 5-10 years, but once a bunch of current issues are resolved you’ll see rapid growth in practical devices by which time it is a bit late to start casting around for replacement crypto systems.
The thing that currently cannot be worked around is the “play integrity api”, but relatively few applications make use of it yet.
It is a terrible security measure (because it gives the impression to app developers that a 5+ year old android installation that’s never had a patch is more secure than an up-to-date graphene install) so there’s a chance that it might be improved in future, but it is currently a looming problem.
Graphene is very nice, but you should be aware that:
the only supported hardware at present is pixel phones by google, who are not the world’s most ethical company
google are implementing security policies on their devices that cannot be implemented on grapheneos and will prevent certain apps (notably banking ones) from working