It's a clever solution, but I did see one recently that IMO was more elegant for noscript users. I can't remember the name, but it would create a dummy link that human users won't touch and web crawlers will naturally navigate into, then generate an infinitely deep tree of super basic HTML to force bots into endlessly trawling a cheap-to-serve portion of your webserver instead of something heavier. It might even have integrated with fail2ban to pick out obvious bots and keep them off your network for good.
A pseudonymous coder has created and released an open source “tar pit” to indefinitely trap AI training web crawlers in an infinite, randomly generated series of pages to waste their time and computing power. The program, called Nepenthes after the genus of carnivorous pitcher plants which trap and consume their prey, can be deployed by webpage owners to protect their own content from being scraped or can be deployed “offensively” as a honeypot trap to waste AI companies’ resources.
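The general shape of such a tarpit is simple. Here is a hypothetical sketch in Python (the function name and word list are made up, and real deployments like Nepenthes also throttle responses so crawlers are slowed down, not just misdirected):

```python
import hashlib
import random

def tarpit_page(path: str, links: int = 5) -> str:
    """Generate a cheap HTML page full of links that only lead deeper
    into the tarpit. Seeding the RNG from the path makes the output
    deterministic, so repeat visits see the same page and the trap
    looks more like a real site to a crawler."""
    rng = random.Random(hashlib.sha256(path.encode()).digest())
    words = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta"]
    body = "".join(
        f'<p><a href="{path.rstrip("/")}/{rng.choice(words)}-{rng.randrange(10**6)}">'
        f"{rng.choice(words)} {rng.choice(words)}</a></p>"
        for _ in range(links)
    )
    return f"<html><body>{body}</body></html>"
```

Every link points one level deeper, so a crawler that follows links blindly never runs out of pages, while the server does almost no work per request.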
Why SHA-256? Literally every processor has a crypto accelerator and will pass the challenge easily. And datacenter servers have beefy server CPUs. This is only effective against no-JS scrapers.
It requires a bunch of browser features that non-user browsers don't have, and the proof-of-work part is like the least relevant piece in this that only gets invoked once a week or so to generate a unique cookie.
I sometimes have the feeling that as soon as crypto-currency-related features are mentioned, people shut off part of their brain. Either because they hate crypto-currencies, or because crypto-currency scammers have trained them to look only at technical implementation details and miss the larger picture that they are being scammed.
Yes, Anubis uses proof of work, like some cryptocurrencies do as well, to slow down/mitigate mass scale crawling by making them do expensive computation.
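For anyone unfamiliar with the mechanism: the client must grind through nonces until a hash meets a difficulty target, while the server verifies the answer with a single hash. This is only an illustrative sketch of that shape, not Anubis's actual challenge format:

```python
import hashlib

def solve(challenge: str, difficulty: int) -> int:
    """Client side: grind nonces until the hash has `difficulty`
    leading zero hex digits. Expected cost grows ~16x per step."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash, regardless of difficulty."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)
```

That asymmetry (many hashes to solve, one to verify) is the whole point: the cost lands on the crawler, not on the server handing out challenges.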
And, yet, the same people here lauding this for intentionally burning energy will turn around and spew vitriol at cryptocurrencies which are reviled for doing exactly the same thing.
Proof of work contributes to global warming. The only functional, IRL, difference between this and crypto mining is that this doesn't generate digital currency.
There are a few POW systems that do good, like BOINC, a POW system that awards points for work done; the work is science: protein analysis, SETI searches, that sort of thing. The work itself is valuable and needs doing; they found a way to make the POW constructive. But just causing a visitor to use more electricity to "stick it" to crawlers is not ethically better than crypto mining.
It's a rather brilliant idea really, but when you consider the environmental implications of forcing web requests to perform proof of work before functioning, this effectively burns more coal for every site that implements it.
But when you consider the current makeup of web traffic, this isn't actually the case today. For example, the GNOME project, which was forced to start using this on their GitLab, found that 97% of their traffic could not complete this PoW calculation.
I.e., they now need only a fraction of the computational cost to serve their GitLab, which saves a lot of resources, coal, and most importantly, the time of hundreds of real humans.
I don't think AI companies care, and I wholeheartedly support any and all FOSS projects using PoW when serving their websites. I'd rather have that than have them go down.
You should blame the big tech giants and their callous disregard for everyone else for the Enshittification, not the folks just trying to keep their servers up.
They're working on no-js support too, but this just had to be put out without it due to the amount of AI crawler bots causing denial of service to normal users.
It only runs against the Firefox user agent. This is not great, as the user agent can easily be changed. It may work now, but tomorrow that could all change.
It doesn't measure load, so even if your website has only a few people accessing it, they will still have to do the proof of work.
The POW algorithm is not well designed and requires a lot of compute on the server, which means it could be used as a denial-of-service attack vector. It also uses SHA-256, which isn't optimized for proof-of-work-style calculations and can be brute forced pretty easily with hardware.
I don't really care for the anime cat girl thing. This is more of a personal thing, but I don't think it is appropriate.
In summary the Tor implementation is a lot better. I would love to see someone port it to the clearnet. I think this project was created by someone lacking experience which I find a bit concerning.
It doesn't run against Firefox only; it runs against whatever user agents you configure.
And also, from personal experience, I can tell you that the majority of AI crawlers have the keyword "Mozilla" in their user agent.
Yes, this isn't Cloudflare, but I'm pretty sure that's on the todo list. If not, please file an issue with the project.
The computational requirements on the server side are a tiny fraction of what the bots have to spend, literally. A non-issue. This tool combats the denial of service these bots cause by hitting high-cost endpoints, such as git blame on GitLab. My phone can do 100k sha256 sums per second (with a single thread), and you can safely assume any server outperforms this ARM chip, so you'd need so many resources to cause a denial of service that you might as well overload the server with plain traffic instead of one sha256 calculation.
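The 100k-per-second figure is easy to sanity-check yourself. A quick back-of-envelope benchmark (numbers vary wildly by machine, and interpreter loop overhead means this understates what native code achieves):

```python
import hashlib
import time

def sha256_per_second(duration: float = 0.25) -> float:
    """Rough single-threaded sha256 throughput on 64-byte inputs.
    Python loop overhead dominates, so treat this as a lower bound."""
    data = b"x" * 64
    count = 0
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        hashlib.sha256(data).digest()
        count += 1
    return count / duration
```

Even interpreted Python typically clears well over 100k hashes per second on commodity hardware, which is why the one verification hash per request on the server side is negligible.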
And this isn't really comparable to Tor. This is a self-hostable service that sits between your web server/CDN and the service being attacked by mass crawling.
Edit:
If you don't like the project's mascot, fork it and remove it. This is an open source project.
And Xe, who made this project, is quite a talented programmer. More than likely you have used some of Xe's services/sites/projects before as well.
Xe is insanely talented. If she is who I think she is, then I've watched her speak and her depth of knowledge across computer science topics is insane.
The issue is that sha256 is fairly easy to do at scale. Modern high performance hardware is well optimized for it, so you could still perform the attack with a bunch of GPUs. AI scrapers tend to have a lot of those.
Anubis is provided to the public for free in order to help advance the common good. In return, we ask (but not demand, these are words on the internet, not word of law) that you not remove the Anubis character from your deployment.
If you want to run an unbranded or white-label version of Anubis, please contact Xe to arrange a contract.
...Why? It's just telling companies they can get support + white-labeling for a fee, and asking that you keep their silly little character, in a tongue-in-cheek manner.
Just like they say, you can modify the code and remove it for free if you really want; they're not forbidding you from doing so or anything.
True, but I think you are discounting the risk that the actual god Anubis will take displeasure at such an act, potentially dooming one's real life soul.
Yeah, it seems entirely optional. It's not like manually removing the Anubis character will revoke your access to the code. However, I still do find it a bit weird that they're asking for that.
I just can't imagine most companies implementing Anubis and keeping the character or paying for the service, given that it's open source. It's just unprofessional for the first impression of a company's website to be the Anubis devs' manga OC...