Meta admits using pirated books to train AI, but won't pay for it
You see, if you pirate a couple textbooks in college because you don't have resources, but you want to earn your right to participate in society and not starve, it's called theft.
But if one of the top 10 companies in the world does the same with thousands of books just to get even richer, it's called fair use.
Simple, really.
This guy gets it. The laws aren't applied evenly. It's "he who has the most fuck you money wins."
Laws are to protect the haves from the have-nots.
I went to grad school in the USA. I bought the international version of a few books that were going to be used in class (I knew beforehand that the recommended readings weren't written by any faculty member at that university), but that didn't stop the professor from getting aggressive and saying that my books were banned from the classroom because they weren't the USA version. I asked the professor what the difference was between me buying a textbook for $15 instead of $200 and a Fortune 500 company outsourcing entire departments instead of hiring USA employees.
Interestingly, my books weren't an issue after that. Yes, I gambled on being publicly labeled a troublemaker in my engineering department (I was probably labeled privately among faculty members).
I hope somebody pokes that professor's nipples
The internet archive library fiasco springs to mind.
I was ready to go on a tirade about that but I think a better use of my time is to show appreciation for the excellent JoeKrogan username
My friend posted this on social media. This is an eBook textbook for one of his graduate school classes.
In case you can't read that clearly, the eBook version is $87.95. The paperback (not even hardcover textbook) version is $120.95.
Fucking insane.
From the article...
The company is preparing a fair use-based defense after using copyrighted material
Oh, NOW corporations are accepting of fair use.
why are we mad we as lemees run our own companies exactly like this 🏴☠️🦜🛶🍠🥔arg matee
I'll say this: If Meta and Facebook are prosecuted and their domains seized in the same way pirate sites are, for Meta's use of illegitimately obtained copyrighted material for profit, then I'll believe that anti-piracy laws are fair and just.
That will never happen.
We live under a two-tier "justice" system.
"There is a group the law protects but does not bind. And there is a group the law binds but does not protect."
Even if they do, I won't believe copyright beyond attribution is just, but it's unlikely to happen.
If Meta win this lawsuit, does it mean I can download some open source AI and claim that "These million 4k Blu-ray ISOs I torrented were just used to train my AI model"?
Heck, if how you use the downloaded stuff is a factor, I can claim that I just torrented those files and never looked at them. It is more believable than Meta's argument too, because, as a human, I do not have enough time to consume a million movies in my lifetime (probably, didn't do the math) unlike AIs.
But who am I kidding, I fully expect to be sued to hell and back if I were actually to do that.
You can actually be sued for piracy? Is this mostly in the United States?
The most common way for this to happen is to get sued for distributing pirated material. They go after you for the upload from your torrent. They stopped doing this about a decade ago though.
I think you can be sued in civil court for anything if someone has the time and money and can convince a lawyer to take up a case against you. For copyright infringement, you can also be criminally prosecuted in some cases.
I heard that this is a common thing in central Europe, but I would love anyone to confirm it.
If you could survive discovery and defend any other uses evident on your home devices....
But why does that strike me as really unlikely?
Oh so when I pirate something I get a legal notice in my mailbox and a strike against me but when Meta does it they get rewarded with H A L L U C I N A T I O N S
Tbh, if you get such a notice, you could also disagree with them and get a lawyer. It's just that your situation is much more clearly in breach of copyright.
Aaron Swartz was persecuted for less but since he's not a multinational corporation in cahoots with the moneyed death cult cabal he's dead
Well he did it as a human person. They're doing it as a corporation person. You can punish a human person with prison. You can only punish a corporation person with fines.
I'm not even being facetious. That's how US law works.
That's so dumb I hate it
This is why everyone should pirate everything that can be pirated.
Anything corporate produced, hell ya. The creators have already been paid out and the ones getting royalties don't need it to survive. For independent creators that depend on their work to sustain them, then it becomes a gray issue.
If you can't afford something it doesn't matter because the creators weren’t going to get money from you either way.
Once you can afford it, really enjoyed it, deeply respect the creators, you’ll want to own it legit.
Back when I was in college, paying for digital goods was a big nope, but nowadays I am the proud legal owner (user?) of much of that same content.
Hey guys, I'm sure Meta's intentions with the fediverse are pure though! Really!
Ok now spend years of your life and your money making a thing that everyone just gets for free instead of paying you for. See if you feel the same way then.
I have released thousands of photos I took under free licenses.
Piracy for me, not for thee!
Can't wait for any $$ fined to be evenly split between the editors, publishers and their lawyers.
You mean split 10/90 between the editors+publishers and lawyers?
Lawyering is hard work.
Anyone can write a book.
That's so Meta 😂
It's about time tech barons started needing food testers.
The profit margins in AI are fleeting at best. There's no point in squabbling over who's paying for what training data. Very, very soon it's all going to be free anyway.
Good to see all the lawyers moved over from Reddit.
Maybe they're just doing some pro bono work.
Given how LLMs work and how nearly everything of value is under copyright until at least the old age of the creator's grandchildren, LLMs would probably be pretty useless if they couldn't disregard copyright for their purposes.
Not that I have any sympathy for the likes of Meta and OpenAI in any of this.
ITT: A hilarious combination of people who have no clue what copyright covers and people who think providing a tool that allows a user to generate potentially copyrighted material is a violation of the aforementioned.
Google literally does this in every image search, but go off I guess...
Google does not just show a link. It scrapes the content of the page to build a search index, i.e. consumes the content. This happens without explicit permission, and in the past there was no way to opt out. Then they use this knowledge to provide search to users and incorporate ads to make money without paying the original pages. Google also started to show you these handy answers by showing some text section scraped from the page.
Like, there certainly is a similarity. And there is the difference that Google mostly feeds users to the original webpage while GenAI can replace the content.
What a cuck comment.
You sound like fun.
Even if this were not covered by copyright. Our copyright system is broken and laws can be changed. Especially if they don't correspond to what the majority sees as moral.
I agree copyright is broken because it is a mechanic of capitalism which has been breaking for a while.
Once we learn to live without the notion that people need to “earn a living” and instead move to a system of “From each according to his ability, to each according to his needs”, without the monetary incentive, the true biggest reward anyone can receive for having an idea is seeing that brilliant idea used by everyone for the improvement of everyone.
his Hawaii compound could be drone grief-ed instead; if coercion is the tools of the 21st century let us the collective take them back.
cover over his abode with 100000 drones overhead
make it a problem he can't ignore away with money and friends
ruin his fun on a collective scale.
You all do a good job at becoming like those wanking armchair experts on Reddit. Keep it up suckers.
Okay, that escalated quickly..
There’s a little Jolly Roger in all of us, isn’t there?
🎵 When I want something, man, I don't wanna pay for it, now I walk right through the door. 🎵
I love how everybody here goes from "yay piracy" and "screw copyright" to "I can't believe they violated copyright laws" the second it's somebody they dislike.
When you compare the attitudes on this and compare them to how people treated The Pirate Bay, it becomes pretty fucking clear that we live in a society with an entirely different set of rules for established corporations.
The main reason they were able to prosecute TPB admins was the claim they were making money. Arguably, they made very little, but the copyright cabal tried to prove that they were making just oodles of money off of piracy.
Meta knew that these files were pirated. Everyone did. The page where you could download Books3 literally referenced Bibliotik, the private torrent tracker where they were all downloaded. Bibliotik also provides tools to strip DRM from ebooks, something that is a DMCA violation.
They knew full well the provenance of this data, and they didn't give a flying fuck. They are making money off of what they've done with the data. How are we so willing to let Meta get away with this while we were literally willing to let US lawyers turn Swedish law upside-down to prosecute a bunch of fucking nerds with hardly any money? Probably because money.
Trump wasn't wrong, when you're famous enough, they let you do it.
Fuck this sick broken fucking system.
I think in the Darknet Diaries episode about TPB, the guy said they never even made enough off of ads to pay for the server costs.
He also said as much in their documentary TPB AFK.
Maybe the issue was they didn't make enough money? If they had truly been greedy bastards they could have used that money to win the court case? What a joke.
They're the same issue tho. Piracy and using books for corporate AI training both should be fine. The same people going after data freedom are pushing this AI drama too. There's too much money in copyright holding and it's not being held by your favorite DeviantArt artists.
It's not the same issue at all.
Piracy distributes power. It allows disenfranchised or marginalized people to access information and participate in culture, no matter where they live or how much money they have. It subverts a top-down read-only culture by enabling read-write access for anyone.
Large-scale computing services like these so-called AIs consolidate power. They displace access to the original information and the headwaters of culture. They are for-profit services, tuned to the interests of specific American companies. They suppress read-write channels between author and audience.
One gives power to the people. One gives power to 5 massive corporations.
So why are Meta and, say, Sci-Hub treated so differently? I don't necessarily disagree, but it's interesting that we legally attack people who are sharing data altruistically (Sci-Hub gives research away for free so more research can be done; scientific research should be free to the world, because it benefits all of mankind), but when it comes to companies who break the same laws to just make more money, that's fine somehow.
It's like trying to improve the world is punished, and being a selfish greedy fucking pig is celebrated and rewarded.
Sci-Hub is so vilified, it can be blocked at an ISP level (depending on where you live) and politicians are pushing for DNS-level blocking. Similar can be said for Libgen or Anna's Archive. Is anything like that happening to Meta? No? Huh, interesting. I wonder why Meta gets different treatment for similar behavior.
I am willing to defend Meta's use of this kind of data after the world has changed how they treat entities like Sci-Hub. Until that changes, all you are advocating for is for corporations to be able to break the law and for altruistic people to be punished. I agree they're the same, but until the law treats them the same, you're just giving freebies to giant corporations while fucking yourself in the ass.
Perhaps I'm misunderstanding, but it sounds like you're suggesting we side with Meta to set a precedent under which pirating content is legal, allowing websites like TPB to keep existing, but legitimately? Or are you rather taking the opposite stand, which would further entrench the illegality of TPB's activities and in the same swoop prevent Meta from performing these actions?
I don't know if there's a way to simultaneously oppose Meta while protecting TPB, is there?
I'm advocating that if we're going to have copyright laws (or laws in general) that they're applied consistently and not just siding with who has the most money.
When it's small artists needing their copyright to be defended? They're crushed, ignored, and lose their copyright.
Even when Sony was suing individuals for music piracy in the early 2000s, artists had to sue Sony to see any money from those lawsuits. Those lawsuits were ostensibly brought by Sony for the artists, because the artists were being stolen from. Interesting that none of that money made it to artists without the artists having to sue Sony.
Sony was also behind the rootkit disaster and has been sued many times for using unlicensed music in their films.
It is well documented that copyright owners constantly break copyright to make money, and because they have so much fucking money, it's easy for them to just weather the lawsuits. ("If the penalty for a crime is a fine, that law only exists for the lower classes.")
We literally brought US courtroom tactics to a foreign country and bought one of their judges to get The Pirate Bay case out the fucking door. It was corruption through and through.
We prosecute people who can't afford to defend themselves, and we just let those who have tons of money do whatever the fuck they want.
The entire legal system is a joke of "who has the most money wins" and this is just one of many symptoms of it.
It certainly feels like the laws don't matter. We're willing to put down people just trying to share information, but people trying to profit off of it insanely, nah that's fine.
I'm just asking for things to be applied evenly and realistically. Because right now corporations just make up their own fucking rules as they go along, stealing from the commons and claiming it was always theirs. While individuals just trying to share are treated like fucking villains.
Look at how they treat Meta versus how they treat Sci-Hub. Sci-Hub exists only to promote and improve science by giving people access to scientific data. The entire copyright world is trying to fucking destroy them, and take them offline. But Facebook pirating to make money? Totes fucking okay! If it's selfish, it's fine, if it's selfless, sue the fuck out of them!
I think what they are saying is that Meta is powerful enough to get away with it. You are attempting to equate two different things.
Meta isn't using the books for entertainment purposes. They are using another IP to develop their own product. There has to be a distinction here.
Cool, so I can train my AI on Facebook and Instagram posts and you're fine if I don't consent, credit or compensate you either, right Meta? It's not even copyrighted in the first place, so you shouldn't have a single complaint.