Book Copyright Settlement against AI for 1.5 Billion
ArtAngel
Posts: 1,942
A 1.5 billion dollar book copyright settlement was preliminarily approved in September 2025. Anthropic, the maker of the Claude chatbot, was nailed with a class action suit by authors and publishers because it illegally used millions of books to train AI models. The books were obtained from pirate sites. If more books are identified, the amount could increase. The court order includes a court-supervised mass destruction of datasets. Now they are facing lawsuits from music publishers. Love to know your thoughts and how you think this may (or may not) affect AI Art, but please do not make it a heated debate. Anyone feel it's a 'Napster' moment?
Here is the link to the CNBC news story

Comments
Many thanks for bringing this to our attention, also for the link! As an author (both fiction and non-fiction), it does sound fantastic to me, a loooong overdue step in the right direction. But I reckon we'll have to wait for the actual judgment / order, and whether there'll be any appeals and such, especially in this weird political climate. Also: it's great that the datasets have to be destroyed, but the training that was done on their basis cannot be undone. That's probably what the creators of the AIs bet on: that they might have to pay hefty fines, but they're making billions and billions off of the illegal materials anyway.
It's a settlement, so it never went to court proceedings, so there will be no judgment, and there will be no appeals. It's done. And since it didn't go to court, there's no precedent created, which means it doesn't mean much for future cases and likely won't affect anything much, especially since the suit wasn't about training but pirating (the training suit had already been lost in this case and, if I recall, a few others). The books in question were on a popular pirate site and were downloaded from it. It's much harder to mount such a case against image generation, where probably even fewer artists register their copyright, and images in some form are usually shared on the net.
1.5 billion might sound nice, but in reality, most authors won't see any money, because you have to have your copyright registered in the US and for some authors, their publishers failed to do so, while others, in the US and outside it, simply didn't do it for various reasons.
In general, the mindset in author groups is that it isn't going to change much, and it's going to be business as usual. Personally, though, I do hope that it will make some ripples and the whole generative AI industry crashes and burns.
So who's getting the money "most authors won't see"?
The rest of the authors (and the lawyers, I suppose). By "most authors won't see any money" I don't mean "they're owed money according to this settlement and they aren't getting it" but "the settlement doesn't include them even though their books were pirated by Anthropic." So "most authors" in general, not "most authors included in this settlement." The ones included in the settlement will get their money, eventually, and after the relevant fees.
In other words, the amount is supposed to cover the estimated losses of the group which sued them only.
No, it was classified by the judge as a class action suit, otherwise the amount would have been much lower. The problem was that the "class" was defined so narrowly that it not only pretty much excluded all authors outside the US but also many authors in the US, both indie and traditional.
It's a good step, I suppose, and I saw somewhere that the UK and Canada are starting their own lawsuits against Anthropic. I was just commenting on the "1.5 billion" sum that might be flashy, but doesn't really translate into compensating all the authors hurt by Anthropic profiteering from their works, and even those included will get no more than $3000 per book pirated (I think that is the sum before the fees, etc., so in reality, it'll likely be lower; I don't know the details as I'm not part of the class. My books are copyrighted and were pirated, but not in the time frame that Anthropic allegedly downloaded them).
@joanna, thanks for clarifying, I was in such a hurry this morning that I just didn't read the OP properly and missed that it was about a settlement. Am a bit sad now since you said the case against the actual training wasn't successful. Still, if something can be remedied via the pirating detour, that's not for nothing I reckon. I mean, where did the AI trainers usually download all their material? Don't think they bought it from amazon. But I'm not on top of the current goings-on in this area at all atm.
Unfortunately, just because we don't know "where," it doesn't necessarily mean piracy. While no company came forth about the "where," here's what's known about their materials: the "whole internet," including websites, forums (like reddit or coding forums), any news outlet that isn't behind a paywall, and some archive websites that were hit hard by scraping bots too (I don't recall the specific website names, but the ones that contain historical online archives). Some people even shared screenshots from private discords suggesting that there were discussions on getting behind various paywalls too. But none of it is piracy, so it could only be argued in court as copyright infringement, and so far, to my knowledge, not a single copyright infringement case over training has been won, so I'm not holding my breath here. There's also not enough evidence to go to court over piracy: no one has any proof that any piracy happened in other cases. In the case of Anthropic, if I recall correctly, it was their own leaked internal discussions, in which employees discussed pirating the books, and the following investigation by The Atlantic that led to the court case.
So, from my perspective, there isn't any hope for a real change in that matter. Of course, I'm happy that it was brought to light, and I'm happy that the pirating party has to pay, and I'm also happy for all those authors who will actually see some money from it, and maybe some companies will think twice, but other than that, since it wasn't a court case but a settlement, there isn't even a precedent set. On the bright side, there seem to be more lawsuits related to AI underway, so who knows? Maybe something will come out of it.
Too much money involved. That payout is a pittance on an endeavor with hundreds of billions of dollars behind it and trillion dollar corporations involved. The AI companies will keep winning until their bubble bursts on Wall Street, if that actually happens. And if it does happen, a lot of folks will be crying for their lost retirement nest eggs.
The class action lawsuit was heard in California, and the class included any author, traditional or self-published, provided s/he had a copyright registered in the US within certain timeframes. An author/creator cannot sue for infringement without first registering a copyright prior to the lawsuit.
Re: The settlement - Authors with US copyrights have the right to opt out and pursue a different or new lawsuit.
The Sept 25th approval was preliminary. And Pandora's box has been opened.
This might bear some fruit. But way too early to celebrate.