To help train their AI models, Meta and other companies have been using pirated versions of copyrighted books without the consent of authors or publishers. The company behind Facebook and Instagram faces an ongoing class-action lawsuit brought by authors including Richard Kadrey, Sarah Silverman, and Christopher Golden, in which it has already scored a major (and surprising) victory: the Californian court concluded last year that using pirated books to train its Llama LLM did qualify as fair use.
You’d think this case would be as open-and-shut as it gets, but never underestimate an army of high-priced lawyers. Meta has now come up with the striking defense that uploading pirated books to strangers via BitTorrent qualifies as fair use. It goes on to claim that this is doubly good, because it has helped establish the United States’ leading position in the AI field.
Meta further argues that every author involved in the class action has admitted to being unaware of any Llama LLM output that directly reproduces content from their books. It says that if the authors cannot provide evidence of such infringing output or of damage to sales, then this lawsuit is not about protecting their books but about arguing against the training process itself (which the court has ruled is fair use).
Judge Vince Chhabria now has to decide whether to allow this defense, a decision that will have consequences not only for this case but for many other AI lawsuits involving things like shadow libraries. The BitTorrent uploading and distribution claims are the last element to be settled in this particular lawsuit, which has been rumbling on for three years now.



The difference is only in scale. Stealing is stealing, regardless of whether it’s for personal use or not.
Nothing is being stolen here. It’s just an illegal copy. Copies are made for varying reasons and have different moral aspects.
I’m using theft as an example because it’s the closest equivalent. The point still stands: if it’s wrong for a company to do it at scale, then it’s wrong when an individual does it too.
Scale is not the only difference. The companies who do this end up making money with something trained on someone else’s work. If a regular Joe Shmoe pirates a book, they don’t earn anything with it.
That’s not entirely true either. There’s no practical difference between saving 50 bucks and earning 50 bucks. In both cases you’re left with more money to spend. Piracy is equally a financial decision even if it’s just for personal use. You’re saving whatever amount it would’ve cost to buy that media.