A lot of the griping about AI training involves data that’s been freely published. Stable Diffusion, for example, was trained on public images available on the internet for anyone to view, yet that led to all manner of ill-informed public outrage. LLMs train on public forums and news sites. But people have this notion that copyright gives them some kind of absolute control over the stuff they “own,” and they suddenly see a way to demand a pound of flesh for what they previously posted in public. It’s just not so.
I have the right to analyze what I see. I strongly oppose any move to restrict that right.
It’s also pretty clear they used a lot of books and other material they didn’t pay for and obtained via illegal downloads. I’m fine with that practice; I just want it legalised for everyone.
I’m wondering: when I go to the library and read a book, does this mean I can never become an author because I’m tainted? Or am I only tainted if I stole the book?
To me this is only a theft case.
That’s the whole problem with AI and artists complaining about theft. You can’t draw a meaningful distinction between what people do and what the AI is doing.
I think that is a very important observation. People want to gloss over it when it might be the most important thing to talk about.
And what of the massive amount of paywalled content that AI still used to train?
If it’s paywalled how did they access it?
By piracy.
https://arstechnica.com/tech-policy/2025/02/meta-defends-its-vast-book-torrenting-were-just-a-leech-no-proof-of-seeding/
You are dull. Very dull. There is no shortage of ways to pirate content on the internet, including torrents. And they wasted no time doing so.