- cross-posted to:
- technology@lemmy.world
Sarah Silverman, Christopher Golden, and Richard Kadrey are suing OpenAI and Meta over violation of their copyrighted books. The trio says their works were pulled from illegal “shadow libraries” without their consent.
It’s simple: if you have to pay “copyright holders” for everything you train your AI on, there can be no AI training. Models need to ingest all the data they can to get better, and it would cost tens of billions if you had to pay for every single piece of content. So we have to pick between a future in which “copyright holders” fight to get their $ or a future where we can push AI to enter a new era.
I really don’t think it should be considered copyright infringement to simply ingest data. It doesn’t infringe on copyright for a person to read a book, why should it matter if it’s a simple program or an AI doing it?
That said, if the AI produces something in the exact words or style of a creator without attribution, just like with a person, then it should count as copyright infringement.
It’s all about perceived harm. A creator is not harmed by an AI reading their works. But they are harmed when the AI can produce their style to potentially take business away from them.
If it isn’t copyright infringement to read a book and apply those ideas to make a product (which it isn’t), then it isn’t copyright infringement to train an LLM with the info in a piece of media.
Pretty cut and dried.
But if you pirated a book to read it, then applied those ideas to make a product, you still committed a crime.
Good. Artists should get paid extra for AIs being trained on their stuff. Doing it for free is our job.
Should those artists pay the other artists they studied?
in design school, I had to pay for the books I bought which contained the images of the art. whoever owns those images got paid for the license to appear in the book. when I go to museums, I have to pay (by admission price or by the tax dollars that go into paying for the museum’s endowment), and that pays for the paintings/sculptures/etc.
whenever I saw or see art, in one context or another, there’s some compensatory arrangement (or it’s being/has been donated— in which case, it’s tax-deductible).
edit: then again, my work is not a remixed amalgam of all of the prior art I consumed; unlike AI, I am capable of creating new unique works which do not contain any of the elements of original works I may have seen or learned from previously. I am able to deconstruct, analyze, and implement nuanced constructs such as style, method, technique, and tone, and also develop my own in the creation of an original work, without relying on the assimilation and reuse of other original works in part or whole. AI cannot.
for this reason I find this a flawed premise— comparing what an artist does to what LLMs or AI do is logically flawed because they aren’t the same thing. LLMs can only ever create derivative works, whereas human artists are capable of creating truly original works.
Everything is a remix. Including all the work you’ve ever done. And everything I’ve ever done. Nothing is wholly original.
But it is partially original. With AI nothing is original.
No. This fundamentally misrepresents the AI models.
No it doesn’t.
AI doesn’t generate anything new. It uses mathematical models to rearrange and regurgitate what it’s already been given. There’s no creation, there’s nothing original. It’s simply stats.
Again, not true.
Absolutely.
Unworkable copyright maximalist take that wouldn’t help artists but would further entrench corporate IP holders.
You want to try explaining how, or is throwing out bare claims all you’ve got?
What, explain why “artists should pay artists that they study” is an unworkable copyright maximalist take? No, that’s self evident. How it won’t actually help artists, but would further entrench the corporate IP hoarders? No, I won’t do that either. It’s self evident. If your position is literally that artists should pay the artists that inspire them and that they study, you’re a deeply unserious person whose position doesn’t deserve to be seriously debated.
Uh huh. So you don’t actually want to discuss, you just want to be insulting and shut down conversation?
No it’s just a nonsense suggestion.
At least you’re consistent!
I find that a little bit of a specious argument actually. An LLM is not a person, it is itself a commercial derivative. Because it is created for profit and capable of outproducing any human by many orders of magnitude, I think comparing it to human training is a little simplistic and deceptive.
Are you serious.
Yes, quite. Why wouldn’t I be?
But there’s no evidence, in this case anyway, that it was trained using the entire book(s). Multiple summaries of the author’s works are available on various sites in the public domain, and GPT is capable of amalgamating all of them and summarizing it.
Now if you asked it to reproduce an entire book, or say some random non-free chapter or excerpt exactly word-for-word, that would be an issue, but so far I haven’t seen any evidence that it was able to do so.
That’ll come out during the case. I assume they have evidence, otherwise suing would be a waste of time. Unless some lawyer is taking them for a ride.
You only need 51% certainty to win in civil court, though, so maybe they think they can just argue it? Still, I’d want some sound evidence before going to court. Unless it’s just a SLAPP-style suit, but that doesn’t really fit.
That’s an incredibly bold assertion.
Do you never make those?
The only good outcome is if copyright is asymmetrical and unfair to big companies. It destroys human culture if Disney sues everybody every time they hum 2 seconds of a cartoon song. It also destroys human culture if every time somebody posts something for free on the internet a deranged billionaire pops up and gloats about how he’s going to bury your post at the bottom of Google, copy your answer into his database, and use it to scam $100/month out of everybody you were trying to help for free.
Sarah Silverman is going to lose a suit. News at 11. Scraping is protected. This is settled law.
If you read the article, it called out that this is not protected by law. They are claiming open ai got access to her books and works through sites that had illegally obtained it.
This is not covered by previous rulings around scraping.