Thousands of authors demand payment from AI companies for use of copyrighted works::Thousands of published authors are requesting payment from tech companies for the use of their copyrighted works in training artificial intelligence tools, marking the latest intellectual property critique to target AI development.
I completely fail to see how it wouldn’t be considered a transformative work.
It fails the transcendence criterion. Transformative works go beyond the original purpose of their source material to produce a whole new category of thing or benefit that would otherwise not be available.
Taking 1000 fan paintings of Sauron and using them in combination to create 1 new painting of Sauron in no way transcends the original purpose of the source material. The AI painting of Sauron isn’t some new and different thing. It’s an entirely mechanical iteration on its input material. In fact the derived work competes directly with the source material which should show that it’s not transcendent.
We can disagree on this and still agree that it’s debatable and should be decided in court. The person above that I’m responding to just wants to say “bah!” and dismiss the whole thing. If we can litigate the issue right here, a bar I believe this thread has already met, then judges and lawmakers should litigate it in our institutions. After all, the potential scale of this far-reaching issue is enormous. I think it’s incredibly irresponsible to say “feh, nothing new here, move on.”
I do think you have a point here, but I don’t agree with the example. If a fan creates the 1001st fan painting after looking at the others, it might be quite similar to them if the fan lacks the artistic skill to express a unique view. It also competes with its source, yet it’s generally accepted.
Being able to dialog with a book, even to the point of asking the AI to “take on the persona of a character in the book” and sustain an ongoing conversation, is substantively a transcendent version of the original. That one can, as a small subset of that transformed version, get quotes from the original work feels like a small part of this new work.
If this had been released for a single work, like “here is a Star Wars AI that can take on the persona of Star Wars characters and answer questions about the Star Wars universe,” I think it’s more likely that the position I’m taking here would lose the debate. But this is transformative against the entire set of prior material from books, movies, film, debate, art, science, philosophy etc. It merges and combines all of that. I think the sheer scope of this new thing supports the idea that it’s truly transformative.
A possible compromise would be to tax AI and use the proceeds to fund a UBI initiative. True, we’d get to argue about whether high-profile authors with IP that catches the public’s attention should get more than just a blogger or a random online contributor, but the basic idea is that AI is trained on, and succeeds by standing on the shoulders of, all people. So all people should get some benefit.
Typically the argument has been “a robot can’t make transformative works because it’s a robot.” People think our brains are special when in reality they are just really lossy.
Even if you buy that premise, the output of the robot is only superficially similar to the work it was trained on, so there is no copyright infringement there. The training process itself is done by humans, and it takes some tortured logic to deny the technology’s transformative nature.
Oh, I think those people are wrong, but we tend to get laws based on people who don’t understand a topic deciding how it should work.
Go ask ChatGPT for the lyrics of a song and then tell me that’s transformative work when it outputs the exact lyrics.
Go ask a human for the lyrics of a song and then tell me that’s transformative work.
Oh wait, no one would say that. This is why the discussion with non-technical people goes into the weeds.
Because it would be totally clear to anyone that reciting the lyrics of a song is not a transformative work, but instead covered by copyright.
The only reason why you can legally do it is that you are not big enough to be worth suing.
Try singing a copyrighted song on TV.
For example, until it became clear that Warner/Chappell didn’t actually own the rights to “Happy Birthday To You”, they’d sue anyone who sang that song in any kind of broadcast or other big public thing.
Quote from Wikipedia:
So if a human isn’t allowed to reproduce copyrighted works in a commercial fashion, what would make you think that a computer reproducing copyrighted works would be ok?
And regarding derivative works:
Check out Vanilla Ice vs Queen. Vanilla Ice just used 7 notes from the Queen song “Under Pressure” in his song “Ice Ice Baby”.
That was enough that he had to pay royalties for that.
So if a human has to pay for “borrowing” seven notes from a copyrighted work, why would a computer not have to?
The key there is anyone profiting from the copyrighted work. I’ve been to big public events where they have sung Happy Birthday, events that may very well have been recorded, but none of us were sued because there were no damages, no profits lost.
The other big question is what these lawsuits are basing their complaint on. If I understand it, the Sarah Silverman claim is that she could go into ChatGPT and ask it for pages from her book and it generated them. Never once have I used ChatGPT and had it generate pages from her book, so the question is: what is the difference between my experience and hers? The difference is that she asked for that material. This may seem trivial, but on the basis of how the technology works it’s important.
You can go through their LLM and nowhere will you find her book. Nowhere will you find pages of her book. Nowhere will you find encoded or encrypted versions of her book. Rather, you’ll find a data model with values showing the probability of a text output for given prompts. The model sometimes generates valid responses and sometimes it gives wrong answers. Why? Because it’s a language model and not a library of text.
So the question now becomes: what is it the content creators are upset about? The fact that they asked it to generate content that turned out to match their own, or that their content was used to teach the LLM? Because in no case is there a computer somewhere that has their text existing verbatim, waiting to be displayed. If it’s about the output, then I’d want to know how this is different from singing Happy Birthday. If I’m prompting the AI and there are no damages, and I don’t use it for any financial gain, I’m not seeing an issue.
Well, they’re fixing that now. I just asked ChatGPT to tell me the lyrics to Stairway to Heaven and it replied with a brief description of who wrote it and when, then said “here are the lyrics:”. It stopped 3 words into the lyrics.
In theory as long as it isn’t outputting the exact copyrighted material, then all output should be fair use. The fact that it has knowledge of the entire copyrighted material isn’t that different from a human having read it, assuming it was read legally.
This feels like a solution to a non-problem. When a person asks the AI “give me X copyrighted text”, no one should be expecting this to be a new work. Why is asking ChatGPT for lyrics bad while asking a human is OK?
It’s not legal for anyone (human or not) to put song lyrics online without permission or a license to do so. I was googling this to make sure I understood it correctly, and it seems that reproducing the lyrics to music without permission to do so is copyright infringement. There are some lyrics websites that work with music companies to get licensing to post lyrics, but most websites host them illegally and will take them down if they receive a DMCA request.
Wait wait wait. That is not a good description of what is happening. If you and I are in a chat room and you ask me for the lyrics, my verbalization of them isn’t an issue. The fact that it is online just means the method of communication is different, but that should be covered under fair use.
The AI is taking prompts and providing the output as a dialog. It’s literally a language model, so there is a process of synthesizing your question and generating your output. I think that’s something people either don’t understand or completely ignore. It’s not as if there are entire books stored verbatim as a contiguous block of data. The data is processed and synthesized into a language model that then generates an output that happens to match the requested text.
This is why we can’t look at the output the same way we look at static text. In theory, if you kept training it in a way that opposed the statistical nature of your book or lyrics, you could eventually get to the point where asking the AI to generate your text would give a non-verbatim answer.
I get that this feels like semantics but creating laws that don’t understand the technology means we end up screwing ourselves over.
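For anyone who wants to see the “probabilities, not stored text” point concretely, here is a toy sketch in Python. It is a tiny bigram model and nothing like a real LLM’s architecture; the names `train_bigram` and `generate` and both sample sentences are made up for illustration. All it keeps is a table of next-word probabilities, yet when the training text is distinctive enough, greedy generation still walks back out the original word for word. Whether that counts as “storing” the text is exactly what the thread is arguing about.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count which word follows which and return next-word probabilities."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    # Only probabilities are kept; the original string is not stored.
    return {
        prev: {w: c / sum(nxts.values()) for w, c in nxts.items()}
        for prev, nxts in counts.items()
    }


def generate(model, start, max_words=10):
    """Greedy decoding: always pick the most probable next word."""
    out = [start]
    while len(out) < max_words and out[-1] in model:
        options = model[out[-1]]
        out.append(max(options, key=options.get))
    return " ".join(out)


# Hypothetical training text with repeated words: output is a blend.
mixed = train_bigram("the cat sat on the mat and the cat slept")
blended = generate(mixed, "sat", max_words=4)

# Hypothetical training text where every word is unique: output is verbatim.
unique_text = "a quick brown fox jumps over lazy dogs"
unique = train_bigram(unique_text)
recited = generate(unique, "a", max_words=8)
```

With the first sentence, `generate` recombines fragments (“sat on the cat” never appears in the training text). With the second, the same code reproduces the training sentence exactly, even though only a probability table was kept.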
I get how LLMs work and I think they’re really cool. I’m just trying to explain why OpenAI is currently limiting these abilities to continue operating within our legal system. Hopefully the court cases find that there is in fact a difference between publishing the information on a normal website versus discussing it with a chatbot so that they don’t have to be limited like this.
Publishing lyrics publicly online is illegal while communicating them privately in a chatroom is probably fine. Communicating them in a public forum is a grey area, but you likely won’t be caught or prosecuted. If a big company hosts an AI chatbot which can tell you the lyrics to any song on demand, then that seems like an illegal service unless they have the rights.
Feel free to look up the legality of publishing lyrics online. All I saw was information saying that it is illegal, but that nobody gets prosecuted except the larger companies.
Try it again and when it stops after a few words, just say “continue”. Do that a few times and it will spit out the whole lyrics.
It’s also a copyright violation if a human reproduces memorized copyrighted material in a commercial setting.
If, for example, I give a concert and play all of Nirvana’s songs without a license to do so, I am still violating the copyright even if I totally memorized all the lyrics and the sheet music.
Transformativeness is only part of the first of the four fair use factors. Being transformative can’t alone make something fair use.
Even if AI is transformative, it would likely fail on the third factor. Fair use favors taking no more of the copyrighted work than is needed, and AI companies scrape as much data as possible to train their models. That is very unlikely to support a finding of fair use.
The final factor is market impact. Generative AIs are built to mimic the creative outputs of human authorship, so by design AI acts as a market replacement for human authorship, and it would likely fail on this factor as well.
Regardless, trained AI models are unlikely to be copyrightable. Copyrights require human authorship which is why AI and animal generated art are not copyrightable.
A trained AI model is a piece of software so it should be protectable by patents because it is functional rather than expressive. But a patent requires you to describe how it works, so you can’t do that with AI. And a trained AI model is self-generated from training data, so there’s no human authorship even if trained AI models were copyrightable.
Exactly which laws apply to AI models is unclear, and it will likely be determined by court cases.