- cross-posted to:
- fiction@literature.cafe
- hackernews@derp.foo
- technews@radiation.party
Stephen King: My Books Were Used to Train AI::One prominent author responds to the revelation that his writing is being used to coach artificial intelligence.
By that definition of copying, Google is infringing on millions of copyrights through its search engine, and anyone viewing a copyrighted work online is also making an unauthorized copy. These companies are using data from public sources that others have access to. They are doing no more copying than a normal user viewing a webpage.
I don’t think so. Your comparisons aren’t really relevant. If Google inadvertently scrapes a page containing copyrighted material and serves it to a user, there are mechanisms to get that content taken down, or Google faces a lawsuit. Try posting a movie on YouTube: if a copyright holder notifies Google, that content will be taken down.
Training an LLM is different: that material was used to help build the model and is now part of that product. That creates a legal liability.