Meta admits using pirated books to train AI, but won't pay for it - eviltoast
  • General_Effort@lemmy.world
    10 months ago

    You need to read this carefully. It’s a statute. It means exactly what it says.

    purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research

    Such as means that these are examples. This is not a complete list.

    the factors to be considered shall include

    All of these factors must be considered. It does not mean that other factors cannot be considered. These are not categories.

    A commercial purpose does not rule out a finding of fair use (and vice versa). It must be considered and that is all.

    I don’t think that Meta’s use can be classed as commercial. Presumably, they do hope that the research budget will pay off eventually. But what must be considered is the particular copying in question. Llama 2’s license looks to me fairly non-commercial.


    Eventually, fair use derives from the constitution. Copyright is a limitation on the freedom of the press (and of speech). But it cannot completely do away with these freedoms. The examples given in the statute here could not be banned completely even if they were not mentioned.

    The US Constitution itself allows Congress to create copyrights. Or more precisely, it empowers Congress to promote the Progress of Science and useful Arts by creating copyrights. That’s another limitation.

    I’ve seen a number of far-right commenters admit that this money grab would harm AI development (a “useful Art”). I think mostly these commenters hold some far-right ideology à la Ayn Rand that values property over society, but some may just be selfish and believe that they would personally benefit. Either way, it’s straight up anti-constitutional.

    • wikibot@lemmy.worldB
      10 months ago

      Here’s the summary for the wikipedia article you mentioned in your comment:

      The Copyright Clause (also known as the Intellectual Property Clause, Copyright and Patent Clause, or the Progress Clause) describes an enumerated power listed in the United States Constitution (Article I, Section 8, Clause 8). The clause, which is the basis of copyright and patent laws in the United States, states that: [the United States Congress shall have power] To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.


    • TWeaK@lemm.ee
      10 months ago

      Such as means that these are examples. This is not a complete list.

      AI developers have explicitly invoked the research exemption. That is why I focused on that. I disagree that what they do is “research” for the reasons I gave previously. Bringing up the fact there are other exemptions is beside the point - they aren’t claiming any other exemption!

      All of these factors must be considered. It does not mean that other factors cannot be considered. These are not categories.

      Sure, but I never said that commerciality was the only thing that should be considered. My claim here is simply that it is so overwhelmingly commercial in nature that it overrides anything else and thus they should not be awarded the privilege of an exemption.

      A commercial purpose does not rule out a finding of fair use (and vice versa).

      A commercial purpose might not rule out of a finding of fair use. That does not mean it cannot rule out such a finding. All factors must be considered, but any one factor can outweigh the others.

      I never said it was an exclusive category, I just brought it up as the most significant factor - one which is not reasonably overruled by any of the others in this circumstance. In fact, every one of those arguably fails. To give detail:

      1. The copying is done in a commercial nature. They sell AI services. It’s offered very cheap right now - even for free for limited personal use - but eventually that will change as their demand for profit grows.
      2. The nature of the copied work is varied and includes all kinds of work, commercial and non-commercial. The copying is pandemic.
      3. The whole work has been copied into the training database. Significant portions of the work can be, and have been, reproduced by the finished product, in spite of the finished product allegedly not containing the original work in its database. Furthermore, even if a human genuinely believes they aren’t copying something they read before, that does not mean they are innocent of copyright infringement - it is the similarity of the two works that is the determining factor.
      4. AI work is already flooding the market and pushing out original creators. Children’s books are one area where this is happening extensively - not only does this make it harder for genuine authors to get a break in the market, but they’re effectively training children to think AI work is normal. It’s not hard to see us headed to a future where people think AI is “real” and original work is “fake”, simply by volume.

      I will admit, not all of those arguments are very strong (particularly 4.). However 1. is the strongest and I think overrides any argument the other way for any other.

      I don’t think that Meta’s use can be classed as commercial. Presumably, they do hope that the research budget will pay off eventually.

      Those two statements contradict one another. Of course they want it to be commercial eventually - or, rather, they want to eventually turn a profit. Hell, AI is already being used in a commercial manner: if you want to make significant or non-personal use of AI systems currently on the market, you have to pay for it.

      Eventually, fair use derives from the constitution.

      Setting aside the fact that AI extends far beyond the borders of the US and its constitution, fair use and copyright are derived from copyright law, which is written by Congress. The Constitution grants Congress the right to write such laws, but no one is “invoking the Constitution” when they enforce copyright or claim fair use. The Constitution gives permission, but the law forms the definition.

      AI is not simply a “useful Art”. It is a commercial venture that exploits original work without duly compensating the authors of said work. Congress has a greater duty to protect those original authors than it does a business that seeks to exploit their work. I say this as someone who has never really made much of anything original myself. I play a bit of music, but don’t compose and just do covers. I probably (lol limewire definitely) infringe on copyright - but I do so exclusively in a non-commercial manner.

      Blurting out “far-right” is borderline a personal insult - one which is laughably far from the mark when addressed towards me - and points to you clutching at straws to cling to a frivolous argument.


      I now feel the need to ask, why do you so passionately defend AI businesses here? Why do you support them?

      Are you that infatuated with the novelty of their product that you have let go of objectivity?


      I also have to emphasise again that I’m a little disgusted that you made this political. You’ve tried to build an argument that “it is a Constitutional right” to infringe copyright in order to have AI tools, and you’re implying that anyone who opposes that idea is some kind of far-right nutjob. I hadn’t even heard of Ayn Rand before you mentioned her, but have you actually read her work, or did you just watch the Atlas Shrugged movie and form your opinions from internet memes?

      I’d actually probably agree with you about AI - if it was non-commercial in nature and truly for the benefit of the people. As it is, I think you are blinded by the sheen of a new toy, without realising it’s coated in lead paint.

      • TWeaK@lemm.ee
        10 months ago

        A commercial purpose might not rule out of a finding of fair use.

        ARRRRG I spent so long reviewing this comment, over and over and over again, and still there were words wrong. I’m not editing it though, I want the comment to stay clean.