Authors Are Furious After Finding Their Works on List of Books Used To Train AI - eviltoast

Authors using a new tool to search a list of 183,000 books used to train AI are furious to find their works on the list.

  • adriaan@sh.itjust.works
    English
    arrow-up
    60
    arrow-down
    22
    ·
    1 year ago

    That would be a much better comparison if it were artificial intelligence, but these are just reinforcement learning models. They do not get inspired.

    • Shurimal@kbin.social
      arrow-up
      36
      arrow-down
      28
      ·
      1 year ago

      just reinforcement learning models

      …like the naturally occurring neural networks are.

      • Khalic@kbin.social
        arrow-up
        47
        arrow-down
        15
        ·
        1 year ago

        The brain does not work the way you think… (I work in the field, bio-informatics). What you call “neural networks” come from an early misunderstanding of how the brain stores information. It’s a LOT more complicated and frankly, barely understood.

        • FaceDeer@kbin.social
          arrow-up
          9
          arrow-down
          5
          ·
          1 year ago

          It’s a LOT more complicated and frankly, barely understood.

          Yet you confidently state that the brain doesn’t work the way LLMs do?

          Obviously it doesn’t work exactly the same way that LLMs do, if only because of the completely different substrates. But when you get to more nebulous concepts like “creativity” and “inspiration” it’s not so clear.

          • lloram239@feddit.de
            English
            arrow-up
            5
            ·
            1 year ago

            The part where the brain and a neural net differ is the learning via backpropagation, which seems to be done differently in the brain, as there is no known mechanism there to go backwards through the network and jiggle the weights.

            That aside, they seem to work very similarly once they are trained, as the knowledge they are able to extract from data ends up being basically the same as what a human would be able to extract. There is surprisingly little weirdness in AI and a surprising amount of human-like capability.
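
For anyone unsure what "go backwards through the network and jiggle the weights" means in practice, here is a minimal sketch. It is a toy, not how any real LLM is trained: the two-weight network, the `tanh` activation, the squared-error loss, the starting weights and the learning rate are all assumptions picked for illustration.

```python
import math

def forward(w1, w2, x):
    """Toy network (illustrative assumption): y = w2 * tanh(w1 * x)."""
    h = math.tanh(w1 * x)      # hidden activation
    return h, w2 * h           # hidden value and network output

def loss(w1, w2, x, target):
    """Squared-error loss, 0.5 * (y - target)^2."""
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2

def backprop_step(w1, w2, x, target, lr=0.1):
    """One gradient-descent update computed with the chain rule."""
    h, y = forward(w1, w2, x)
    err = y - target                       # dLoss/dy
    grad_w2 = err * h                      # gradient for the output weight
    grad_h = err * w2                      # error passed backwards to the hidden unit
    grad_w1 = grad_h * (1 - h * h) * x     # tanh'(z) = 1 - tanh(z)^2
    # "Jiggle the weights": move each one a little against its gradient.
    return w1 - lr * grad_w1, w2 - lr * grad_w2

w1, w2 = 0.5, -0.3        # arbitrary starting weights
x, target = 1.0, 0.8      # a single made-up training example
before = loss(w1, w2, x, target)
for _ in range(200):
    w1, w2 = backprop_step(w1, w2, x, target)
after = loss(w1, w2, x, target)
print(f"loss before: {before:.4f}, loss after: {after:.6f}")
```

The point is only that the error at the output is pushed backwards to compute a correction for every weight, which is the step with no obvious counterpart in biological neurons.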

      • lemmyvore@feddit.nl
        English
        arrow-up
        28
        arrow-down
        18
        ·
        1 year ago

        Tell you what, you get a landmark legal decision classifying LLMs as people and then we’ll talk.

        Until then it’s software being fed content in a way not permitted by that content’s license, i.e. the makers of that software committing copyright infringement.

          • sab@lemmy.world
            English
            arrow-up
            21
            arrow-down
            8
            ·
            1 year ago

            Using it to (create a tool to) create derivatives of the work on a massive scale.

            • FaceDeer@kbin.social
              arrow-up
              11
              arrow-down
              3
              ·
              edit-2
              1 year ago

              An AI model is not a derivative work. It does not contain the copyrighted expression, just information about the copyrighted expression.

            • SirGolan@lemmy.sdf.org
              English
              arrow-up
              11
              arrow-down
              4
              ·
              1 year ago

              Wikipedia: In copyright law, a derivative work is an expressive creation that includes major copyrightable elements of a first, previously created original work.

              I think you may be off a bit on what a derivative work is. I don’t see LLMs spouting out major copyrightable elements of books. They can give a summary, sure, but CliffsNotes would like to have a word if you think that’s copyright infringement.

            • lloram239@feddit.de
              English
              arrow-up
              5
              ·
              1 year ago

              Better tell that to Google and their search index, book scanning project and knowledge graph.

              • sab@lemmy.world
                English
                arrow-up
                4
                arrow-down
                2
                ·
                1 year ago

                Would you be okay with applying that argument to any crime?

                • FaceDeer@kbin.social
                  arrow-up
                  4
                  arrow-down
                  1
                  ·
                  1 year ago

                  I would be, and I don’t understand why you think this would be a problem. I wouldn’t want the government to be preventing activities that no actual law prohibits.

                  • sab@lemmy.world
                    English
                    arrow-up
                    6
                    arrow-down
                    2
                    ·
                    1 year ago

                    You’re missing the point. I’ll make your example more specific.

                    Well when fraud/rape/murder happens we have laws. So no problems.

                    Those things happen. Creating an LLM based on copyrighted material without permission happens; it’s not a hypothetical. But even then, giving a punishment after the fact does not make the initial crime “no problem”, as you put it.