Make illegally trained LLMs public domain as punishment

It’s all made from our data, anyway, so it should be ours to use as we want

  • Superb@lemmy.blahaj.zone · 9 hours ago

    No, the model does retain the original works in a lossily compressed form. This is evidenced by the fact that you can get a model to reproduce sections of its training data.
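
    A rough sketch of the kind of probe that claim refers to, assuming the Hugging Face transformers API (the gpt2 checkpoint and the sample passage are placeholders, not anything cited in the thread): feed the model the first half of a passage suspected to be in its training data and check whether it continues with the rest near-verbatim.

    ```python
    # Memorization probe sketch: prompt with the first half of a passage
    # and compare the model's continuation against the real second half.
    # Model name and passage are illustrative placeholders.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # stand-in; any causal LM works the same way
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    passage = "It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness"
    prefix, expected = passage[: len(passage) // 2], passage[len(passage) // 2 :]

    inputs = tokenizer(prefix, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=40,
        do_sample=False,  # greedy decoding surfaces memorized text most readily
    )
    continuation = tokenizer.decode(
        output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

    # A near-verbatim match suggests the passage was memorized;
    # unrelated or loosely paraphrased output suggests it was not.
    print("Model continued with:   ", continuation)
    print("Original continued with:", expected)
    ```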

    • FaceDeer@fedia.io · 8 hours ago

      You’re probably thinking of situations where overfitting occurred. Those situations are rare, and are considered to be errors in training. Much effort has been put into eliminating that from modern AI training, and it has been successfully done by all the major players.

      This is an old no-longer-applicable objection, along the lines of “AI can’t do fingers right”. And even at the time, it was only very specific bits of training data that got inadvertently overfit, not all of it. You couldn’t retrieve arbitrary examples of training data.
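
      For context on what "overfit" means in practice here, a minimal sketch of the usual per-passage check, again assuming the Hugging Face transformers API (gpt2 and both passages are illustrative placeholders): a passage the model has memorized scores a much lower loss, and therefore perplexity, than comparable text it has never seen.

      ```python
      # Overfitting check sketch: compare the model's perplexity on a
      # suspected training passage vs. freshly written text of similar length.
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer

      model_name = "gpt2"  # placeholder checkpoint
      tokenizer = AutoTokenizer.from_pretrained(model_name)
      model = AutoModelForCausalLM.from_pretrained(model_name)

      def perplexity(text: str) -> float:
          """Exponentiated average next-token loss over the text."""
          enc = tokenizer(text, return_tensors="pt")
          with torch.no_grad():
              out = model(**enc, labels=enc["input_ids"])
          return float(torch.exp(out.loss))

      suspected_training_text = "Call me Ishmael. Some years ago, never mind how long precisely, having little or no money in my purse"
      freshly_written_text = "The municipal recycling schedule changed twice last autumn, confusing nearly every household on the street"

      print("Suspected training text:", perplexity(suspected_training_text))
      print("Unseen text:           ", perplexity(freshly_written_text))
      # A dramatic gap between texts of similar style and length is one sign
      # the first passage was overfit; similar values suggest it was not
      # retained verbatim.
      ```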