A.I.’s un-learning problem: Researchers say it’s virtually impossible to make an A.I. model ‘forget’ the things it learns from private user data - eviltoast

I’m rather curious to see how the EU’s privacy laws are going to handle this.

(Original article is from Fortune, but Yahoo Finance doesn’t have a paywall)

  • CoderKat@lemm.ee
    10 upvotes, 2 downvotes · 1 year ago

    Retraining the model is incredibly expensive. That basically means not training the model on any user data at all, whether it slips in accidentally, through someone sabotaging the training data, or even with consent (since consent can be revoked).

    • Thann@lemmy.ml
      19 upvotes, 4 downvotes · 1 year ago

      Consent can't be revoked; they're not even trying to get consent.

      They seemingly all have a “use first, then ask for forgiveness” approach, which should come around to bite them in the ass.

      • Jaded@lemmy.dbzer0.com
        7 upvotes, 1 downvote · 1 year ago

        Anything else is going to bite US in the ass. Asking for consent kills any kind of open source development. It puts AI solely in the hands of like three companies. Our economy is going to be very AI-focused in the future; they would literally own all of us.

        You aren’t getting paid either way, so we might as well all enjoy the fruits of humanity’s labor freely instead of being forced into a subscription model of it.

        • Fushuan [he/him]@lemm.ee
          1 upvote · 1 year ago

          Asking for consent doesn’t kill open source development. Consent is the very reason we have licensed code. MIT, Apache, GPL3… And development is done and code is reused in accordance with those licenses.

          • Jaded@lemmy.dbzer0.com
            1 upvote · 1 year ago

            “Most of the data used by large companies isn’t available to the majority of people. We think that stifles innovation.”

            Yes, crowdsourcing is a solution, but it is only really possible if you can reach many people, as Mozilla can. They only have 20k hours to date. Tortoise needed 50k hours and was made by one guy who open sourced it. He would not have been able to build it without scraping YouTube.

            Crowdsourcing also becomes much more complicated for LLMs or if you are making models in other languages.