OpenAI strikes Reddit deal to train its AI on your posts - eviltoast
  • gila@lemm.ee
    6 months ago

    Well, yes, that’d be the mechanism by which GDPR protections are actioned, but leaving themselves broadly open to those ramifications would be risky. I don’t think ignoring GDPR except upon request would satisfy ‘compliance’. The issues are perhaps even more significant when the content is used as training data, given they’re investing compute and may need to re-train down the track.

    Based on my understanding, de-identifying the dataset wouldn’t be sufficient for compliance. That’s largely how it worked prior to GDPR, but I know companies often ended up just re-identifying data by cross-referencing multiple de-identified datasets. That nullification forms part of the basis for GDPR protections being as comprehensive as they are.
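
    To illustrate the kind of cross-referencing I mean, here’s a toy sketch in Python. Every record, name and field in it is made up, and real-world linkage is messier than an exact match, but the idea is that two datasets sharing a few quasi-identifiers can be joined back to a named person.

```python
# Toy linkage attack. All records, names, and fields are hypothetical.

# Dataset A: "de-identified" posts. Usernames are gone, quasi-identifiers remain.
deidentified_posts = [
    {"postcode": "3000", "birth_year": 1990, "gender": "F", "post": "post about topic A"},
    {"postcode": "3056", "birth_year": 1985, "gender": "M", "post": "post about topic B"},
    {"postcode": "3121", "birth_year": 1990, "gender": "F", "post": "post about topic C"},
]

# Dataset B: a separate source that still carries names.
named_records = [
    {"name": "Alice Example", "postcode": "3000", "birth_year": 1990, "gender": "F"},
    {"name": "Bob Example", "postcode": "3056", "birth_year": 1985, "gender": "M"},
    {"name": "Carol Example", "postcode": "3999", "birth_year": 1970, "gender": "F"},
]

# Fields assumed to appear in both datasets.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")


def key(record):
    """Build the join key from the shared quasi-identifier fields."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymous, named):
    """Return (name, post) pairs where the quasi-identifiers match exactly one person."""
    index = {}
    for record in named:
        index.setdefault(key(record), []).append(record["name"])

    matches = []
    for record in anonymous:
        candidates = index.get(key(record), [])
        if len(candidates) == 1:  # a unique match means the post is re-identified
            matches.append((candidates[0], record["post"]))
    return matches


linked = reidentify(deidentified_posts, named_records)
for name, post in linked:
    print(f"{name}: {post}")
```

    Two of the three “anonymous” posts get a name attached again here, with no username ever having been stored in the first dataset.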

    There’d almost certainly be actors who previously deleted their content and later seek to verify whether it was used to train any public AI.

    It’s definitely fair to say I’m making some assumptions, but essentially I think that, at a certain point, trying to use user-deleted content as a value-add just becomes more risk than it’s worth for a public company.