Reddit sells training data to unnamed AI company ahead of IPO | Ars Technica - eviltoast
  • Arcane_Trixster@lemm.ee
    English
    arrow-up
    15
    ·
    9 months ago

    Incel-A.I. about to hit the market. Get ready to have automated Nice Guy rants sent to your inbox.

  • Rayspekt@kbin.social
    arrow-up
    10
    ·
    9 months ago

    I altered my whole history on Reddit with Redact when they announced the 3rd party API thing, so have fun training on my nonsensical comments.

    P.S.: Yeah, I know they may have older data where my stuff is still intact, but changing it didn’t cost me a thing and maybe it worked. It’s about the message after all.

  • sramder@lemmy.world
    English
    arrow-up
    4
    ·
    9 months ago

    Well, I always felt slightly guilty for not participating enough… so I’m guilt-free now! Look at me contributing to the future of AI :-S

  • Ziggurat@sh.itjust.works
    English
    arrow-up
    3
    arrow-down
    1
    ·
    edit-2
    9 months ago

    No shit, Sherlock. Reddit’s terms of service allow them to use the content you create commercially, so it’s surprising it hasn’t been done earlier :)

    Considering the whole Reddit subculture of memes, private jokes, weird slang, and conversations full of Dunning-Kruger (come on, we’re all guilty of that). I am not a language model training on reddit would be more accurate than chatGPT.

    I am not really sure which licence covers the content we publish on the fediverse, but considering the level of shitposting and the leftist tendency on the fedi, I am curious about what an AI trained here would say: “eat the rich”, “Blahaj is so cool !!!”, “Micro$oft is evil, use GNU/Linux”, “Proletariat dictatorship is overrated, we want a catgirl dictatorship”. Sounds like it could be an awesome AI, can’t wait to see business consultants using it :)

    • Preventer79@sh.itjust.works
      English
      arrow-up
      7
      ·
      edit-2
      9 months ago

      I am not a language model training on reddit would be more accurate than chatGPT.

      ChatGPT and most other AI apps are trained on Reddit and have been since they’ve been publicly released. I remember TalkToTransformer in 2019 would say Redditisms like “fellow redditor” or talk about being in a sub to every other prompt.

      Wasn’t exclusivity for AI access the reason for the API change in the first place?

    • Rooki@lemmy.world
      English
      arrow-up
      1
      arrow-down
      3
      ·
      edit-2
      9 months ago

      They probably sold your stuff already but haven’t told you yet.

      On the fediverse, everything you say is out in the open. And if a real company (that doesn’t do this without consent) wanted to train on fediverse data, they would be expunged ASAP from every instance, with lawsuits probably incoming. The fediverse is built on trust (that nothing bad is done to your data), and if that’s broken, all hell breaks loose on them.

      Linux is just superior XD. Joking aside, everyone should use their favorite OS.

      • a4ng3l@lemmy.world
        English
        arrow-up
        7
        ·
        9 months ago

        How would you expunge such an actor, though? Can’t literally anyone set up an instance and get a stream of content by design? Without the shady instance participating / posting data, you’d never know, would you?

        • Rooki@lemmy.world
          English
          arrow-up
          1
          arrow-down
          1
          ·
          9 months ago

          I would say, if a company did that legitimately, they would require a lot of agreements with other instances. If they do it under the “radar” like a sleeper instance, then we don’t know and we will never know.

            • a4ng3l@lemmy.world
              English
              arrow-up
              3
              ·
              9 months ago

              Yeah, I was going to point out that there’s nothing required, to my knowledge… no T&C or anything… not even a data processing agreement, which is annoying imho.