Google updates privacy policy to train its AI on everything you post online - eviltoast

TL;DR

  • Google has updated its privacy policy.
  • The new policy adds that Google can use publicly available data to train its AI products.
  • The way the policy is worded, it sounds as if the company is reserving the right to harvest and use data posted anywhere on the web.

You probably didn’t notice, but Google quietly updated its privacy policy over the weekend. While the wording of the policy is only slightly different from before, the change is enough to be concerning.

As Gizmodo discovered, most of the updated policy contains nothing particularly notable, but one section now sticks out: the research and development section. That section explains how Google can use your information, and it now reads:

Google uses information to improve our services and to develop new products, features and technologies that benefit our users and the public. For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.

Before the update, this section referred to “language models” instead of “AI models.” It also mentioned only Google Translate, where it now adds Bard and Cloud AI.

As the outlet points out, this is a peculiar clause for a company to add. The wording makes it sound as if the tech giant reserves the right to harvest and use data from any part of the public internet. Usually, a policy such as this only discusses how the company will use data posted on its own services.

While most people likely realize that whatever they put online will be publicly available, this development opens up a new twist — use. It’s not just about others being able to see what you write online, but also about how that data will be used.

Bard, ChatGPT, Bing Chat, and other AI chatbots rely on information scraped from the internet. That information often comes from others’ intellectual property. Lawsuits already accuse these AI tools of theft, and more are likely to follow.

  • Slacking@sh.itjust.works
    7 upvotes, 1 downvote · 1 year ago

    The internet belongs to everyone; Google should have as much right to it as we do. I use public data when I fine-tune an LLM or make a Stable Diffusion LoRA.

    The only ones to benefit from restricting data access are the big companies, because they already have it all. Don’t fall into the trap of advocating for a closed, copyrighted internet. It will only hurt the little guys and literally no one else.

  • sabreW4K3@lemmy.tf
    5 upvotes, 1 downvote · 1 year ago

    The only person that’s happy about this is Elon, because it vindicates him. Everyone else should be outraged. Honestly, Google needs to start paying us, because this absolutely isn’t right and they will profit massively.

    • 𝒍𝒆𝒎𝒂𝒏𝒏@lemmy.one
      0 upvotes · 1 year ago

      I say we start poisoning things with AI-generated text. I’ll be doing so myself on my blog, only a couple of sentences here and there so as not to detract from the quality or put readers off.

    • cyberpunk007@lemmy.world
      1 upvote, 2 downvotes · 1 year ago

      You use all their products for free; that is the condition. Quid pro quo. If you don’t like it, just stop using their services. I’ve been using DuckDuckGo for years. I do use G Suite for my email, but only because I get it free. If I didn’t, I’d move to Proton. If everyone stopped using Google they’d be forced to improve things for their users, but they’re a conglomerate and can pretty much get away with anything.

      • sabreW4K3@lemmy.tf
        6 upvotes · 1 year ago

        I don’t use Google search. I’m forced to train their AI for captchas. They’re already using content I produce to sell and display adverts.

        I understand that some people are like house slaves, hand out ready to scream, “yes massa!” but I know the value of my data that Google get and use and they’ve already gone way too far.

  • clementineholic@lemm.ee
    4 upvotes · 1 year ago

    I don’t like this at all, but I doubt it can be stopped. I hope at the end of the day, when AI adoption is widespread, AI will have improved the internet and our lives rather than make them worse.

    • SuddenDownpour@lemmy.world
      1 upvote · 1 year ago

      “I hope at the end of the day, when AI adoption is widespread, AI will have improved the internet and our lives rather than make them worse.”

      That will depend on who owns the tech, who it is sold to, for which purposes, and what kind of regulation controls it. With the way things are right now, it will probably be used to manipulate public opinion in comment sections, to the point that the so-called “bots” will truly be bots rather than boosted messages written by humans.

      • clementineholic@lemm.ee
        1 upvote · 1 year ago

        Yeah that is a likely outcome. I was just trying to stay positive and hope for the best. When I think about how bad things can get with the misuse of AI, it makes me kinda depressed.

  • jerb@lemmy.croc.pw
    2 upvotes · 1 year ago

    Are we sure this isn’t just for clarity? “Language model” already implies Bard and the like, since they’re more formally called “large language models.” While I don’t like that they’re doing it, I think it’s very likely they’ve been scraping public information for quite some time (in fact, for an LLM like Bard, they pretty much have to!), and have just changed the wording to fully disambiguate between Google Translate and Bard.

    • VinkTheGod@lemdro.id
      3 upvotes, 1 downvote · 1 year ago

      Is it really a problem? We post because we want to, usually at our own leisure. If a site is public, it means everyone can see what we have posted. Instead of a human, it’ll be a bot that remembers bits and pieces.

      In the future, AI will most likely be heavily regulated. Right now it’s a wild west. Big corpos have resources, so they do it to get the lead. It has always been like that, so why is this instance different?

      • Reclipse@lemdro.id
        1 upvote, 1 downvote · 1 year ago

        Google can train their AI with information I post on Google sites. But using information posted on other sites seems problematic to me.

        • cyberpunk007@lemmy.world
          3 upvotes · 1 year ago

          You put your info on other sites, which permit Google to scrape them. That’s just how it’s always worked. The only way to avoid it is to abstain from putting content online, especially where Google is known to collect information.