OpenAI finally admitted they're crawling the web to profit off of GPT. Block it from your sites using robots.txt. - eviltoast
  • RedstoneValley@sh.itjust.works
    1 year ago

    Yeah, you’re right that it is different from simply stealing content. However, LLMs still use protected material as input, and it seems that at least parts of those works can be uniquely identified in the output. That can be considered problematic, even if the data is deconstructed into embeddings in between input and output.