Reddit removes years of chat and message archives from users' accounts - eviltoast
        • igorlogius@lemmy.world
          1 year ago

          And they sell it to the highest bidder to train their next LLM, which seems to be all the rage at the moment.

      • thisbenzingring@wirebase.org
        1 year ago

        Text data on a compressed drive takes up very little space. On a modern server, accessing text files on a compressed drive is not a noticeable performance hit. The compression ratio for text and markup language files is massive.
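
To put a rough number on the compression claim, here is a minimal sketch using Python's standard `zlib` module. The helper name `compression_ratio` and the sample string are made up for illustration, and the sample is deliberately repetitive, so it overstates what real conversation text would achieve. Nothing here reflects how Reddit actually stores data.

```python
import zlib


def compression_ratio(text: str, level: int = 6) -> float:
    """Return uncompressed size divided by compressed size for the given text."""
    raw = text.encode("utf-8")
    compressed = zlib.compress(raw, level)
    return len(raw) / len(compressed)


# Hypothetical sample: one short markup line repeated many times.
# Real chat logs are less repetitive and will compress less well.
sample = "<p>Yes, text does not take up much space.</p>\n" * 1000

ratio = compression_ratio(sample)
print(f"raw bytes: {len(sample.encode('utf-8'))}, ratio: {ratio:.1f}x")
```

Running the same function over an export of real messages would give a more honest figure than this toy input.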

        • thepianistfroggollum@lemmynsfw.com
          1 year ago

          Yes, text doesn’t take up much space, but decades of text can easily take up a lot of space, especially when you track things like edits.

          Not to mention that this data isn’t in text files. It’s going to be in a database, so the number of records that need to be parsed will impact performance. How big that impact is depends on how they set the database up.
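
As a rough illustration of the "depends on how they set the database up" point, the sketch below uses Python's built-in `sqlite3` module to compare the query plan for the same lookup before and after adding an index. The `messages` table, its columns, and the index name `idx_messages_user` are all hypothetical, and SQLite is only a stand-in here, not a claim about what Reddit runs.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's query plan for a statement as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)"
)
# Hypothetical data: 100,000 messages spread across 1,000 users.
conn.executemany(
    "INSERT INTO messages (user_id, body) VALUES (?, ?)",
    ((i % 1000, f"message {i}") for i in range(100_000)),
)

sql = "SELECT body FROM messages WHERE user_id = ?"

# Without an index, SQLite has to scan every record in the table.
plan_before = query_plan(conn, sql, (42,))

# With an index on user_id, it can jump straight to the matching records.
conn.execute("CREATE INDEX idx_messages_user ON messages (user_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan reports a full table scan and the second reports a search using the index, which is the difference between work that grows with the total number of records and work that grows with the number of matching ones.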