LLMs are surprisingly great at compressing images and audio, DeepMind researchers find - eviltoast
  • YellowBendyBoy@lemmy.world
    1 year ago

    It probably is more like the LLM is able to „pack the truck much more efficiently“ and decompression should be the same.

    But I agree that the likely use case of uploading all your files to the cloud, having it compress them, and downloading a result that is a few kB smaller isn’t really practical, time-efficient, or even needed at all.

    • DarkenLM@kbin.social
      1 year ago

      Correct me if I’m wrong, but don’t algorithms like Huffman or even Shannon-Fano coding with blocks already pack files as efficiently as possible? It’s impossible to compress a file beyond its entropy, and those algorithms get pretty damn close to it.

        • enkers@sh.itjust.works
          1 year ago

          That was my first thought as well, but it doesn’t seem to be the case:

          “In their study, the Google DeepMind researchers repurposed open-source LLMs to perform arithmetic coding, a type of lossless compression algorithm.”
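To make the arithmetic coding point concrete: an arithmetic coder spends roughly -log2(p) bits on a symbol that its model predicted with probability p, so a better predictor means a smaller file. The sketch below is not the researchers' setup. It swaps the LLM for two toy adaptive byte models (the function name `ideal_bits` and the add-one smoothing over 256 byte values are assumptions made for illustration) and only computes the ideal code length rather than producing an actual bitstream.

```python
import math
from collections import defaultdict

def ideal_bits(data: bytes, order: int) -> float:
    """Ideal arithmetic-coding cost of `data` under an adaptive order-`order` model.

    The model predicts each byte from the previous `order` bytes using
    add-one smoothed counts over all 256 byte values (an assumed toy model).
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> symbols seen
    bits = 0.0
    for i, sym in enumerate(data):
        ctx = data[max(0, i - order):i]
        p = (counts[ctx][sym] + 1) / (totals[ctx] + 256)
        bits += -math.log2(p)      # cost of coding `sym` with probability p
        counts[ctx][sym] += 1      # update the model after coding, as a decoder could
        totals[ctx] += 1
    return bits

text = b"ab" * 200
print(f"order-0 model: {ideal_bits(text, 0):.0f} bits")
print(f"order-1 model: {ideal_bits(text, 1):.0f} bits")
```

The order-1 model sees that "a" is always followed by "b" and needs fewer bits for the same input. An LLM plays the same role with a far stronger predictor, at a far higher compute cost, and the decoder has to run the same model to reproduce the predictions.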

      • zero_iq@lemm.ee
        1 year ago

        “Correct me if I’m wrong”

        Well, actually, yes, I’m sorry to have to tell you that you are wrong. Shannon-Fano coding is suboptimal for prefix codes, and Huffman coding, while optimal among prefix codes, is not necessarily the most efficient compression method for any given data (and often isn’t).

        Huffman can be optimal given certain strict constraints, but those constraints don’t always occur in natural/real-world data.
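A small sketch of what those constraints look like in practice: a Huffman code assigns a whole number of bits to each symbol, so it only reaches the entropy when every probability is a power of 1/2. The helper names and the two example distributions below are made up for illustration.

```python
import heapq
import math

def huffman_lengths(probs: dict) -> dict:
    """Return the Huffman code length (in bits) for each symbol."""
    # Heap entries are (probability, tie-breaker, symbols in this subtree).
    heap = [(p, i, [sym]) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probs}
    next_id = len(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:        # merging pushes these symbols one level deeper
            lengths[sym] += 1
        heapq.heappush(heap, (p1 + p2, next_id, s1 + s2))
        next_id += 1
    return lengths

def average_length(probs: dict) -> float:
    lengths = huffman_lengths(probs)
    return sum(p * lengths[sym] for sym, p in probs.items())

def entropy(probs: dict) -> float:
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

dyadic = {"a": 0.5, "b": 0.25, "c": 0.25}   # all powers of 1/2
skewed = {"a": 0.9, "b": 0.05, "c": 0.05}   # one dominant symbol

for name, dist in (("dyadic", dyadic), ("skewed", skewed)):
    print(f"{name}: Huffman {average_length(dist):.3f} bits/symbol, "
          f"entropy {entropy(dist):.3f} bits/symbol")
```

On the dyadic distribution the two numbers match. On the skewed one, Huffman still has to spend a full bit on the dominant symbol, so it lands at roughly twice the entropy. Arithmetic coding avoids that limit because it can effectively spend a fraction of a bit per symbol.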

        The best compression method (whether lossless or lossy) depends greatly on the nature of the data to be compressed. Patterns and biases can make certain methods much more efficient (or more practical) in some cases, when they might be useless elsewhere or in general. This is why data is often transformed before compression, using a reversible transformation that “encourages” certain desirable statistical characteristics in the data, so the compression method can better exploit them.

        For example, compression software may apply other transforms and encodings before Huffman coding to get a better compression ratio: bzip2 performs a Burrows-Wheeler transform, and gzip runs LZ77 matching first. If Huffman coding were an optimal compression method for all possible data, this would be redundant! Often, e.g. in medical imaging or audio/video data, the data is best analysed in a different domain to better reveal the underlying patterns and redundancies so they can be easily exploited by compression, e.g. the frequency domain instead of the time/spatial domain.
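As a rough illustration of the "reversible transform first" idea, here is a sketch that uses delta encoding (a much simpler transform than Burrows-Wheeler or a frequency-domain transform) in front of Python's standard `zlib`. The synthetic signal, a seeded random walk standing in for slowly varying sample data, and the helper names are assumptions made for the example.

```python
import random
import zlib

def delta_encode(samples: bytes) -> bytes:
    """Replace each byte with its difference from the previous byte (mod 256)."""
    prev = 0
    out = bytearray()
    for s in samples:
        out.append((s - prev) % 256)
        prev = s
    return bytes(out)

def delta_decode(deltas: bytes) -> bytes:
    """Invert delta_encode by accumulating the differences (mod 256)."""
    prev = 0
    out = bytearray()
    for d in deltas:
        prev = (prev + d) % 256
        out.append(prev)
    return bytes(out)

# Synthetic "signal": a random walk that moves by +1 or -1 per sample.
rng = random.Random(0)
level = 128
walk = bytearray()
for _ in range(20000):
    level = (level + rng.choice((-1, 1))) % 256
    walk.append(level)
signal = bytes(walk)

raw_size = len(zlib.compress(signal, 9))
delta_size = len(zlib.compress(delta_encode(signal), 9))
print(f"zlib on raw samples:   {raw_size} bytes")
print(f"zlib on delta samples: {delta_size} bytes")
```

The transform itself saves nothing, since the output is the same length as the input. It just turns "many different values that drift slowly" into "two values repeated over and over", which is the kind of statistical structure the downstream compressor exploits well.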

        • DarkenLM@kbin.social
          1 year ago

          No need to be sorry. I am well aware I can be wrong, and I would rather learn something new than be bashed for being wrong.

          Maybe I phrased it differently from how I was thinking about it. I didn’t mean to claim that Shannon-Fano or Huffman are THE most efficient ways of doing it, but rather that, compared to the massive overhead of running an LLM to compress a file, the current methods are way more resource-efficient, even one as obsolete as Shannon-Fano coding.

          I should probably have mentioned an algorithm like LZMA, or gzip, like you did.