How are pseudo/true random numbers generated mathematically, what sorcery is this? - eviltoast
  • NOT_RICK@lemmy.world
    3 months ago

    To collect this data, Cloudflare has arranged about 100 lava lamps on one of the walls in the lobby of the Cloudflare headquarters and mounted a camera pointing at the lamps. The camera takes photos of the lamps at regular intervals and sends the images to Cloudflare servers. All digital images are really stored by computers as a series of numbers, with each pixel having its own numerical value, and so each image becomes a string of totally random numbers that the Cloudflare servers can then use as a starting point for creating secure encryption keys.

    • RattlerSix@lemmy.world
      3 months ago

      I’m sure they’re smarter than me but my novice cryptographer brain doesn’t understand this.

      Isn’t the string of numbers representing each pixel rather limited? Aren’t all those sections of the image that have lava lamps limited to values somewhere in the reddish/blueish spectrum? Isn’t the gray background very non-random?

      Apparently the camera is accessible in the lobby. They say people walking in front of the camera adds randomness. If I go there and hold a photo in front of the camera which I know the values of, doesn’t that compromise everything?

      To be fair, their site says they take the lava lamp output and combine it with entropy from two servers, and they probably do a lot of other stuff before actually getting random numbers. But I don’t get how the lava lamp photo setup is even close to being random.

      Tried and failed to add an image, so here’s a link: https://www.cloudflare.com/learning/ssl/lava-lamp-encryption/

      • barsquid@lemmy.world
        3 months ago

        If you hash the image with a strong algo, even a single different pixel should produce a wildly different result.
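
        A quick way to see this for yourself, as a minimal sketch: it assumes SHA-256 as the "strong algo" and uses a made-up 64x64 all-gray frame in place of a real camera image (both are my assumptions, not anything Cloudflare documents here).

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Hypothetical stand-in for a camera frame: 64x64 grayscale, all mid-gray.
frame = bytearray([128] * (64 * 64))

# Same frame with one pixel changed by a single brightness step.
tweaked = bytearray(frame)
tweaked[1000] = 129

digest_a = hashlib.sha256(bytes(frame)).digest()
digest_b = hashlib.sha256(bytes(tweaked)).digest()

changed_bits = bit_difference(digest_a, digest_b)
print(f"{changed_bits} of 256 digest bits differ")
```

        For a good hash you would expect roughly half of the 256 output bits to flip, even though only one input bit changed.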

        • RattlerSix@lemmy.world
          3 months ago

          But different doesn’t mean random. An attacker could test every possibility for that pixel rather quickly. Even faster if you know a bunch of the pixels are, for example, a shade of gray.
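
          To make that concrete, here is a rough sketch of the attack I have in mind. It assumes the attacker knows every pixel except one and can see the SHA-256 hash of the frame; the frame, the pixel index and the function name are all made up for illustration.

```python
import hashlib

def recover_pixel(known_frame: bytes, index: int, target_digest: bytes):
    """Try all 256 values for one unknown pixel; return the one that matches."""
    candidate = bytearray(known_frame)
    for value in range(256):
        candidate[index] = value
        if hashlib.sha256(bytes(candidate)).digest() == target_digest:
            return value
    return None  # the guess about the other pixels was wrong

# Hypothetical "real" frame: all gray except one pixel the attacker cannot see.
frame = bytearray([128] * (64 * 64))
frame[1000] = 201
target = hashlib.sha256(bytes(frame)).digest()

# The attacker's guess: everything gray, pixel 1000 unknown.
attacker_guess = bytes([128] * (64 * 64))
recovered = recover_pixel(attacker_guess, 1000, target)
print(recovered)
```

          The hash output looks wildly different for each candidate, but there are only 256 candidates to try, so the hash alone does not help when the input itself is that predictable.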

          I found an explanation where they bring up the issues I brought up as well as some others.

          https://blog.cloudflare.com/lavarand-in-production-the-nitty-gritty-technical-details/

          They give an example with the red value of a single pixel. They don’t address my point that there are a lot of pixels like that one. Maybe not all would be 50/50 like their example, but a lot of pixels would have a much narrower range of values than true randomness would give.
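
          For what it’s worth, here is a back-of-the-envelope way to count how much unpredictability narrow pixels could still add up to. It is only a sketch: the probabilities and the pixel count are invented, and it assumes pixels are independent, which neighboring pixels in a real photo are not.

```python
import math

def min_entropy_bits(probabilities):
    """Min-entropy of one source: -log2 of its most likely outcome."""
    return -math.log2(max(probabilities))

# Made-up per-pixel distributions, for illustration only.
coin_flip_pixel = [0.5, 0.5]          # two values, equally likely
narrow_pixel = [0.9, 0.1]             # almost always the same value
full_range_pixel = [1 / 256] * 256    # every value equally likely

print(min_entropy_bits(coin_flip_pixel))   # 1 bit
print(min_entropy_bits(narrow_pixel))      # about 0.15 bits
print(min_entropy_bits(full_range_pixel))  # 8 bits

# Assumption: 10,000 independent "narrow" pixels. Real pixels are correlated,
# so treat this as an upper bound rather than a measurement.
total_bits = 10_000 * min_entropy_bits(narrow_pixel)
print(f"about {total_bits:.0f} bits in total")
```

          Under those assumptions even very predictable pixels add up to well over 256 bits, but the independence assumption is doing a lot of work there.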

          But they answer my second point about a hacker putting something in front of the camera with known values, and that answer sort of takes care of everything. It boils down to this: it doesn’t matter, because the lava lamp wall output is mixed with other sources that an attacker doesn’t have access to.
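
          My rough understanding of why mixing helps, as a sketch only: this is not Cloudflare’s actual construction. The `mix_entropy` helper, the choice of SHA-256 and the use of `os.urandom` as the "other source" are my own stand-ins.

```python
import hashlib
import os

def mix_entropy(*sources: bytes) -> bytes:
    """Hash several length-prefixed sources together into one 32-byte seed."""
    h = hashlib.sha256()
    for s in sources:
        # The length prefix keeps (b"ab", b"c") distinct from (b"a", b"bc").
        h.update(len(s).to_bytes(8, "big"))
        h.update(s)
    return h.digest()

# Worst case for the camera: the attacker knows the whole frame,
# e.g. because they held a known photo in front of the lens.
attacker_known_frame = bytes([128] * 4096)

# Stand-in for a source the attacker cannot observe.
local_secret = os.urandom(32)

seed = mix_entropy(attacker_known_frame, local_secret)
seed_other = mix_entropy(attacker_known_frame, os.urandom(32))
print(seed.hex())
```

          Even with a fully known frame, the seed stays as hard to guess as the secret source. The camera can only add unpredictability to the mix; it can’t take any away.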

    • wise_pancake@lemmy.ca
      3 months ago

      Surely that can’t be uniformly random, though.

      They’re just using it for a seed, so the output would still be impossible to predict. But it feels like a checksum or something similar would approach a Gaussian distribution (the more numbers you add up, the more Gaussian it gets, since we know an image has a mean and finite variance).
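
      Here is a small simulation of what I mean, as a sketch: the "images" are fake (1024 pixels drawn uniformly from a narrow band of 100 to 155), and I compare a plain sum of the pixels against the first byte of a SHA-256 hash, which is my own choice of comparison.

```python
import hashlib
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

sums = []
hash_bytes = []
for _ in range(2000):
    # Fake image: 1024 pixels, each somewhere in a narrow gray band.
    pixels = bytes(rng.randrange(100, 156) for _ in range(1024))
    sums.append(sum(pixels))                                 # additive "checksum"
    hash_bytes.append(hashlib.sha256(pixels).digest()[0])    # first hash byte

# Fraction of the theoretically possible range that each statistic covers.
possible_sum_range = 1024 * 255
spread_of_sums = (max(sums) - min(sums)) / possible_sum_range
spread_of_hash = (max(hash_bytes) - min(hash_bytes)) / 255

print(f"sums cover {spread_of_sums:.1%} of their possible range")
print(f"hash bytes cover {spread_of_hash:.1%} of their possible range")
```

      The sums pile up in a thin slice around the mean, as the central limit theorem suggests, while the hash byte spreads over essentially its whole range. So the worry applies to additive checksums, not to a cryptographic hash.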

      • palordrolap@fedia.io
        3 months ago

        There are ways to get entropy out of non-uniform data in order to approach if not reach a uniform distribution.

        A naïve, but surprisingly effective way to do this would be to put the data through a hashing algorithm of some sort.

        Good hashing algorithms are specifically designed to make similar but non-identical inputs hash to values that appear unrelated.
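
        As a minimal sketch of the idea: the heavily biased source below is made up, and SHA-256 is just one possible choice of hash, not a claim about what any particular system uses.

```python
import hashlib
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

def biased_block(n_bytes: int = 4096) -> bytes:
    """Made-up non-uniform source: each byte is 0 about 90% of the time, else 1."""
    return bytes(1 if rng.random() < 0.1 else 0 for _ in range(n_bytes))

def ones_fraction(data: bytes) -> float:
    """Fraction of bits in data that are set to 1."""
    return sum(bin(b).count("1") for b in data) / (8 * len(data))

blocks = [biased_block() for _ in range(200)]

raw = b"".join(blocks)
hashed = b"".join(hashlib.sha256(block).digest() for block in blocks)

print(f"raw input:   {ones_fraction(raw):.3f} of bits are 1")
print(f"hash output: {ones_fraction(hashed):.3f} of bits are 1")
```

        The raw data is almost all zero bits, while the hashed output sits close to half ones. Hashing cannot create entropy, though: the output is only as unpredictable as the input block was, so each block has to contain enough unpredictability to begin with.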

        Depending on the data source, there may be more efficient ways of getting an unpredictable sequence of bits out of it. For image data, for example, the difference from an average image may be more appealing than the plain image, but I’m not sure whether that’s legitimately “more random” or whether it just feels that way.