It is common knowledge that pickle is a serious security risk. And yet, vulnerabilities involving that serialisation format keep happening. In the article I briefly describe the issue and appeal to people to stop using pickle.
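
For readers who have not seen the problem first-hand, here is a minimal sketch of why loading an untrusted pickle is dangerous. `Payload` is a made-up class name for this illustration, and `eval` on a harmless arithmetic string stands in for whatever an attacker would actually run.

```python
import pickle


class Payload:
    """Stand-in for attacker-controlled data (hypothetical name)."""

    def __reduce__(self):
        # Instead of the object's state, pickle stores the instruction
        # "call eval('6 * 7')". A real attack would name something far
        # less friendly than an arithmetic expression.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading executes the stored call. The loading side does not even need
# the Payload class; whatever the callable returns is what comes back.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The point is that `pickle.loads` ran a callable chosen by whoever produced the bytes, which is why the format is unsuitable for data that crosses a trust boundary.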

  • Daniel Quinn@lemmy.ca
    12 days ago

    The thing is, none of the suggested alternatives can do what pickle does, and the article focuses on a narrow (albeit ubiquitous) use case: serialisation of untrusted data.

    There are still legitimate use cases for pickle, especially when storing, caching, or comparing objects that can’t easily be serialised with, say, JSON or TOML. It’s a question of using the right tool for the right job, and pretending that JSON is a comparable alternative to pickle doesn’t help anyone.
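
To make the comment above concrete, here is a small sketch of the kind of object that JSON rejects but pickle round-trips. The `cached` dictionary and its contents are invented for the illustration, and it is assumed the data never leaves the process that created it.

```python
import json
import pickle
from datetime import datetime

# A cache entry mixing types that JSON has no representation for:
# a set, a datetime, a tuple used as a key, and a complex number.
cached = {
    "tags": {"alpha", "beta"},
    "fetched_at": datetime(2024, 1, 1, 12, 0),
    ("row", 3): 2 + 3j,
}

try:
    json.dumps(cached)
    json_error = None
except TypeError as exc:
    # json refuses the structure outright.
    json_error = exc

# pickle restores an equal object without any custom encoder.
restored = pickle.loads(pickle.dumps(cached))
print("json failed:", json_error is not None)
print("pickle round trip equal:", restored == cached)
```

Getting the same result with JSON would need a custom encoder and decoder for each of those types, which is the convenience gap the comment is pointing at.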

    • mina86@lemmy.wtf (OP)
      13 days ago

      If you’re serialising trusted data, you can define a schema for it and use Protocol Buffers, which will be not only safer but also faster. Pretending that you need to be able to serialise arbitrary data hurts everyone.
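
The schema-first idea from the comment above can be sketched with the standard library alone. This is not Protocol Buffers; it swaps in a dataclass plus JSON purely to show what "define a schema and serialise against it" looks like. `CacheEntry` and its fields are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CacheEntry:
    """The schema: exactly these fields, with these types."""

    key: str
    hits: int
    tags: list

    @classmethod
    def from_dict(cls, raw):
        expected = {"key": str, "hits": int, "tags": list}
        # Reject anything that is not a mapping with exactly the declared fields.
        if not isinstance(raw, dict) or set(raw) != set(expected):
            raise ValueError("input does not match the CacheEntry schema")
        for name, kind in expected.items():
            if not isinstance(raw[name], kind):
                raise ValueError(f"{name} must be {kind.__name__}")
        return cls(**raw)


def dumps(entry):
    # Only plain data is written; no callables or class references.
    return json.dumps(asdict(entry))


def loads(text):
    # Parsing can only ever yield a validated CacheEntry or an error.
    return CacheEntry.from_dict(json.loads(text))


entry = CacheEntry(key="user:1", hits=3, tags=["a", "b"])
print(loads(dumps(entry)) == entry)
```

Unlike unpickling, the loader here can only ever produce a `CacheEntry` or raise; a generated Protocol Buffers class gives the same property with a compact binary encoding.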

      • logging_strict@programming.dev
        6 days ago

        Also, there is strictyaml, which validates against schemas. Don’t touch the regular yaml module (PyYAML).

        protobuf needs to be compiled, which introduces the possibility of coder error, such as forgetting to compile and commit the protobuf files after a change. This affected the Electrum BTC and LTC (light) wallets.

        • mina86@lemmy.wtf (OP)
          6 days ago

          Also, there is strictyaml, which validates against schemas. Don’t touch the regular yaml module (PyYAML).

          Thanks. I’ll include that in an update.

          protobuf needs to be compiled, which introduces the possibility of coder error, such as forgetting to compile and commit the protobuf files after a change. This affected the Electrum BTC and LTC (light) wallets.

          Yes, that’s certainly a downside. It also demonstrates that one should not commit such generated files. A better approach is to commit the source files (in this instance, the message definitions) and have a compilation step included in the program’s build/install recipe.


          • logging_strict@programming.dev
            4 days ago

            A better approach

            That unfortunately isn’t a better approach. The compilation step requires the protobuf compiler (protoc) to be installed by the distro package manager. To my knowledge it’s not available from PyPI.

            A protobuf file is essentially worthless until it’s compiled. But once it’s compiled, it’s a binary blob.

            Not anti-protobuf. Just make the protobuf compiler available without getting a distro package manager involved.

            Otherwise slower alternatives might be more viable.

            strictyaml bundles strictyaml.ruamel, which used to be an external unmaintained C package.

            This reduces strictyaml dependencies to:

            pyproject.toml

            dependencies = [
                "python-dateutil>=2.6.0"
            ]
            

            Just that one, so you can be confident strictyaml will work.

            Can the same be said for protobuf and Google? (Google is over-invested in AI and is probably dying underneath a huge debt burden, spending tons of money on AI-washing propaganda while not funding Python projects enough. Maintainers leave or burn out while everyone is too distracted by the AI washing to notice.)

            • mina86@lemmy.wtf (OP)
              4 days ago

              It is a better approach, it just may be more complex. Only people developing or packaging the library need to compile the message definitions. It’s not a big burden to require that they have protoc installed. The end user will only need to depend on the created package.
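
A build-time compile step of the kind discussed above could look roughly like the sketch below. The helper names (`generated_path`, `is_stale`, `protoc_command`, `compile_if_stale`) and the example paths are invented for this illustration; `--proto_path` and `--python_out` are standard protoc options, and `<name>_pb2.py` is the file name protoc's Python output uses.

```python
import shutil
import subprocess
from pathlib import Path


def generated_path(proto, out_dir):
    # protoc writes <name>_pb2.py for <name>.proto when generating Python.
    return Path(out_dir) / f"{Path(proto).stem}_pb2.py"


def is_stale(proto, out_dir):
    # Stale means: never generated, or the definition changed afterwards.
    out = generated_path(proto, out_dir)
    return not out.exists() or out.stat().st_mtime < Path(proto).stat().st_mtime


def protoc_command(proto, out_dir):
    proto = Path(proto)
    return [
        "protoc",
        f"--proto_path={proto.parent}",
        f"--python_out={out_dir}",
        str(proto),
    ]


def compile_if_stale(proto, out_dir):
    """Run protoc only when needed; return True if it ran."""
    if not is_stale(proto, out_dir):
        return False
    if shutil.which("protoc") is None:
        # Fail at build time, loudly, rather than at run time on a user's machine.
        raise RuntimeError("protoc not found; install it before building")
    subprocess.run(protoc_command(proto, out_dir), check=True)
    return True


print(protoc_command("proto/wallet.proto", "build"))
```

Calling `compile_if_stale` from the package's build recipe means the generated module is never committed and cannot silently lag behind the `.proto` file. On the question of where the compiler comes from: the grpcio-tools package on PyPI bundles protoc and exposes it as `python -m grpc_tools.protoc`, which should accept the same flags, though that variant is not shown here.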

              • logging_strict@programming.dev
                3 days ago

                It’s a potential single point of failure, which I have experienced first hand. The rest of the app could not run because a non-essential piece was inoperable: either the compiled message definitions file was missing, or the message definitions file had been updated but not recompiled.

                So protobuf carries a non-zero risk.

                Could the app have been designed without an essential exploding binary blob? Most definitely yes!
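
One way to keep a non-essential feature from taking the whole app down, as the comment above asks for, is to treat the generated module as optional. This is only a sketch of that design choice; `wallet_messages_pb2` is a made-up module name standing in for generated code, and `payment_requests_available` is a hypothetical feature check.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None when it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Hypothetical generated module; in this sketch it does not exist.
messages = load_optional("wallet_messages_pb2")


def payment_requests_available():
    # The feature switches itself off instead of crashing the application.
    return messages is not None


print("payment requests enabled:", payment_requests_available())
```

This only covers the missing-file case; a definition that was updated but not recompiled still needs a build-time check.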

                • mina86@lemmy.wtf (OP)
                  3 days ago

                  Writing software carries a non-zero risk. If compilation were part of building the package, rather than the compiled output being manually committed to the repository, things would work. And the design would then have no essential binary blob.

                  • logging_strict@programming.dev
                    8 hours ago

                    project cost = sum over i = 1…n of (likelihood of risk i × cost of risk i), but we aren’t discussing every possible risk, only this one.

                    The risk here means having to accept that:

                    • the app requires compiled components in order to work
                    • someone has to be familiar with setup.py. This is referred to as the sewer, which is what is targeted by hackers, e.g. xz
                    • maintainers who come later have to be familiar with, and able to maintain, packages that incorporate other languages, e.g. C or Rust
                    • someone may neglect to perform the compile (but let’s ignore this)
                    • the compiler is a binary written and maintained by the spy agency Google

                    or

                    Just not doing that

                    The only justification for going with protoc over other methods comes down to data serialization speed. But in that case, wouldn’t a Rust solution be not only as fast, but also much safer?