Random application segfaults on Arch

Hi everyone,

Ever since I switched to Arch about two months ago, most applications have been segfaulting multiple times a day. There doesn’t seem to be any pattern to the crashes; sometimes they even happen while the system is idle (e.g. while I’m reading a news article).

Things I’ve tried without any luck so far:

  • Running Firefox in safe mode without any extensions
  • Switching from the regular kernel to the LTS kernel
  • Disabling hardware acceleration in Firefox
  • Changing RAM speed and timings
  • Running Memtest (it passed)
  • Replacing the entire RAM with a new certified kit
  • Using only a single RAM slot
  • Applying Ryzen fixes (iommu=soft, limiting C-states)
  • Using only a single CPU core (maxcpus=1)
  • Downgrading the Nvidia driver to 535xx
  • Using Nouveau instead of the nvidia driver
  • Using Openbox instead of KDE
  • Disabling zswap and THP
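For anyone following along, it can help to confirm that boot parameters such as iommu=soft and maxcpus=1, and the zswap/THP settings, actually took effect on the running system. The following is only a minimal sketch: the helper names (`parse_cmdline`, `active_thp_mode`, `read_or_none`) are made up for illustration, it assumes the usual Linux paths `/proc/cmdline`, `/sys/kernel/mm/transparent_hugepage/enabled` and `/sys/module/zswap/parameters/enabled`, and it does not handle quoted kernel parameters that contain spaces.

```python
from pathlib import Path


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None.

    Simplification: quoted values containing spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def active_thp_mode(text: str) -> str:
    """Return the bracketed choice from a line like 'always [madvise] never'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return "unknown"


def read_or_none(path: str):
    """Read a small text file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    params = parse_cmdline(read_or_none("/proc/cmdline") or "")
    for name in ("iommu", "maxcpus"):
        print(name, "=", params.get(name, "<not set>"))

    thp = read_or_none("/sys/kernel/mm/transparent_hugepage/enabled")
    print("THP:", active_thp_mode(thp) if thp else "unavailable")

    zswap = read_or_none("/sys/module/zswap/parameters/enabled")
    print("zswap enabled:", zswap if zswap else "unavailable")
```

If a parameter shows up as `<not set>`, the bootloader entry probably was not regenerated, so the corresponding test may not have been valid.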

Here’s the full journalctl output from a day where both Spotify and Firefox crashed at the end, a few seconds apart:

https://pastebin.com/BH0LMnD9
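To look for a pattern in a log like this, one option is to tally the kernel's segfault messages by process and by the object the faulting instruction was in. This is a rough sketch, not a definitive tool: the sample lines below are invented for illustration (they are not taken from the pastebin), the function name `summarize_segfaults` is hypothetical, and the regex assumes the common `name[pid]: segfault at ... ip ... sp ... error ... in object[...]` layout of kernel messages as shown by journalctl.

```python
import re
from collections import Counter

# Matches kernel segfault reports as they appear in journalctl output.
SEGFAULT_RE = re.compile(
    r"kernel: (?P<comm>[^\[\]]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[\s]+)\[)?"
)


def summarize_segfaults(log_text: str):
    """Count segfault lines per process name and per faulting object."""
    by_process = Counter()
    by_object = Counter()
    for line in log_text.splitlines():
        match = SEGFAULT_RE.search(line)
        if not match:
            continue
        by_process[match["comm"]] += 1
        by_object[match["obj"] or "<unknown>"] += 1
    return by_process, by_object


# Invented sample lines, for illustration only.
SAMPLE = """\
Jan 05 21:03:11 host kernel: firefox[2101]: segfault at 0 ip 00007f3a1c2b4e10 sp 00007ffd5b0a9e40 error 4 in libxul.so[7f3a1a000000+5f00000]
Jan 05 21:03:15 host kernel: spotify[3322]: segfault at 18 ip 000055d0a1b2c3d4 sp 00007ffc1a2b3c40 error 6 in spotify[55d0a0000000+2000000]
Jan 05 21:04:02 host systemd[1]: Started Some Service.
"""

if __name__ == "__main__":
    processes, objects = summarize_segfaults(SAMPLE)
    print("by process:", dict(processes))
    print("by object:", dict(objects))
```

On a real system the input could come from something like `journalctl -k` saved to a file. Crashes spread across unrelated processes and libraries would point more toward a system-wide cause than toward one buggy application.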

Some more info about my system:

  • Ryzen 5 3600X
  • MSI B450M PRO-VDH Max
  • 32GB RAM @ 3200MHz
  • Geforce RTX 2070 SUPER (using nvidia-dkms)
  • Plasma 5.27.10 on X11

I’m pretty sure that it’s not hardware-related, because I’ve booted up a Debian 12 live image where everything ran for several hours without a crash. It does seem to be Arch-related, though, as I also booted up a fresh EndeavourOS live image (so basically Arch), where applications also randomly segfaulted. Any idea why everything works fine on Debian but not on Arch? Debian uses the 6.1 kernel, which I’ve already tried on Arch, so that’s not it.

Let me know if you need any more information that might help solve this issue. Thanks!

  • Michael Murphy (S76)@lemmy.world
    10 months ago

    Make sure you have the latest firmware for your motherboard. This sounds like unstable voltages for memory, or an overly-aggressive PBO curve. Did you try disabling the XMP profile on the RAM, disabling PBO, and upping the voltages (within safe limits) of the SOC, DDR, and VDDP? You might find some useful info here[0] or here[1] if you intend to run your memory at 3200 MHz.

    • NoisyFlake@lemm.eeOP
      10 months ago

      Motherboard firmware is up-to-date, and I’ve already tried disabling XMP. I’ll give disabling PBO a try, thanks!

      I don’t necessarily have to run at 3200MHz, if it means that the system is finally stable. But since it’s already crashing at the default 2133MHz, I suppose there’s no use in playing with the voltages?

      • Michael Murphy (S76)@lemmy.world
        10 months ago

        It’s difficult to say with certainty what the issue is without trial and error. I would expect the motherboard’s manufacturer to make sure that their board can pass all tests at the standard JEDEC spec for DDR4 (2133 MHz).

        Since you say that you’ve tried different RAM kits, another alternative could be the cleanliness of power from the power supply. Perhaps there is intermittent voltage droop, and you need to experiment with the Load Line Calibration settings to adjust for vdroop between idle and load. Disabling frequency boosting and manually setting the CPU frequency could help check if it’s related to that. PBO curves might be undervolting too much while idle.