Fediverse Disaster Recovery - eviltoast

Faulty peripheral power supply killed my server a little over a day ago.
120 gigs of MySQL data just wouldn’t come up - backup is far from recent. My fault. Most corrupted tables were of course in Friendica.
After much nail chewing everything now appears operational again with minimal(?) data loss.

In other words: can you all read me? ;-)

  • Hexarei@programming.dev
    English
    arrow-up
    16
    ·
    1 year ago

    Holy cow, 120 gigs in a database?

    Also remember, you don’t want a backup solution, you want a restoration solution :-)

    • pete@social.cyano.at (OP)
      arrow-up
      10
      ·
      1 year ago

      Hear hear! You don’t own a backup if you’ve never restored it before. Words to live by both in corporate and self-hosting environments.

    • pete@social.cyano.at (OP)
      arrow-up
      3
      arrow-down
      1
      ·
      1 year ago

      Ironically, if I had had more services running in Docker, I might not have experienced such a fundamental outage. Since Docker services usually spin up their own exclusive database engine, you kind of “roll the dice” on data corruption with each Docker service individually. Thing is, I don’t really believe in burning CPU cycles on redundant database services. And since many of my services are already very long-serving, they were set up from source and all funneled towards a single, central and busy database server. Thus, if that one goes down suddenly (a power failure, for instance), all kinds of corruption and despair can arise. ;-)

      Guess I should really look into a small UPS and automated shutdown. On top of better backup management of course! Always the backups.

      • TCB13@lemmy.world
        English
        arrow-up
        7
        arrow-down
        1
        ·
        edit-2
        1 year ago

        Why so much? A simple daily timer that runs mysqlcheck + mysqldump + a backup of that would be enough for most people. Using a solid OS (Debian) and a filesystem such as BTRFS, ZFS or XFS will also save you from power loss related corruption. Why do people go SO overkill with everything?

        Keep it simple: fewer services, fewer processes, less overhead. Pick well-written software and script the rest. Everything works out way better if you don’t overcomplicate things.
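For anyone who wants to try the "daily timer that runs mysqlcheck + mysqldump" idea, here is a rough sketch of what such a script could look like. It is only an illustration, not anyone's actual setup from this thread. Assumptions: the backup directory, the retention count and the file naming are made up, credentials are expected to come from something like `~/.my.cnf` or socket authentication, and `--single-transaction` only gives a consistent snapshot for InnoDB tables.

```python
#!/usr/bin/env python3
"""Sketch: check all tables, then write a gzipped dump of all databases."""
import datetime
import gzip
import pathlib
import shutil
import subprocess

BACKUP_DIR = pathlib.Path("/var/backups/mysql")  # hypothetical location
KEEP = 7  # hypothetical retention: how many dumps to keep


def check_command():
    # --check verifies tables without modifying them.
    return ["mysqlcheck", "--all-databases", "--check"]


def dump_command():
    # --single-transaction: consistent snapshot, but only for InnoDB tables.
    return ["mysqldump", "--all-databases", "--single-transaction",
            "--routines", "--events"]


def dump_path(backup_dir, when):
    # Timestamped names sort chronologically, which makes pruning trivial.
    return pathlib.Path(backup_dir) / f"all-databases-{when:%Y%m%d-%H%M%S}.sql.gz"


def stale_dumps(names, keep):
    """Return the dump file names to delete, keeping the newest `keep`."""
    ordered = sorted(names)
    return ordered[:-keep] if keep > 0 else ordered


def run_backup(backup_dir=BACKUP_DIR, keep=KEEP):
    backup_dir = pathlib.Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # Read the mysqlcheck output as well; do not rely on the exit status alone
    # to tell you about damaged tables.
    subprocess.run(check_command(), check=True)

    target = dump_path(backup_dir, datetime.datetime.now())
    partial = target.with_name(target.name + ".part")
    with subprocess.Popen(dump_command(), stdout=subprocess.PIPE) as proc:
        with gzip.open(partial, "wb") as out:
            shutil.copyfileobj(proc.stdout, out)
    if proc.returncode != 0:
        # Never leave a half-written dump lying around looking like a backup.
        partial.unlink(missing_ok=True)
        raise RuntimeError(f"mysqldump exited with {proc.returncode}")
    partial.rename(target)

    existing = [p.name for p in backup_dir.glob("*.sql.gz")]
    for name in stale_dumps(existing, keep):
        (backup_dir / name).unlink()
    return target
```

Calling `run_backup()` at the bottom of the file and triggering the script from cron or a systemd timer would cover the "daily timer" part. Writing to a `.part` file first means an interrupted run never replaces a good dump with a broken one.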

        • pete@social.cyano.at (OP)
          arrow-up
          3
          ·
          1 year ago

          At least weekly mysqlcheck + mysqldump and some form of periodic off-machine storage of that is something I’ll surely take to heart after this lil’ fiasco ;-) Sound advice, thank you!
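Before a dump gets shipped off-machine, a cheap sanity check can catch the obviously broken ones. The sketch below assumes gzipped dumps and relies on the `-- Dump completed` comment that mysqldump writes as its last line by default (it will not be there if the dump was made with options that suppress comments, such as `--skip-comments` or `--compact`). The function name is made up for illustration. This is much weaker than actually restoring the dump into a scratch server, which remains the only real test.

```python
import gzip
import zlib


def dump_looks_complete(path):
    """Return True if a gzipped dump decompresses cleanly and ends with
    mysqldump's completion marker."""
    last = b""
    try:
        with gzip.open(path, "rb") as fh:
            for line in fh:
                if line.strip():
                    last = line  # remember the last non-empty line
    except (OSError, EOFError, zlib.error):
        # Missing, truncated or corrupt archive.
        return False
    return last.startswith(b"-- Dump completed")
```

A file that fails this check is not worth copying anywhere; a file that passes it is still unproven until it has been restored at least once.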

      • ThorrJo@lemmy.sdf.org
        English
        arrow-up
        3
        ·
        1 year ago

        Personally I’d go for as big a UPS as I could afford, but I serve some public-facing stuff from my homelab and I live in an area with outdated infrastructure and occasional ice storms. I currently have a small UPS and have been too tired/overwhelmed to set up automated shutdown yet. It’s not too hard though, I’ve done it before. And even without that in place, my small UPS has kept things going thru a bunch of <10 minute outages.