Server maintenance on Sunday - All done!

Hello everyone!

I’ll be working on migrating our image uploads from local files over to object storage this Sunday starting at about 11am PT.

This may require extended lemmy.ca downtime as pict-rs has to be stopped while the migration happens, but I’m hoping I can keep things running as read-only and run the migration off a second instance.

Updates will be posted here - https://status.lemmy.ca/maintenance/257501

  • Illecors@lemmy.cafe · 1 year ago

    As someone who’s gone through that recently - brace yourself.

    I believe you can actually avoid downtime altogether if you’re willing to lose post thumbnails that would be generated during the migration. And even those might be possible to regenerate, although I haven’t looked into it.

    Issue 1 - lemmy-ui will crash and burn if it can’t load the site logo. Specifically, the UI will simply say Server error and dev tools will show a 500. Nginx will also only show a 500 with no more detail, and lemmy-ui will spam your logs with something about having received an empty array. Apps will continue to work fine as they use the API rather than the UI. Because I found out about it after having started the migration, I had to resort to setting the site logo to NULL in the database. You might get away with just unsetting the logo in /admin, but prepare the query just in case. On mobile now, can’t give you the query itself.
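
    A minimal sketch of that query, assuming the stock Lemmy schema keeps the logo in the site table’s icon column (verify the table and column names on your Lemmy version before running anything):

    ```sql
    -- Hypothetical sketch: clear the site icon so lemmy-ui stops trying to load it.
    -- Table/column names assume the stock Lemmy schema; double-check before running.
    UPDATE site SET icon = NULL;
    ```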

    Issue 2 - pict-rs is not actually stateless. You MUST save that weird sled database and keep using it after the migration. Otherwise you’ll find the Server error gremlins coming out more and more often.
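
    For what it’s worth, a rough sketch of hanging on to that sled data, assuming the usual docker-compose layout where pict-rs keeps a sled-repo directory inside its mounted volume (the service name and paths here are illustrative only):

    ```bash
    # Illustrative service name and paths - adjust to your actual pict-rs volume layout.
    docker compose stop pictrs    # stop so the sled DB is consistent on disk
    cp -a ./volumes/pictrs/sled-repo ./backups/pictrs-sled-repo
    docker compose start pictrs
    ```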

    Issue 3 - pictrs migration code is a bit on the shit side. It tries to handle missing files (why are they missing in the first place?! I haven’t deleted anything ffs!), but eventually gives up and stops the migration. I had to restart it enough times to lose count. Luckily it does resume, but it keeps retrying all the missing files, never discarding them, and the output becomes unreadable.

    Issue 4 - migration is dogshit slow. Took me nearly 4 hours to migrate ~20GB.

    There might be something else in your case as you’re running a much larger instance, but this is a prime example of just how alpha lemmy+pictrs really is.

    • Shadow@lemmy.ca (OP, mod) · 1 year ago

      lemmy-ui will crash and burn if it can’t load the site logo

      lol yeah I discovered this one a while ago. Thanks for the tip on setting it to null.

      pict-rs is not actually stateless. You MUST save that weird sled database

      Yeah I was assuming so, didn’t think it was stateless.

      pictrs migration code is a bit on the shit side.

      Awesome.

      migration is dogshit slow.

      Even more awesome. I do know there have been performance improvements in recent patches, so if you did your migration a while ago then we should hopefully be faster. We do have 350 GB to migrate though.

      My current thought is:

      1. Block image uploads at the nginx level to make pictrs read-only-ish (rough nginx sketch below this list)
      2. Clone the database for a second pictrs container
      3. Start the migration in that second container
      4. Leave the primary container up and serving images, but without any uploads.
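
      For step 1, something along these lines should do it, assuming uploads reach pict-rs as POSTs to /pictrs/image through the front-end nginx (the location and upstream below are placeholders, match them to the existing config):

      ```nginx
      # Sketch only: refuse uploads while still serving images during the migration.
      location /pictrs/image {
          # Uploads are POSTs; allow only read methods (GET implies HEAD in nginx).
          limit_except GET {
              deny all;
          }
          # Placeholder upstream - keep whatever proxy_pass the existing /pictrs location uses.
          proxy_pass http://pictrs:8080;
      }
      ```
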
      • Illecors@lemmy.cafe · 1 year ago

        I did the migration last week, from DigitalOcean in London to Backblaze in Amsterdam, sticking to “stable” pictrs, so 0.4.2. I think the biggest issue for the migration is that it’s single-threaded. If there’s a way to tell it to do many items at once, that would help a great deal. Might be worth looking into the 0.5.x alpha releases.

        The problem I see with your plan is running pictrs during the migration - you will still get thumbnails generated during that time, and that will lead to a split brain. Not too big of a deal, as you simply won’t have thumbnails for recent posts, but technically speaking you will still have some image data loss. Might as well avoid the overhead of making a copy of the db.

        $0.02

        • Shadow@lemmy.ca (OP, mod) · 1 year ago

          The problem I see with your plan is running pictrs during the migration - you will still get thumbnails generated during that time, and that will lead to a split brain.

          Yep but I was thinking that losing some thumbnails was better than having image delivery completely off for hours…

    • Amends1782@lemmy.ca · 1 year ago

      I got a good laugh at this. What a fucking shit show my god. Absolute insanity lol. Good on you for sharing

    • baconisaveg@lemmy.ca · 1 year ago

      Saying any migration is going smoothly is like saying the Earth is round. It is, if you look at it from really far away :P

  • Shadow@lemmy.ca (OP, mod) · edited · 1 year ago

    Migration is complete!

    Unfortunately it looks like pict-rs won’t generate thumbnails / images for posts that came in while things were moving, but otherwise all content should be there!