Seeking feedback: how should lemm.ee move forward with external images? (related to frequent broken images) - eviltoast

Hey folks!

I am looking for feedback from active lemm.ee users on what you all value when it comes to images on Lemmy. I’ll go into a bit of detail about what our options are, and then I would ask you to voice your opinion about the issue in the comments.

First, some context for those who don’t know. Lemmy software can be configured to handle images in three different ways:

  1. Store images locally - whenever an external image is posted somewhere, lemm.ee will download a permanent local copy. When you view posts, you are seeing our local copy of the image.
  2. Proxy all images - similarly to the first option, lemm.ee will download a local copy of external images, but this copy is temporary. It is automatically deleted shortly afterwards, and if users open the relevant post/comment again in the future, lemm.ee will attempt to download a new temporary copy at that point.
  3. Pass through external images directly - lemm.ee never downloads any external images; users always connect directly to the source servers to load the images.
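To make the difference between the three options concrete, here is a minimal sketch of which URL a user's browser ends up requesting in each mode. It is an illustration only, not Lemmy's actual code: the `ImageMode` names, both function names, and the `/image?url=` path are hypothetical and chosen for this example.

```python
from enum import Enum
from urllib.parse import quote


class ImageMode(Enum):
    STORE_LOCALLY = 1  # option 1: permanent local copy
    PROXY = 2          # option 2: temporary local copy
    PASS_THROUGH = 3   # option 3: no local copy at all


def url_seen_by_browser(mode: ImageMode, external_url: str,
                        instance: str = "lemm.ee") -> str:
    """Return the URL a user's browser would request for an external image."""
    if mode is ImageMode.PASS_THROUGH:
        # The browser connects to the source server directly.
        return external_url
    # Options 1 and 2: the browser only ever talks to the instance.
    # The path and query parameter here are hypothetical.
    return f"https://{instance}/image?url={quote(external_url, safe='')}"


def source_server_sees_user_ip(mode: ImageMode) -> bool:
    """Only pass-through makes the user's browser contact the source server."""
    return mode is ImageMode.PASS_THROUGH


if __name__ == "__main__":
    for mode in ImageMode:
        print(mode.name, url_seen_by_browser(mode, "https://example.com/cat.png"))
```

In this sketch options 1 and 2 look identical from the browser's side; they differ only in how long the instance keeps its copy.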

There are pros and cons to each configuration.

Storing images locally

Benefits:

  1. Your IP address is never leaked to external image hosts, as you never connect directly to the source server. External image hosts only see the IP address of the lemm.ee server.
  2. External servers don’t become bottlenecks for opening lemm.ee posts. If an external server is slow, it won’t matter, because the image is always available locally.

Downsides:

  1. As time goes on, our storage will fill up with hundreds of gigabytes of useless images, most of which will never be viewed again after the relevant posts fall off the front page.
  2. Many big external image hosts will rate limit bigger Lemmy servers, causing broken images when we fail to make a local copy.
  3. Crucially: some people love to spend their time uploading illegal content to online servers. There are tools to try and filter out such content, but these are not perfect. The end result is that there is a high chance of some content like this inadvertently reaching lemm.ee storage and staying there permanently. This downside is why lemm.ee has not, and will not, use this particular configuration.

Proxying images

Benefits: Proxying has the same benefits as permanent local storage. In addition, by only making temporary local copies at the moment our users request them, we free up a ton of storage & remove the risk of permanently storing illegal content on our servers.

Downsides: The key downside is that external rate limits hit us much harder, as we will be requesting external images far more often. This results in a lot of constant broken images on lemm.ee.
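A rough way to see why proxying runs into rate limits more often is to count how many times the instance itself has to request a single image from its source host. The model below is a simplified sketch, not a description of Lemmy's actual caching: the function name, the mode strings, and the 60-minute lifetime of a temporary copy are all assumptions made for illustration.

```python
def upstream_fetches(view_times, mode, ttl=60):
    """Count requests the instance makes to the source host for one image.

    view_times: sorted times (in minutes) at which users view the image.
    mode: "store", "proxy" or "passthrough".
    ttl: minutes a proxied copy is kept (hypothetical value).
    """
    if mode == "passthrough":
        # Browsers fetch directly; the instance never contacts the host.
        return 0
    if mode == "store":
        # One download, kept permanently (simplified: counted on first view).
        return 1 if view_times else 0
    if mode != "proxy":
        raise ValueError(f"unknown mode: {mode}")

    fetches = 0
    expires_at = None
    for t in view_times:
        # Re-download whenever there is no temporary copy left.
        if expires_at is None or t >= expires_at:
            fetches += 1
            expires_at = t + ttl
    return fetches


if __name__ == "__main__":
    views = [0, 10, 90, 100, 500]
    for mode in ("store", "proxy", "passthrough"):
        print(mode, upstream_fetches(views, mode))
```

With these example view times, the stored copy costs one upstream request and the proxied copy costs three, all of them coming from the instance's own IP address. Multiplied across every image on a large instance, that is the kind of traffic an external host may decide to rate limit.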

Passing through external images

Benefits:

  1. Images are rarely broken, unless the source server goes down.
  2. The images never touch our servers, removing a lot of risk with illegal content as well as with storage costs.

Downsides:

  1. Our users lose a degree of privacy. Every external image that is loaded in your browser will result in the remote server getting a request directly from your computer to fetch that image - this is pretty much the same as if you had visited that external server directly, which lets them log your IP address if they wish.
  2. When remote servers are slow, it can slow down the entire page load in some cases.

Current situation

Initially, lemm.ee was using the third option of passing through images. As soon as support for option 2, image proxying, was implemented in Lemmy code, we switched to that option, mainly for the privacy benefits. However, after many months, and after being blocked by more and more external servers, it is clear that image proxying is seriously degrading the user experience on lemm.ee. We often end up with broken images, and our users have to deal with the results.

I still believe image proxying is a really valuable feature, but I am starting to believe it is a better fit for small instances which make far fewer requests to external servers.

As a result, I am now seriously considering switching back to the previous method of passing through external images.

This is where you come in - I would ask you as users to please let me know which you value more: the privacy that you get from image proxying, or the better user experience you get from directly passing through images from their source. Please let me know in the comments how you feel. If I get enough feedback from people who are against image proxying, then I will be switching it off for lemm.ee soon. Thanks for reading & sharing your thoughts, and I hope you have a great weekend!

  • shackled@lemm.ee · ↑ 9 · 16 days ago

    Option 3 is the only one that seems sustainable long term. Donations will NEVER keep up with user growth, thus storage costs will balloon out of control.

    Completely avoiding any chance of illegal content touching the servers should immediately have everyone agreeing on this option. I doubt anyone here is willing to foot legal bills and as such even minor legal actions would be the end of this instance.

    Privacy is nice but IP logging is the simplest thing to “protect” against, even with a free VPN. If those claiming privacy concerns here aren’t already using a VPN and are depending purely on lemm.ee’s proxy then their internet hygiene needs an update.

    As for usability, an image being deleted from the external provider presents the same issue to the user under options 2 and 3. The cache from option 2 will eventually get cleared, and it’ll fail to pull a fresh copy if the image has been deleted from the external host.

  • flashgnash@lemm.ee · ↑ 6 · edited · 16 days ago

    I believe just passing through external images is the way to go; it’s always the one I opt for if I can.

    Hosting those images is gonna get expensive and that kinda sucks for a donation-run platform when that money could be far better spent elsewhere.

    I also think using external image sources is more in line with the idea of decentralisation. Lemmy isn’t an image host, it’s a link aggregator and forum - I believe most image hosting sites will be far better at loading images quickly than Lemmy’s implementation could ever be.

  • uiiiq@lemm.ee · ↑ 4 · 16 days ago

    Although I value privacy, I value sustainability more. Choose the option you feel most comfortable with, the one which puts the least burden on your shoulders.

  • thefartographer@lemm.ee · ↑ 3 · 17 days ago

    I say option 3. Sure, it’s annoying, but that’s our problem. Your problem is keeping the server operational, safe, and low-cost.

    Considering the vote:content ratio, it looks like most users spend most of their time spectating. The spectators feed our egos and determine the trends in content by singular positive or negative votes. The spectator experience seems far more important to me and it should be the onus of the contributors to ensure their own privacy, just like they do in avoiding doxxing themselves via text.

    Option 2 certainly follows in the vein of improved performance, but if real-world implementation is proving too unstable or creating too much overhead, then I say “fuck it, option 3 sounds great.”

  • jaschen@lemm.ee · ↑ 3 · 16 days ago

    I am also in favor of option 3. Hosting images carries a chance of hosting some CSAM, and that might take you out.

  • invertedspear@lemm.ee · ↑ 3 · 17 days ago

    While I appreciate you trying to make it so, my privacy isn’t your responsibility. Option 3 is the way to go to keep your costs down, which is the long-term best solution.

    It wasn’t one of the options, but from a user perspective a hybrid solution is best. Making a local copy that is cached for 24-48 hours keeps your storage down but still gives us the benefit of option 1. But it sounds like this would require some changes to how the server software works, which would be cool, but again you shouldn’t feel compelled to attempt it.

    Keep your costs and stress low wherever possible and thank you for everything you do.

  • Charlatan@lemm.ee · English · ↑ 2 · 15 days ago

    I am very privacy oriented, and I am fine with option 3. Thanks for all you do!

  • /home/pineapplelover@lemm.ee · ↑ 2 · 16 days ago

    I’m a fan of option 3. For my past few image posts I’ve been using other image hosting sites like imgchest because my images won’t upload.

  • Sotuanduso@lemm.ee · English · ↑ 2 · 17 days ago

    I’d say option 3. Personally, I don’t care if random websites get my IP among a list of hundreds of others, and if someone wants to keep their IP hidden from strangers, they should be using a VPN before browsing the net anyways. It’d also be nice not to have to open another instance when I come to a post with a broken image that I want to see, but that’s not hugely important to me.

    If it were an instance specifically for privacy enthusiasts, that’d be a different story, but this is a general-purpose instance, and option 3 seems to be what’s best for both general users and the server itself.

    • Einar@lemm.ee · ↑ 1 · 16 days ago

      I’m trying to avoid saying “this”. Still, your post reflects my thoughts exactly.

      3 it is.

  • unhappy.termite@lemm.ee · English · ↑ 2 · 16 days ago

    Option 3.

    Privacy concerns aside, which I am willing to bear, Option 3 is the most sustainable option.

    Looking at our status page, the projected monthly expenses are greater than the revenues. If passing through external images allows us to reduce operational costs and ensure lemm.ee’s sustainability despite the loss of a degree of privacy, then it’s a tradeoff I’m willing to make.

    Thanks, admins and mods, for everything that you do!

  • CRUMBGRABBER@lemm.ee · English · ↑ 1 · 16 days ago

    I like 3 also. If you can’t get a good user experience, privacy matters less because there are no users, and anyone can spin up a super privacy-enhanced instance if they want.

    I was thinking, however, that a standalone image server would be a great thing for Lemmy, Pixelfed and the Fediverse in general.

    Imgur blew up and was the go-to for image hosting on Reddit for years, until Reddit realized it was leaking traffic and users to them and started their own. But hosting images has a lot of potential headaches like copyright violations and big corps suing you into oblivion, in addition to inadvertently hosting illegal stuff. The Fediverse will need some good image hosting servers and video hosting servers as part of the plan in the long run though.

  • RagingHungryPanda@lemm.ee · ↑ 1 · 17 days ago

    I would do option 3, both as a user and if I were to make my own service. I’d let a dedicated image service do it. I’m not worried about IP leakage and if I am I can use a VPN. If I want to be that concerned about it, I think the responsibility is on me, not on the service unless they promise that level of privacy.

    I vote for what gives the best experience for users and makes running the service easier and more cost-effective.

  • Jabbl@lemm.ee · ↑ 1 · 17 days ago

    I seem to agree with most other users in this thread.

    Sure, the privacy aspect is nice; most people already share too much information with third parties unwillingly, but I think a good user experience should be prioritised. If option 3 is more likely to provide that, I would choose going that way.

    Most people, especially those migrating from other sites, probably care more about the images loading than the privacy involved with proxying them.

    It would maybe be nice to have an option to not load external resources automatically, or a black/whitelist for certain sites, if such an option doesn’t already exist.