[Question] Self-hosted setup for monitoring self-hosted services? - eviltoast

Hi all. I just set up my first self-hosted server with NextCloud, Immich and a VPN server. I was wondering if there is a tool, or a stack of tools, that would help me monitor the server and its services, including running stats, resource usage, system logs, access logs, etc.?

I read that Grafana Loki together with Prometheus could help with this. Should I explore these two tools, or are there others better suited to my needs? Please recommend open-source tools only, preferably Docker-based, or otherwise Linux-based. Thank you :))

  • Maximilious@kbin.social
    7 months ago

    Also -1 for netdata. I loved the analytics, but it brought all of my VMs to a screeching halt. It did not seem very well optimized for the amount of data it was polling.

    • bellsDoSing@lemm.ee
      7 months ago

      I went through setting up netdata for a staging server (on its way to becoming a production server) not too long ago.

      The netdata docs were quite clear on the fact that the default configuration is a “showcase configuration”, not a “production ready configuration”!

      It’s really meant to show off all features to new users, who can then pick what they actually want. The great thing about disabling unimportant collectors is that one gets a lot more “history” for the same amount of storage, because there are simply fewer data points to track. The same goes for lowering the rate at which it takes data points. For instance, going from the default 1s interval to 2s basically halves the CPU requirement, even more so if one also disables the machine learning stuff.
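      To put rough numbers on the storage side of that, here is a back-of-the-envelope sketch in Python. It is not how netdata actually stores data (its database compresses samples); the metric counts, the 1 GiB budget and the `bytes_per_sample` value are made-up assumptions, and `retention_days` is just a hypothetical helper. It only shows that history scales linearly with the interval and inversely with the number of metrics.

```python
def retention_days(storage_bytes, metrics, interval_s, bytes_per_sample):
    """Estimate how many days of history fit into a fixed storage budget.

    All inputs are illustrative assumptions, not measured netdata figures.
    """
    samples_per_day = 86_400 / interval_s           # samples per metric per day
    bytes_per_day = metrics * samples_per_day * bytes_per_sample
    return storage_bytes / bytes_per_day


BUDGET = 1 * 1024**3  # assumed 1 GiB of disk set aside for metrics

# "Showcase" style setup: many metrics, sampled every second (assumed numbers).
baseline = retention_days(BUDGET, metrics=2000, interval_s=1, bytes_per_sample=1.0)

# Tuned setup: half the metrics, sampled every 2 seconds (assumed numbers).
tuned = retention_days(BUDGET, metrics=1000, interval_s=2, bytes_per_sample=1.0)

print(f"baseline: {baseline:.1f} days, tuned: {tuned:.1f} days")
print(f"history gained: {tuned / baseline:.1f}x")
```

      Under these assumptions, halving the tracked metrics and doubling the interval each double the history, so together they give about four times as much for the same disk space.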

      The one thing I have to admit, though, is that “optimizing netdata configs” really isn’t quickly done. It provides a lot of stuff, and there are lots of docs to read before one gets a rough feel for configuring it (i.e. knowing what can be disabled and how much of a difference it actually makes). Of course, there’s always a potential need for further optimization later on, once one sees the actual server load in prod.