this post was submitted on 22 Apr 2024

54 points (100.0% liked)

[Question] Self hosted setup for monitoring Self-hosted services? (lemmy.ml)

submitted 7 months ago by [email protected] to c/[email protected]

Hi all. I just set up my first self-hosted server with Nextcloud, Immich and a VPN server. I was wondering if there is a tool, or a stack of tools, that would help me monitor my server and its services: running stats, resource usage, system logs, access logs, etc.

I read that Grafana Loki along with Prometheus could help with this. Should I explore these two tools, or are there others that would suit my needs better? Please recommend open-source tools only, preferably Docker-based, or otherwise Linux-based. Thank you :))

top 18 comments

[–] [email protected] 16 points 7 months ago (2 children)

https://www.netdata.cloud/

[–] [email protected] 4 points 7 months ago (1 children)

+1 for Netdata

[–] [email protected] 9 points 7 months ago (2 children)

-1 for Netdata. I used it for a bit, but the configuration is not very intuitive and the docs for alerts were basically “rest of the fucking owl”, at least for the non-cloud version. I ended up just switching to Glances, which is pretty bare-bones but easy.

Though for OP I’d probably recommend Prometheus.

[–] [email protected] 7 points 7 months ago (1 children)

Also -1 for netdata. I loved the analytics but it brought all of my VMs to a screeching halt. It did not seem very well optimized for the amount of data it was polling.

[–] [email protected] 2 points 7 months ago

I went through setting up netdata for a staging server (on its way to becoming a production server) not too long ago.

The netdata docs were quite clear on the fact that the default configuration is a "showcase configuration", not a "production ready configuration"!

It's really meant to show off all features to new users, who can then pick what they actually want. The great thing about disabling unimportant things is that one gets a lot more "history" for the same amount of storage, because there are simply fewer data points to track. The same goes for adjusting the rate at which it takes data points. For instance, going from the default 1s interval to 2s basically halves the CPU requirement, even more so if one also disables the machine learning stuff.

The one thing I have to admit though is that "optimizing netdata configs" really isn't that quickly done. There's just a lot of stuff it provides, lots of docs reading to be done until one roughly gets a feel for configuring it (i.e. knowing what all could be disabled and how much of a difference it actually makes). Of course, there's always a potential need for optimizations later on when one sees the actual server load in prod.
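To make the storage side of that trade-off concrete, here is a rough back-of-the-envelope sketch. It only models sample counts against a fixed storage budget; the 1 GB budget, the metric counts and the `bytes_per_sample` figure are made-up illustrative numbers, not Netdata constants, and it says nothing about CPU.

```python
def samples_per_day(metrics: int, interval_s: float) -> int:
    """Number of data points collected per day across `metrics` series."""
    return int(metrics * 86_400 / interval_s)


def retention_days(budget_bytes: int, metrics: int, interval_s: float,
                   bytes_per_sample: float = 1.0) -> float:
    """Days of history that fit in a fixed storage budget.

    bytes_per_sample is a hypothetical figure chosen for illustration only.
    """
    per_day = samples_per_day(metrics, interval_s) * bytes_per_sample
    return budget_bytes / per_day


budget = 1_000_000_000  # hypothetical 1 GB budget
for metrics, interval in [(2000, 1), (2000, 2), (1000, 2)]:
    days = retention_days(budget, metrics, interval)
    print(f"{metrics} metrics @ {interval}s -> {days:.1f} days of history")
```

Under these assumptions, doubling the interval doubles the history that fits, and dropping half of the metrics doubles it again.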

[–] [email protected] 3 points 7 months ago* (last edited 7 months ago)

This isn't specific to just netdata, but I frequently find projects that have some feature provided via their cloud offering and then say "but you can also do it locally" and gesture vaguely at some half-written docs that don't really help.

It makes sense for them, since one of those is how they make money and the other is how they lose cloud customers, but it's still annoying.

Shoutout to healthchecks.io, who seem to provide both nice cloud offerings and a fully-fledged server with good documentation.

[–] [email protected] 3 points 7 months ago

I'm a big fan of netdata; it's part of my standard deployment. I put in some custom configs depending on what services are running on what servers. If there's an issue it sends me an email and posts into a slack channel.

Next step is an influxdb backend to keep more history.

I also use monit to restart certain services in certain situations.

[–] [email protected] 12 points 7 months ago

I like Uptime Kuma, but it only monitors if a service is online or not. I'm up to 21 services now so I'm not interested in all their details, just if I need to fix something urgently.

[–] [email protected] 7 points 7 months ago

Grafana + Prometheus + data gathering will at least give you the resource and usage stats.
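For anyone wondering what the "data gathering" part looks like: Prometheus scrapes plain-text metrics over HTTP from exporters. In practice node_exporter is the usual choice for host stats; the sketch below is only a stdlib toy to show the text exposition format, and the metric names and the `serve` helper are made up for illustration.

```python
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples: dict[str, float]) -> str:
    """Render gauge samples in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect() -> dict[str, float]:
    """Gather a couple of host stats (hypothetical metric names)."""
    usage = shutil.disk_usage("/")
    return {
        "demo_disk_total_bytes": float(usage.total),
        "demo_disk_used_bytes": float(usage.used),
    }


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes /metrics by default.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int) -> None:
    """Block forever, serving metrics on the given port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve(9200)` (an arbitrary port) and adding that address as a scrape target would be enough for Prometheus to pick these values up, and Grafana can then graph them.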

[–] [email protected] 5 points 7 months ago (1 children)

I’m using Zabbix to monitor the most important bits.

[–] [email protected] 1 points 7 months ago

Their Russian affiliation makes me a bit nervous. It seems fine though.

[–] [email protected] 4 points 7 months ago

I'd swap Prometheus for VictoriaMetrics. It's a drop-in replacement with a much better resource consumption story and a few extra goodies.

[–] [email protected] 3 points 7 months ago

Grafana + Prometheus dashboards can be quite addicting or useful. Noted.lol put together a nice tutorial for getting started.

For most of my services though, I simply use Uptime Kuma, which sends an alert to Gotify when a service goes down or whatnot. Gotify then instantly notifies my phone so I'm aware. It helps keep the spouse happy when their go-to service crashes for some reason. :)
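The up/down-plus-push pattern described above is simple enough to sketch by hand. This is not how Uptime Kuma works internally, just a minimal illustration of the idea; the service list and function names are hypothetical, and the Gotify call assumes its `POST /message` endpoint with an application token in the `X-Gotify-Key` header.

```python
import json
import urllib.request


def is_up(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:  # covers URLError, HTTPError, timeouts, refused connections
        return False


def gotify_request(base_url: str, app_token: str, title: str,
                   message: str, priority: int = 5) -> urllib.request.Request:
    """Build (but do not send) a Gotify push notification request."""
    payload = json.dumps(
        {"title": title, "message": message, "priority": priority}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/message",
        data=payload,
        headers={"Content-Type": "application/json", "X-Gotify-Key": app_token},
        method="POST",
    )


def check_all(services: dict[str, str], notify, opener=urllib.request.urlopen) -> list[str]:
    """Check every service and call notify(name) for each one that is down."""
    down = [name for name, url in services.items() if not is_up(url, opener=opener)]
    for name in down:
        notify(name)
    return down
```

A `notify` callback could be something like `lambda name: urllib.request.urlopen(gotify_request(base, token, f"{name} is down", "Health check failed"))`, run from cron every minute or so.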

[–] [email protected] 2 points 7 months ago (1 children)

How much of your setup do you want to be point-and-click?

I ask because I use Nagios.

[–] [email protected] 1 points 7 months ago (1 children)

I'm a web-app developer myself, so I don't mind configuring things if needed. I'll happily go the configuration route if it meets my goals better. I'll check out Nagios. :))

[–] [email protected] 2 points 7 months ago

If you're willing to go that route, check out Zabbix and Icinga2 as well. They're compatible with Nagios checks but the user interface is better.

[–] [email protected] 2 points 7 months ago* (last edited 7 months ago) (1 children)

I've not found a good solution for actual constant monitoring and I'll be following this thread, but I have a similar/related item: I use healthchecks.io (specifically a self-hosted instance) to verify that all my cron jobs (backups, syncs, ...) are working correctly. Even more involved monitoring solutions often do not cover that area (and it can be quite terrible if it goes wrong), so I think it'll be a good addition to most of these.

[–] [email protected] 3 points 7 months ago

I have a similar solution, but I use https://ntfy.sh/. I have the app on my phone and have it set to alert when jobs ping the service. Mine ping on success, but it is possible to ping when the job fails as well.
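A minimal sketch of that ping-on-success-or-failure idea, assuming ntfy's plain HTTP publish API (POST the message body to the topic URL, with optional `Title`, `Priority` and `Tags` headers). The topic name, job name and helper functions are hypothetical.

```python
import subprocess
import urllib.request


def ntfy_request(topic_url: str, ok: bool, job: str) -> urllib.request.Request:
    """Build (but do not send) an ntfy notification for a finished job."""
    headers = {
        "Title": f"{job} {'succeeded' if ok else 'FAILED'}",
        "Priority": "default" if ok else "high",
        "Tags": "white_check_mark" if ok else "warning",
    }
    body = f"Job {job!r} finished {'successfully' if ok else 'with an error'}."
    return urllib.request.Request(
        topic_url, data=body.encode("utf-8"), headers=headers, method="POST"
    )


def run_and_report(job: str, command: list[str], topic_url: str,
                   send=urllib.request.urlopen) -> bool:
    """Run a command, then publish its outcome to the ntfy topic."""
    ok = subprocess.run(command).returncode == 0
    send(ntfy_request(topic_url, ok, job))
    return ok
```

From a cron wrapper script this could be called as `run_and_report("backup", ["/usr/local/bin/backup.sh"], "https://ntfy.sh/my-private-topic")`, where both the script path and the topic are placeholders.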