How do you all monitor your server performance?

Michaelscarn69-@alien.top · 1 year ago

How do you all monitor your server performance?

Theon@alien.top · 1 year ago

Netdata. I’ve been meaning to look into Grafana, but it always seemed way too overcomplicated and heavy for my purposes. Maybe one day, though…

weller_rocks@alien.top · 1 year ago

I thought the same thing, but it’s actually not bad. There are some prebuilt dashboards you can import for common metrics from Linux, Windows, firewalls, etc.

Netdata is much better though (IMHO).

Mother_Construction2@alien.top · 1 year ago

I know it needs a fix when my dad complains that he can’t watch TV and the rolling door doesn’t open in the morning.

Dizzybro@alien.top · 1 year ago

The fastest way? Probably Netdata.

SadanielsVD@alien.top · 1 year ago

This. If you have more servers, you can also connect them all to a single UI with Netdata Cloud, where you can see all the info at once.

Spaceman_Splff@alien.top · 1 year ago

Just set this up yesterday. I used a parent node and then have all my VMs point to that. Took like an hour to figure it out.

scotrod@alien.top · 1 year ago

Hey, did you use the cloud functionality or not? I’m trying to go all local with a parent-child setup, but so far I haven’t been able to.

Spaceman_Splff@alien.top · 1 year ago

The parent is still visible to the cloud portal. My understanding is that the data all resides locally, but when you log in to their cloud portal, it connects to the parent to display the information. I’m still playing with it to confirm. My parent node shows all the child nodes on the local interface, but the cloud still shows them all.

Spaceman_Splff@alien.top · 1 year ago

I don’t know if I’ll keep running this. The child nodes are already complaining about increased write delays since I installed the agents on them.

Michaelscarn69-@alien.top · 1 year ago

I’ll look into this too. Thank you.

weller_rocks@alien.top · 1 year ago

Agreed … BY FAR the fastest. Easiest learning curve as well.

HCharlesB@alien.top · 1 year ago

Checkmk (Raw, the free version). Some setup aspects are a bit annoying (it wants to monitor every last ZFS dataset, and it takes too long to ‘ignore’ them one by one). It does alert me to things that could cause issues, like the boot partition being almost full. I run it in a Docker container on my (primarily) file server.
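
For anyone who only wants the "partition almost full" kind of warning without a full monitoring stack, here is a minimal sketch of that single check in Python. It is not how Checkmk does it; the function names and the 90% threshold are made up for illustration.

```python
import shutil


def usage_percent(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def check_paths(paths, warn_at=90.0):
    """Return (path, percent) pairs for filesystems at or above `warn_at`.

    `warn_at` is a hypothetical threshold; pick whatever suits the host.
    """
    warnings = []
    for path in paths:
        pct = usage_percent(path)
        if pct >= warn_at:
            warnings.append((path, round(pct, 1)))
    return warnings


if __name__ == "__main__":
    # "/" is only an example; on a real host you might list "/boot" as well.
    for path, pct in check_paths(["/"]):
        print(f"WARNING: {path} is {pct}% full")
```

Run it from cron or a systemd timer and send the output wherever you already get notifications.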

TheDeepTech@alien.top · 1 year ago

I use this as well! Works well and has built-in intelligence for thresholds.

AstrologicalMob@alien.top · 1 year ago

I currently use the classic “Huh, seems slow, checks basic things like disk usage and process CPU/RAM usage, I’ll do a reboot to fix it for now”.

dibu28@alien.top · 1 year ago

Windows Server? )

Nagashitw@alien.top · 1 year ago

This is me. Can’t hurt to just do a reboot.

ElevenNotes@alien.top · 1 year ago

Netdata, monitoring a few thousand servers (virtual) that way.

LNDN91@alien.top · 1 year ago

Rainmeter if it’s directly on their desktop/background.

how_now_brown_cow@alien.top · 1 year ago

TICK stack is the only answer.

opensrcdev@alien.top · 1 year ago

InfluxDB as the metrics server and the Telegraf agent to collect metrics.

Pesfreak92@alien.top · 1 year ago

Uptime Kuma and Grafana. Uptime Kuma to monitor whether a service is up and running, and Grafana to monitor host metrics like CPU, RAM, and SSD usage.

Reasonable-Ladder300@alien.top · 1 year ago

Same here. I also have some autoscaling mechanisms set up in Docker Swarm to scale certain services when the load is high.

Michaelscarn69-@alien.top · 1 year ago

Thank you for this. I appreciate the support.

bobbarker4444@alien.top · 1 year ago

I just check the Proxmox dashboard every now and then. Honestly, if everything is working I’m not too worried about exact RAM levels at any given moment.

BloodyIron@alien.top · 1 year ago

LibreNMS is the tool I use, and it connects to systems primarily via SNMP (use v3, do not use v1 or v2c).

gold76@alien.top · 1 year ago

InfluxDB/Telegraf/Grafana stack. I have all three on one server, and I put just Telegraf on the others to send data into InfluxDB. Works great for monitoring things like usage. You can also bring in sysstat.

I have some custom apps as well where, each time they run, I record the execution time and peak memory in a database. This lets me go back over time and see where something improved or got worse. I can get a timestamp and go look at Gitea commits to see what I was messing with.
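
A rough sketch of that per-run recording idea in Python, assuming SQLite as the database and `tracemalloc` for peak memory. The table layout and the names `open_db` and `record_run` are made up for illustration and are not the setup described above. Note that `tracemalloc` only sees allocations made through Python, not the whole process footprint.

```python
import sqlite3
import time
import tracemalloc
from contextlib import contextmanager


def open_db(path=":memory:"):
    """Open (or create) the run log; the schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "app TEXT, started_at REAL, seconds REAL, peak_bytes INTEGER)"
    )
    return conn


@contextmanager
def record_run(conn, app):
    """Time the wrapped block and store its duration and peak Python memory."""
    started_at = time.time()  # wall-clock timestamp to match against commits
    tracemalloc.start()
    t0 = time.perf_counter()
    try:
        yield
    finally:
        seconds = time.perf_counter() - t0
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        conn.execute(
            "INSERT INTO runs VALUES (?, ?, ?, ?)",
            (app, started_at, seconds, peak),
        )
        conn.commit()


if __name__ == "__main__":
    conn = open_db()  # pass a file path instead to keep history between runs
    with record_run(conn, "demo"):
        squares = [i * i for i in range(100_000)]
    for row in conn.execute("SELECT app, seconds, peak_bytes FROM runs"):
        print(row)
```

The `started_at` column is what lets you line up a slow run with whatever was committed around that time.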

Charming-Molasses-22@alien.top · 1 year ago

I don’t check it all the time like a maniac, but I have a Glances Docker container running on my main server.

opensrcdev@alien.top · 1 year ago

Glances is really nice. I’ve been using btop more recently though.

Majestic-Contract-42@alien.top · 1 year ago

If one of my users ever complained about anything, I would possibly look into it. Otherwise it all works, so I don’t waste life energy on that.