My systems:
- Sunshine - file server and spillover host. Uses ZFS, provides GlusterFS and iSCSI
- Raspi - Raspberry Pi 4 - network DHCP, DNS and control unit, spillover host
- Blizzard - node, second GlusterFS server
- Toolbox / GamerStreamer - hybrid Windows host with a Linux node VM. Windows is used for running Windows-only things and streaming games via Steam
- Cromie - Chromebox converted to a node
- AcerBox - an Acer box running as a node
They're all running in a Kubernetes cluster. Nodes are the primary deployment targets; Sunshine and Raspi are set as not preferred, but will be deployed to if no other node has the resources. Storage is done on GlusterFS. Services are provided to the network via MetalLB, and SSL cert handling is done via certbot. Ansible is used to set up and configure the cluster, making it pretty easy to add a new node.
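To illustrate the "not preferred, but used if nothing else fits" behaviour, here is a toy model in Python. It is only a sketch of the idea, not how the Kubernetes scheduler is actually implemented; in a real cluster this kind of soft preference is typically expressed with something like a `PreferNoSchedule` taint or a preferred node affinity, and which one is used here isn't stated. The `Host` class, `pick_host` function and the CPU numbers are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_cpu: float   # free CPU cores (hypothetical numbers)
    preferred: bool   # False for spillover hosts such as Sunshine and Raspi


def pick_host(hosts, cpu_needed):
    """Return the name of the host to deploy to, or None if nothing fits."""
    fitting = [h for h in hosts if h.free_cpu >= cpu_needed]
    if not fitting:
        return None
    # Preferred hosts win over spillover hosts; ties go to the most free CPU.
    best = max(fitting, key=lambda h: (h.preferred, h.free_cpu))
    return best.name


# Hypothetical snapshot of the cluster.
hosts = [
    Host("sunshine", 6.0, preferred=False),
    Host("raspi", 2.0, preferred=False),
    Host("blizzard", 1.0, preferred=True),
    Host("cromie", 0.5, preferred=True),
]

small = pick_host(hosts, 0.5)   # fits on a regular node
large = pick_host(hosts, 4.0)   # only a spillover host has room
```

With these numbers the small workload lands on a regular node, while the large one spills over to Sunshine because no regular node has enough free CPU.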
In practice this means any one host can go down without services going down. It takes about 10-15 minutes for Kubernetes to decide a node is actually down (and not just rebooting or something) and reschedule its services, but it's more or less self-healing, and usually already fixed before I notice there's been a problem.
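A rough way to think about where that delay comes from: the cluster first has to notice the node is unresponsive, then waits out a toleration period before evicting the pods, and then the replacement pods still have to start somewhere else. The sketch below just adds those up. The 40 s and 300 s values are commonly cited Kubernetes defaults, but they vary by version and configuration, so treat every number here as an assumption rather than this cluster's real settings; `startup_s` is a pure guess, and `failover_delay_s` is a made-up helper name.

```python
def failover_delay_s(detect_s=40, toleration_s=300, startup_s=120):
    """Estimate seconds from a node dying to its services running elsewhere.

    detect_s:     time until the node is marked not ready (assumed value)
    toleration_s: how long pods tolerate the not-ready node before eviction
                  (assumed value)
    startup_s:    time for replacement pods to be scheduled, attach storage
                  and start (a guess; depends heavily on the workload)
    """
    return detect_s + toleration_s + startup_s


estimate_min = failover_delay_s() / 60
slow_start_min = failover_delay_s(startup_s=480) / 60
```

With the assumed defaults this comes out below the observed 10-15 minutes, so slow pod startup or different timeout settings would have to account for the rest.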
As for services: some game servers, Jellyfin, specialized stream servers for a project, Nextcloud, a Postgres cluster, Node-RED, Grafana, InfluxDB, Gotify, ProGet, a web server, and about 5-10 smaller personal projects.
I've noticed that sometimes it takes a long time to show up in results, and sometimes it doesn't show up at all but works 10-15 minutes later.
I think the servers are overloaded atm.