I currently have a Docker setup that I’m really happy with consisting of a NUC running minimal Ubuntu server. I only run 5 containers but kinda need them to be pretty reliable (i.e. my whole home becomes pretty annoying to use if HA has downtime):

  • HomeAssistant
  • ESPHome
  • MQTT
  • Scrypted
  • Static nginx instance

My desire for reliability is at odds with my stronger desire to avoid spending time on maintenance. I work in front of computers, so the last thing I want to do is fix my own IT woes! Therefore, to avoid having to perform manual updates etc., I have a small cron task that, weekly:

  • Does a full unattended apt upgrade
  • Runs “docker compose pull” and “docker compose up -d” for all containers.
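
For reference, the weekly job above can be sketched roughly as follows. This is a minimal sketch, not the actual cron task: the `/opt/stacks` path, the one-directory-per-compose-project layout, the function names, and the choice of `apt-get -y upgrade` for the "full" upgrade are all assumptions.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical location: one sub-directory per compose project.
COMPOSE_ROOT = Path("/opt/stacks")

Command = Tuple[List[str], Optional[Path]]


def build_commands(project_dirs: Sequence[Path]) -> List[Command]:
    """Return (argv, working directory) pairs in the order they should run."""
    commands: List[Command] = [
        (["apt-get", "update"], None),
        (["apt-get", "-y", "upgrade"], None),
    ]
    for project in project_dirs:
        commands.append((["docker", "compose", "pull"], project))
        commands.append((["docker", "compose", "up", "-d"], project))
    return commands


def run_all(commands: Sequence[Command], runner: Callable = subprocess.run) -> None:
    """Run each command in order; check=True aborts at the first failure."""
    for argv, cwd in commands:
        runner(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Dry run only: print what would be executed instead of executing it.
    projects = sorted(p.parent for p in COMPOSE_ROOT.glob("*/docker-compose.yml"))
    for argv, cwd in build_commands(projects):
        print(cwd or "(host)", " ".join(argv))
```

Run directly, the script only prints the planned commands. Passing the same list to `run_all` would actually execute them (as root, for the apt steps), stopping at the first command that fails.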

This is all done via a YOLO SLA approach, with no continual backups and no rollback possibilities 🤦‍♂️

This is the bit that scares me: everything has been (surprisingly) fine for around 18 months, but I am fully aware that one bad update could really ruin my day, especially with no downgrade path.

I was wondering if anyone could recommend a more appliance-based system that I could use to monitor, upgrade and manage both the host OS and the containers. My googling isn’t turning up much, unfortunately. Ideally I’d like features such as:

  • Docker compose support
  • Automated backups (preferably with S3 support)
  • Unattended container upgrades
  • Container health monitoring
  • Rollback support if an upgrade goes bad
  • A nice web UI

I don’t care if this is software/hardware or free/paid (within reason); I just want something really simple that is designed for reliability and uptime.

Many thanks

  • ervwalter@alien.top
    10 months ago

    I accomplish more or less what you are looking for, but it’s not an appliance; it’s a system of tools that I set up myself and maintain (which I enjoy). It sounds like you want to avoid doing that.

    My solution includes:

    • A small Proxmox cluster so that if any single host dies, VMs can move to another host. This cluster approach is only necessary to protect against hardware failure–if that isn’t something you care to protect against initially, you can do all of the rest with only a single Proxmox host
    • On that Proxmox cluster, a few VMs. I run these as VMs because that makes it super easy to snapshot each VM before making experimental changes (i.e. trivial rollback) and super easy to back up each VM to my NAS (again, easy rollback for unplanned problems that get introduced without me noticing right away)
      • A VM for Home Assistant. I prefer Home Assistant OS to running it in a docker container myself. It is easier for me to manage this way.
      • A VM for just Scrypted. Scrypted is easier to deploy if you can put it in host network mode, which could in theory interfere with other Docker containers, so I keep it on an isolated VM. Extra VMs are easy with Proxmox, so there is little downside.
      • A VM running docker where most everything else runs. Docker containers are managed via portainer using docker-compose files
        • Docker compose files (called “stacks” in Portainer) live in a private GitHub repo, and when I make changes in GitHub, Portainer pulls them down and updates the running containers with the new compose file.
        • I run the Renovate bot on my GitHub repo, which notices when my containers are out of date and creates a Pull Request with a recommended upgrade. I can either manually approve those or create rules to auto-merge them.
        • Because all the docker-compose files are in a git repo, rolling back after a problematic upgrade is usually trivial (unless the data got converted as part of an upgrade which might require restoring a VM backup in the worst case)
      • A VM running Ubuntu that I use for development (connected to remotely with Visual Studio Code). This is also the Linux VM that I use to launch Ansible playbooks to remotely do things like apt upgrades on the other VMs (HA excluded).
    • One of the containers I run is uptime-kuma which monitors general health of all my other services and notifies me via telegram and email if a VM or container dies or starts to look unhealthy.
    • Another container I run is homepage which is a dashboard that lets me get to all my services and also has widgets to surface more health information.
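
    To give a feel for the kind of check a monitor like uptime-kuma performs, here is a minimal sketch of an HTTP health probe. It is not uptime-kuma's code or API; the service names, URLs, and function names are hypothetical placeholders.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Mapping

# Hypothetical service names and URLs; substitute your own.
SERVICES = {
    "home-assistant": "http://homeassistant.local:8123/",
    "nginx": "http://nuc.local/",
}


def is_healthy(
    url: str,
    timeout: float = 5.0,
    opener: Callable = urllib.request.urlopen,
) -> bool:
    """True if the URL answers with a 2xx or 3xx status within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts and HTTP errors.
        return False


def check_all(
    services: Mapping[str, str],
    check: Callable[[str], bool] = is_healthy,
) -> Dict[str, bool]:
    """Map each service name to whether its URL currently looks healthy."""
    return {name: check(url) for name, url in services.items()}
```

    Calling `check_all(SERVICES)` from a scheduled job returns a name-to-boolean mapping that could feed a notification step. A real monitor adds retries, history, and alert routing on top of this.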

    This is not at all turnkey and took some time to put together, but I find it to be relatively low ongoing maintenance now that it is set up. And I have pretty good high availability and great rollback/recovery support in the event that something goes sideways with an upgrade or some configuration change I make manually.
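
    The git-based rollback works because each upgrade is just a one-line change to a pinned image tag in a compose file, so reverting the commit restores the old version. A minimal sketch of that edit follows. It is not how Renovate is implemented; the function name, the `eclipse-mosquitto` image, and the tags are illustrative assumptions, and quoted `image:` values are not handled.

```python
import re


def set_image_tag(compose_text: str, image: str, new_tag: str) -> str:
    """Rewrite the tag on every unquoted `image: <image>:<tag>` line."""
    pattern = re.compile(
        r"^(\s*image:\s*" + re.escape(image) + r"):[^\s#]+",
        flags=re.MULTILINE,
    )
    # A function replacement avoids treating new_tag as a regex template.
    return pattern.sub(lambda m: f"{m.group(1)}:{new_tag}", compose_text)


# Illustrative compose fragment; the image and tags are assumptions.
EXAMPLE = """services:
  mqtt:
    image: eclipse-mosquitto:2.0.18
"""

if __name__ == "__main__":
    # Rolling back is the same edit in the other direction.
    print(set_image_tag(EXAMPLE, "eclipse-mosquitto", "2.0.17"))
```

    An upgrade and a rollback are the same operation with different tags, which is why keeping tags pinned in the repo (rather than using `latest`) makes a bad upgrade easy to undo, as long as the container's data was not migrated in the meantime.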