How do you back up important things you store in self-hosted clouds?

I’m currently thinking about hosting Ente myself for syncing all my pictures, and maybe also spinning up Nextcloud for various other shared files. However, for me one main benefit of services like iCloud is the reduced risk of losing everything if the hardware fails (or to fire, theft, water damage, …).

Do you keep regular updates on hosted services? Do you keep really important stuff on other providers? Do you have other failsafes?

  • CoyoteFacts@lemmy.ca
    5 months ago

    I use a 48TB ZFS RAIDZ2 pool to maintain data integrity locally and keep rolling ZFS snapshots with sanoid so that data recovery to any point within the last month is possible. Then I use borgmatic (borg) to sync the important data (~1TB) to a Servarica VPS (Polar Bear plan, which works out to be cheaper than Backblaze B2 costs for my purposes). The Servarica server really sucks in terms of CPU, and it’s quite sluggish, but it’s enough for daily backups. I also self-host healthchecks.io on a free Fly.io VPS thing (not sure if they offer this anymore) to make sure the backups are actually happening successfully, and hosting that on a third-party VPS means that it’s not going to fail at the same time my server does. Then I use Uptime Kuma to make sure everything is consistently alive (especially the healthchecks.io server, which in turn verifies that Uptime Kuma stays alive). I also run the same borg configuration to back up to a plain non-redundant disk locally.
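
    The "check in after every backup" pattern described above can be sketched roughly as follows. This is only an illustration, not the actual setup: `PING_URL`, `ping_target`, and `run_and_report` are hypothetical names, and the sketch assumes the healthchecks.io convention that requesting a check's ping URL signals success while appending `/fail` signals failure.

```python
import subprocess
import urllib.request

# Hypothetical placeholder; a real check has its own unique ping URL.
PING_URL = "https://hc.example.invalid/ping/your-check-uuid"


def ping_target(base_url: str, returncode: int) -> str:
    """Pick the URL to request: the bare URL means success, /fail means failure."""
    base = base_url.rstrip("/")
    return base if returncode == 0 else base + "/fail"


def run_and_report(command: list[str], base_url: str = PING_URL) -> int:
    """Run a backup command, then report its outcome to the monitoring server."""
    result = subprocess.run(command)
    try:
        urllib.request.urlopen(ping_target(base_url, result.returncode), timeout=10)
    except OSError:
        # A monitoring outage should not hide the backup's own exit status.
        pass
    return result.returncode
```

    Something like `run_and_report(["borgmatic"])` from a daily scheduled job would fit this pattern. If the job never runs at all, no ping arrives, and the monitoring side can alert on the missed check-in.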

    The downside of this setup is that I’m only truly backing up a fraction of my pool, but most of my pool is stuff that I can redownload and set up again in the event of e.g. a house fire. I also run a daily script to dump a lot of metadata about my systems and pool, like directory listings of my media folders and installed programs/etc, which means that even if the data might be lost, I have a map of what I need to grab again.
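
    A metadata dump of this kind could look something like the sketch below. It is an assumption about the general shape, not the actual script: `dump_listing` is a hypothetical helper that records every file path and size under a folder, so the listing can be backed up even when the files themselves are not.

```python
import os
from pathlib import Path


def dump_listing(root: str, out_file: str) -> int:
    """Write "size<TAB>relative/path" for every file under root; return the file count."""
    root_path = Path(root)
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root_path):
        for name in filenames:
            full = Path(dirpath) / name
            try:
                size = full.stat().st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            entries.append((full.relative_to(root_path).as_posix(), size))
    entries.sort()  # stable ordering makes day-to-day diffs readable
    with open(out_file, "w", encoding="utf-8") as fh:
        for rel, size in entries:
            fh.write(f"{size}\t{rel}\n")
    return len(entries)
```

    The output file is tiny compared to the media it describes, so it can ride along with the important data in the off-site backup.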

      • CoyoteFacts@lemmy.ca
        5 months ago

        Snapshots basically put a pin in all the data at that moment and say “these blocks are not going to be physically deleted as long as I exist”, so the “additional” data use of the snapshots is equal to the data contained within the snapshot that doesn’t exist at the current moment. I.e., if I have two 50GB files, take a snapshot, and delete one, I will still have 100GB physical disk usage. I can also take 400 more snapshots and disk usage will remain at 100GB, as the snapshots are just virtual. Then I can either bring that deleted file back from the snapshot, or I can delete the snapshot itself and my disk usage will adjust to the “true” 50GB as the snapshot releases its hold on the blocks.
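
        The accounting in that example can be checked with a toy model. This is not how ZFS is implemented; `Pool` is a hypothetical class that treats each file as one indivisible chunk of data and assumes files are never modified in place, which is enough to reproduce the 100GB / 100GB / 50GB numbers.

```python
class Pool:
    """Toy model: a snapshot pins whatever files were live when it was taken."""

    def __init__(self) -> None:
        self.live: dict[str, int] = {}  # file name -> size in GB
        self.snapshots: dict[str, dict[str, int]] = {}

    def write(self, name: str, size_gb: int) -> None:
        self.live[name] = size_gb

    def delete(self, name: str) -> None:
        del self.live[name]

    def snapshot(self, label: str) -> None:
        self.snapshots[label] = dict(self.live)

    def destroy_snapshot(self, label: str) -> None:
        del self.snapshots[label]

    def physical_usage(self) -> int:
        """Data stays on disk while the live filesystem or any snapshot references it."""
        held = dict(self.live)
        for pinned in self.snapshots.values():
            held.update(pinned)
        return sum(held.values())


def demo() -> list[int]:
    pool = Pool()
    pool.write("a", 50)
    pool.write("b", 50)
    pool.snapshot("before-delete")
    pool.delete("b")
    usage = [pool.physical_usage()]       # "b" is still pinned by the snapshot
    for i in range(400):
        pool.snapshot(f"extra-{i}")       # these pin nothing new
    usage.append(pool.physical_usage())
    pool.destroy_snapshot("before-delete")
    usage.append(pool.physical_usage())   # "b" is finally released
    return usage
```

        Calling `demo()` returns `[100, 100, 50]`, matching the walkthrough: the extra snapshots cost nothing because they reference data that is already on disk.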

        What sanoid and other snapshot managers do is they repeatedly take snapshots at specified intervals and delete old snapshots past a certain date, which in practice results in a “rolling” system where you can access snapshots from e.g. every hour in the past month. Then once a snapshot becomes older than a month, sanoid will auto-delete it and free up the disk space that it’s holding onto. The exact settings are all configurable to what you’re comfortable with in terms of trading additional physical disk usage of potential “dead” data for the convenience of being able to resurrect that data for a certain amount of time.
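
        The "rolling" behaviour boils down to an age cutoff, sketched below. This is a simplified illustration and not sanoid's actual algorithm or configuration format; `split_by_age` is a hypothetical helper that only shows the idea of keeping snapshots newer than a retention window and expiring the rest.

```python
from datetime import datetime, timedelta


def split_by_age(
    snapshots: dict[str, datetime], now: datetime, keep_for: timedelta
) -> tuple[list[str], list[str]]:
    """Return (kept, expired) snapshot names, each ordered oldest first."""
    cutoff = now - keep_for
    ordered = sorted(snapshots, key=lambda name: snapshots[name])
    kept = [name for name in ordered if snapshots[name] >= cutoff]
    expired = [name for name in ordered if snapshots[name] < cutoff]
    return kept, expired
```

        A snapshot manager would then destroy everything in `expired`, which is the point where any data held only by those snapshots is actually freed.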

        I really like the “data comet” visual from this Ars Technica article.