What storage software could I run to have an archive of my personal files (a couple TB of photos) that doesn’t require I keep a full local copy of all the data? I like the idea of a simple and focused tool like Syncthing, but it seems to be geared towards replication.

Is the simple choice to run some S3-like backend and use a CLI or other client to append and browse files? I’d love something with fault tolerance that I can gradually add disks to. If Ceph were either less complicated or used fewer resources I’d want to do that.

  • lemmyvore@feddit.nl
    9 months ago

    Borg Backup. It can work locally or over network. Takes snapshots of the files you give it. Performs deduplication, compression and optionally encryption. You can check the integrity of the backups and repair them. There’s a very simple to use GUI for it called Pika Backup to get you started.
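
    To give a feel for what deduplication buys you, here is a toy sketch in Python. It is not Borg’s actual algorithm or storage format: the fixed chunk size and the `ChunkStore` class are assumptions made up for illustration. It only shows the general idea that identical chunks are stored once and snapshots become lists of chunk IDs.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self, chunk_size: int = 4) -> None:
        # A tiny chunk size keeps the demo readable; real tools use far larger chunks.
        self.chunk_size = chunk_size
        self.chunks: dict[str, bytes] = {}

    def put(self, data: bytes) -> list[str]:
        """Split data into fixed-size chunks and store each unique chunk once."""
        ids = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            chunk_id = hashlib.sha256(chunk).hexdigest()
            # setdefault leaves an existing chunk in place, so duplicates cost nothing.
            self.chunks.setdefault(chunk_id, chunk)
            ids.append(chunk_id)
        return ids

    def get(self, ids: list[str]) -> bytes:
        """Rebuild the original data from its list of chunk IDs."""
        return b"".join(self.chunks[chunk_id] for chunk_id in ids)


store = ChunkStore()
snapshot_1 = store.put(b"AAAAAAAABBBB")
snapshot_2 = store.put(b"AAAAAAAACCCC")  # mostly the same data as snapshot_1
print(len(store.chunks))  # number of unique chunks actually stored
```

    Two snapshots of six chunks in total end up needing far fewer stored chunks, which is why repeated backups of a mostly unchanged photo collection stay small.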

  • deegeese@sopuli.xyz
    9 months ago

    Are we talking personal offsite backup, or a commercial cloud service?

    For cloud backups I like Backblaze, but I’ve never tried to use it as a general cloud storage drive.

    • jkrtn@lemmy.mlOP
      9 months ago

      This would be self-hosted and local, one of the locations in a 3-2-1 strategy. Backblaze would work for an offsite copy, but I already have that portion covered.

      • deegeese@sopuli.xyz
        9 months ago

        “that doesn’t require I keep a full local copy of all the data”

        So you want a local self-hosted backup, but also not a full copy? So, like, back up only recently changed files?

        • jkrtn@lemmy.mlOP
          9 months ago

          I want like one local device to have a full copy, but the devices writing new data into that one do not need a full copy.

              • teawrecks@sopuli.xyz
                9 months ago

                I’ve been using TrueNAS with a nightly sync to Backblaze for years and I like it.

                It used to be called FreeNAS and used FreeBSD. Now the BSD version is called TrueNAS Core, and a new Linux-based version is called TrueNAS Scale.

                I would go with TrueNAS Scale if I were starting a new one today. You probably won’t use the “jail” functionality immediately, but jails are super handy, and down the line if you start playing with them, you’ll run into fewer compatibility issues running Linux vs BSD.

              • deegeese@sopuli.xyz
                9 months ago

                It’s basically a RAID + File shares like SMB.

                Loads of DIY options, but I use a Synology so I don’t need to mess with anything.

          • ironsoap@lemmy.one
            9 months ago

            In technical terms, you mean doing an incremental or differential backup to a local network storage location, correct?

            • jkrtn@lemmy.mlOP
              9 months ago

              “Incremental” sounds right. I want it to act like rsync without deleting files on the destination, so all the folders are merged. (It would be cool if it kept versions but I don’t absolutely need that.) Tools like Borg or Restic look great, but I have been searching to see if they support this kind of usage and they seem not to.
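
              For what it’s worth, rsync only deletes on the destination when it is given `--delete`, so a plain run already merges folders. The same additive behavior can be sketched in Python. This is a minimal sketch, not a replacement for rsync: the function name `merge_into_archive` and the size-plus-mtime comparison are assumptions chosen for illustration, and it keeps no versions (a changed file overwrites the archived one).

```python
import shutil
from pathlib import Path


def merge_into_archive(source: Path, archive: Path) -> list[Path]:
    """Copy new or changed files from source into archive; never delete anything.

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        dest = archive / rel
        if dest.exists():
            src_stat, dest_stat = src_file.stat(), dest.stat()
            # Assumed "unchanged" test: same size and the archive copy is not older.
            same_size = src_stat.st_size == dest_stat.st_size
            not_older = int(src_stat.st_mtime) <= int(dest_stat.st_mtime)
            if same_size and not_older:
                continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dest)  # copy2 also preserves the modification time
        copied.append(rel)
    return copied
```

              Files that exist only in the archive are left alone, so several devices can each push their own folders into the same destination without holding a full copy themselves.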

  • solrize@lemmy.world
    9 months ago

    I use Borg Backup to a Hetzner storage box but doing the same thing to a disk array would work fine. How much data are you talking about? What is the usage picture? Backup and archiving are really not the same thing.

    • jkrtn@lemmy.mlOP
      9 months ago

      I was looking at Borg but that’s one of the tools where it seems like I need the entire replicated copy of the dataset locally to add more. I believe Borg can open a view into previous versions of the data, so it’s technically append only, but I’d find that process tedious.

      These are a couple TB and mostly photos I’ve taken. I’d like to be able to browse and edit at some point, but my primary concern right now is keeping a copy of everything.

      • solrize@lemmy.world
        9 months ago

        Yeah, that’s more of an archive than a backup scenario. I have a small self-hosted Nextcloud that I use for stuff like that. For a few TB, you might consider Hetzner Storage Share, which is really Nextcloud. It is backed up daily, which is a help.

        • jkrtn@lemmy.mlOP
          9 months ago

          How was it setting up and running Nextcloud? I’m very curious about their office software, looks fun.

          • solrize@lemmy.world
            9 months ago

            As I remember, setting it up was kind of a pain, but once running it hasn’t needed attention. By now there might be an apt package or Docker container or something of that sort. I haven’t used their fancy apps much; my main use of it is to upload photos from my phone so I can access them from other devices.

  • hperrin@lemmy.world
    9 months ago

    All of my machines back up to my home server’s RAID over WebDAV with Nephele.

    Then every few days I’ll manually sync them to a server at my parents’ house with a single huge HDD using rsync. I do this manually so that if anything happens to my home server (like ransomware) it doesn’t mirror destroyed data.

    Since the Nephele share is just WebDAV, I can mount it locally and move things into it that I don’t want local anymore.

    I created Nephele, and I just finished writing an encryption plugin. I wrote it because I’m also going to write an S3 adapter. That way, you can store things in S3, but they’ll be encrypted, so Amazon can’t see them.

    • jkrtn@lemmy.mlOP
      9 months ago

      This is really cool. I ended up trying something similar: serving from a ZFS pool with SeaweedFS. TBD if that’s going to work for me long term.

      I would definitely be able to manually sync the SeaweedFS files with rsync to another location but from what I see it requires me to use their software to make sense of any structure. I might be able to mount it and sync that way, hopefully performance for that is not too bad.

      Syncing like that and having more control over where the files are placed on the RAID is very cool.

      • hperrin@lemmy.world
        9 months ago

        I’m assuming I would notice, because none of my services on the machine would work anymore.

      • jkrtn@lemmy.mlOP
        9 months ago

        It protects against the case where it happens and they have not noticed within those few days. That is probably especially important if they leave the system running while on vacation.

  • atzanteol@sh.itjust.works
    9 months ago

    Sounds like something like “git annex” is what you’re looking for?

    I use this to manage all my photos. It lets you add binaries and synchronize them to a backend server (can be local, can be S3, Backblaze, etc).

    You can then “drop” files and it ensures a remote copy exists first. When you drop a file you still see a symlink to it locally (it’s broken), so you know it exists.

    My workflow is to add my files, sync them to both a local server and B2, then drop and fetch folders as I need (need disk space? “git annex drop 2022*”; want to edit some photos? “git annex get 2022_10_01”).
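
    That workflow could be scripted roughly as below. This is only a sketch: the remote names `homeserver` and `b2`, the commit message, and the helper function names are hypothetical, and it assumes the repository and its special remotes have already been set up with git-annex. By default it just prints the commands instead of running them.

```python
import subprocess
from typing import Iterable, Sequence


def archive_commands(paths: Sequence[str], remotes: Iterable[str]) -> list[list[str]]:
    """Commands to add files to the annex and copy their content to each remote."""
    commands = [
        ["git", "annex", "add", *paths],
        ["git", "commit", "-m", "Add files to annex"],  # commit message is arbitrary
    ]
    for remote in remotes:
        commands.append(["git", "annex", "copy", f"--to={remote}", *paths])
    return commands


def drop_command(path: str) -> list[str]:
    """Free local disk space; git-annex checks that copies exist elsewhere first."""
    return ["git", "annex", "drop", path]


def get_command(path: str) -> list[str]:
    """Fetch the content back from a remote, e.g. before editing photos."""
    return ["git", "annex", "get", path]


def run_all(commands: Iterable[Sequence[str]], repo_dir: str, dry_run: bool = True) -> None:
    """Print each command, or execute it inside repo_dir when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(list(command), cwd=repo_dir, check=True)


# Hypothetical folder and remote names; nothing is executed in dry-run mode.
run_all(archive_commands(["2022_10_01"], ["homeserver", "b2"]), repo_dir=".")
run_all([drop_command("2022_10_01"), get_command("2022_10_01")], repo_dir=".")
```

    Keeping the commands as plain argument lists makes it easy to review what would run before switching `dry_run` off.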

  • JakenVeina@lemm.ee
    9 months ago

    rsync, for sure. That’s what I used when I had to migrate a 10TB datastore to a new machine.

    • jkrtn@lemmy.mlOP
      9 months ago

      That’s top of my list for moving the files if I do an S3 or WebDAV backend. I’m overthinking this, aren’t I? Just find a WebDAV server, set it up, use rclone to append files and pretty much everything else will be able to browse.

      • YurkshireLad@lemmy.ca
        9 months ago

        Haha it’s easy to overthink things sometimes. I’m guilty of that. I’m using SFTPGo at home to serve files from a small server.

  • computergeek125@lemmy.world
    9 months ago

    What platform?

    Another user said it: what you’re asking for isn’t a backup, it’s just data transfer.

    It sounds like you’re looking for a storage backend that hosts all your data and can download data to the client side on the fly.

    If your use case is Windows, Nextcloud Desktop may be what you’re looking for. I have a similar setup with my game clips folder. It detects changes and auto-uploads them, while deleting less recently used local data that’s already properly stored server side. This feature might be on Mac too, but I haven’t tested it.

    Backup wise, I capture an rsync of the nextcloud database and filesystem server-side and store it on a different chassis. That then gets backed up again to a USB drive I can grab and run.

    Nextcloud also supports external storage, which the server directly connects to: https://docs.nextcloud.com/server/latest/admin_manual/configuration_files/external_storage_configuration_gui.html

  • francisfordpoopola@lemmy.world
    9 months ago

    Where will the target be? Online or local? Rsync is really easy to use and the target files are browse-able. I could be too dense but I find online buckets aren’t easily browse-able. Even a homemade NAS might be a good choice and it’s easily scalable.