I have quite an extensive collection of media that my server makes available through different means (Jellyfin, NFS, mostly). One of my hard drives has some concerning SMART values, so I want to replace it. What are good hard drives to buy today? Are there any important tech specs to look out for? In the past I didn't give this much attention and it hasn't bitten me yet. But if I'm gonna buy a new drive now, I might as well…

I'm looking for something from 4TB upwards. I think I remember that drives with very high capacity are more likely to fail sooner - is that correct? How about different brands - do any have a particularly good or bad reputation?

Thanks for any hints!

  • Avid Amoeba@lemmy.ca · 13 days ago

    Buy recertified enterprise-grade disks from https://serverpartdeals.com. Prices were around $160/16TB the last time I checked. Mix brands and models to reduce the chance of simultaneous failures. Use more than one disk of redundancy. If you can't buy from SPD, either find an alternative or buy external drives and shuck them. Use ZFS so you know whether your data is correct. I've been dealing with flaky AMD USB controllers recently, and the amount of silent data corruption I'd have gotten without ZFS is ridiculous.
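
    For example (the pool name "tank" is just a placeholder), a regular scrub plus a status check is what actually surfaces that corruption:

        # read and verify every block in the pool, then show per-device checksum error counts
        zpool scrub tank
        zpool status -v tank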

    • pedroapero@lemmy.ml · 12 days ago

      I use BTRFS for the same reason. Being able to check for and repair silent corruption is a must (and without needing to read the whole drives, only the actual files). I've had a lot of it over the years, including (but not only) because of a cheap USB controller.
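
      For anyone curious, the check/repair here is a scrub too (the mount point is just an example):

          # verify checksums of all data and metadata, repairing from a good copy where possible
          btrfs scrub start /mnt/media
          btrfs scrub status /mnt/media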

    • Pacmanlives@lemmy.world · 12 days ago

      Holy cow, these are way cheaper than anything I have seen before. I am on a RAID 5 setup, so if a disk dies I am okay.

      • Avid Amoeba@lemmy.ca · 12 days ago

        If you can, move to a RAID-equivalent setup with ZFS (preferred, in my opinion) so you also know about and can fix silent data corruption. RAIDz1 and RAIDz2 are the equivalents of RAID5 and RAID6. That should eliminate one more variable with cheap drives.
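
        As a rough sketch (pool and disk names are placeholders), the RAID6 equivalent would be created like this; use raidz1 instead for the RAID5 equivalent:

            # two disks of parity across four drives
            zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd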

        • Pacmanlives@lemmy.world · 12 days ago

          ZFS is a no-go for me due to not being able to add a larger disk and then expand my pool size on the fly. mdadm and LVM+XFS have treated me well the past few years. I started with a 12 TB pool and am now over 50 TB.
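
          For reference, the on-the-fly growth I mean looks roughly like this with that stack (device and volume names are made up):

              mdadm --add /dev/md0 /dev/sde              # add the new disk to the array
              mdadm --grow /dev/md0 --raid-devices=5     # reshape the array onto it
              pvresize /dev/md0                          # let LVM see the new space
              lvextend -l +100%FREE /dev/vg0/data        # grow the logical volume
              xfs_growfs /mnt/data                       # grow XFS while mounted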

          • Avid Amoeba@lemmy.ca · 12 days ago

            Not that I want to push ZFS or anything - mdraid/LVM/XFS is a fine setup - but for informational purposes: ZFS can absolutely expand onto larger disks. I wasn't aware of this until recently. If all the disks of an existing pool get replaced with larger disks, the pool can expand onto the newly available space. E.g. a RAIDz1 with 4x 4T disks has 12T of usable space. Replace all disks with 8T disks (one after another, so that it can be done on the fly) and your pool will have 24T of space. Replace those with 16T and you get 48T, and so on. In addition, you can expand a pool by adding another redundant topology, just like you can with LVM and mdraid. E.g. 4x 4T RAIDz1 + 3x 8T RAIDz2 + 2x 16T mirror for a total of 36T usable. Finally, expanding an existing RAIDz vdev with additional disks has recently landed too.
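
            The replace-with-bigger-disks dance is roughly this (names are placeholders); once the last disk is swapped, the extra space becomes available:

                zpool set autoexpand=on tank
                zpool replace tank /dev/sda /dev/sdx   # wait for the resilver to finish, then repeat for each remaining disk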

            And now for pushing ZFS - I was doing file-based replication on a large dataset for many years. Just walking the hundreds of thousands of dirs and files took over an hour on my setup. That's then followed by a diff transfer. Think rsync or Syncthing. That's how I did it on my old mdraid/LVM/Ext4 setup, and that's how I continued doing it on my newer ZFS setup. Recently I tried using ZFS send/receive, which operates within the filesystem. It completely eliminated the dataset walk and stat phase, since the filesystem already knows all of the metadata. The replication was reduced to just the diff transfer time. What used to take over an hour got reduced to seconds or minutes, depending on the size of the changed data. I can now do multiple replications per hour without significant load on the system. Previously it was only feasible overnight because the system would be robbed of IOPS for over an hour.
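
            Roughly (dataset, snapshot, and host names are made up), an incremental send looks like this:

                # snapshot, then send only the blocks that changed since the previous snapshot
                zfs snapshot tank/media@today
                zfs send -i tank/media@yesterday tank/media@today | ssh backuphost zfs receive -F backup/media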

            • Pacmanlives@lemmy.world · 11 days ago

              I wonder if that's a new feature. IIRC the issue was with expanding existing vdevs in a ZFS pool. I am a FreeBSD user and do have some jails running. I do like ZFS a lot; it's way more mature than BTRFS on Linux.

    • anamethatisnt@lemmy.world · 10 days ago

      Interesting that Toshiba/Seagate have the best 16TB stats and WDC bad ones in comparison, but for 14TB it's reversed. My homelab disks apparently have a 0.71% risk of dying after 22 months (Seagate Exos X16 ST16000NM001G).
      edit: WDC does well in 16TB too; their only outlier there could be due to the low number of disks in the drive count. And the same is true when checking the total number of disks for 14TB.

      • roofuskit@lemmy.world · 10 days ago

        Those 14TB WD drives are workhorses. I run refurbished ones in my home server and have never had any issues. And they are significantly faster than the rest of my spinning rust drives.

  • Ugurcan@lemmy.world · 12 days ago

    One thing no one will tell you is HOW LOUD some HDDs can get under load. You may not want any of those disks around if you're keeping your server in your living space.

    Just check the dB values in the spec sheets.

    • Ryan@discuss.tchncs.de (OP) · 12 days ago

      That's a good hint, although I wouldn't mind too much, personally. My server is located in the basement.

    • yonder@sh.itjust.works · 12 days ago

      Depending on the use, you may be able to spin them down when not in use, but that's not always possible for some applications.
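
      If spinning down is an option, something like hdparm can do it (the timeout value and device are just examples):

          hdparm -S 120 /dev/sdb   # spin down after 10 minutes of idle (the value is in units of 5 seconds)
          hdparm -y /dev/sdb       # or put the drive into standby immediately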

  • Appoxo@lemmy.dbzer0.com · 12 days ago

    The last ones I bought are the Toshiba N300 15tb helium drives.
    I haven't written much to them yet, but they were cheap and seem quiet enough to have around in my room (where I also sleep).

      • Appoxo@lemmy.dbzer0.com · 12 days ago

        I have, and while they sure are loud, dampening the NAS with foam tape (I had some double-sided adhesive tape left over from buying LED strips) quietened it enough to be manageable.

  • Nibodhika@lemmy.world · 12 days ago

    One important thing: make sure the drive is CMR. The reason is that you likely want RAID, and non-CMR (SMR) disks are so slow during a rebuild that the chance of a second failure while recovering from the first one becomes significant.

    That being said, how are you keeping track of the disks' state? I built my RAID recently, and your post made me realize that I have nothing to notify me if one of the disks shows early signs of problems.

    • DeathByDenim@lemmy.world · 12 days ago

      I just use the built-in email function that comes with mdadm. If a drive fails, I'll know right away and can replace it with a spare. You do need your server to be able to send emails, with something like Postfix.
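
      Roughly, it's one line in mdadm.conf plus the monitor daemon (the address is an example):

          # /etc/mdadm/mdadm.conf
          MAILADDR you@example.com
          # most distros run the monitor via a systemd service; started manually it would be:
          mdadm --monitor --scan --daemonise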

      If you have hardware RAID, there’s often a monitoring tool that comes with it or at the very least a command-line utility that can report the RAID state which you can then use in a script.

    • Ryan@discuss.tchncs.de (OP) · 12 days ago

      I don't keep track actively. I noticed problems when reading a file and only then looked at the drive with smartctl. Does anybody know how to keep track actively?
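
      (For reference, what I've been running by hand is just smartctl; from the same package, smartd looks like the continuous-tracking piece - a sketch, with device and address made up:)

          smartctl -a /dev/sdb               # one-off health/attribute check
          # /etc/smartd.conf
          DEVICESCAN -a -m you@example.com   # watch all drives and mail on problems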

  • geography082@lemm.ee · 13 days ago

    I have an external USB HDD, a WD Passport 3TB from 10 years ago (still healthy), connected to a Chinese N100 mini PC. I have Proxmox on it, 5 LXC containers, and 30 Docker containers running apps, Plex, and Calibre-Web.

  • 𝘋𝘪𝘳𝘬@lemmy.ml · 13 days ago

    “I’m looking for something from 4TB upwards.”

    If you say “harddrive” … do you mean actual hard drives, or are you using it synonymously with “storage”? If you really mean actual hard drives, it's hard to even find datacenter/server hard drives below 4 TB. Usually server HDDs start at 8 or 12 TB. You can even find HDDs with 20 TB - the Seagate Exos series, for example, starting at around 360 Euros (ca. 400 USD).

    If you're after general storage, preferably SSD, that's another issue. There is the Samsung 870 QVO SSD, which is often advertised as a “datacenter SSD” (so I assume it would run well in a server that is active 24/7), but it is currently available with a maximum of 8 TB. The 870 QVO is at ca. 70 Euros per terabyte (ca. 77 USD), which, in my experience, is the current price range for SSDs. So it looks expensive from the outside, but it's actually fine. It's also a one-time investment.

    For selfhosting I’d go with an SSD-only setup.

    “do any have a particularly good or bad reputation?”

    From personal experience I’d say, stick with the “larger” brands like Samsung or Seagate.

    • BearOfaTime@lemm.ee · 13 days ago

      SSD only?

      Look at Mr. Moneybags over here. That would increase my cost by about 400%.

      And no, I wouldn’t recoup that in energy cost reductions, as my oldest NAS with ancient drives only draws a few watts 97% of the time.

      • 𝘋𝘪𝘳𝘬@lemmy.ml · 13 days ago

        Sorry, I can’t hear you under my enormous piles of money! 🙃

        But yeah. You should do an SSD-only setup if it's within your budget. I assume that for most of us, selfhosting is just some sort of hobby. If you're willing to spend money on the latest and coolest tech: do it. If not, then that's fine, too.

      • 𝘋𝘪𝘳𝘬@lemmy.ml · 13 days ago

        Okay, so … then maybe really look into the Seagate Exos drives. 20 TB should be pretty much fine for most selfhosting adventures.

        • e0qdk@reddthat.com · 13 days ago

          I have a few of those, and while the ones I bought have worked out fine so far, I think it’s worth cautioning people that they are annoyingly loud doing basic operations.

          • Ryan@discuss.tchncs.de (OP) · 13 days ago

            That wouldn't be a problem for me, as my server is located in the basement. But good to know!

          • 𝘋𝘪𝘳𝘬@lemmy.ml · 13 days ago

            Absolutely. They're advertised for use in datacenters, so I assume noise optimization wasn't a concern for Seagate when designing those drives.

    • 486@lemmy.world · 12 days ago

      I would advise against using SSDs for storage of media and such. Not only because of their higher price, but also because flash memory cells tend to fade, causing read speeds to decrease considerably over time. This is particularly the case for mostly read-only workloads. With each read operation, the flash memory cell being read loses a bit of its charge. Eventually the margin for the controller to read the data becomes so small that it takes the controller many read attempts to figure out the correct data. In the worst case this can lead to the SSD controller being unable to read some data altogether.