How should I properly document my homelab?

enchantedgoldapple@sopuli.xyz · edit-2 4 months ago

How should I properly document my homelab?

comrade_twisty@feddit.org · edit-2 4 months ago

Everyone will have their own system.

I save all my credentials in Bitwarden/Vaultwarden and take notes in Joplin.

The good thing about YOUR homelab is that YOU’RE taking notes solely for YOURSELF and only YOU know how YOU work and how YOU organize YOUR thoughts.

irmadlad@lemmy.world · 4 months ago

I save all my credentials in Bitwarden/Vaultwarden

Yeah, I don’t put key phrases, passwords, etc. in my notes.

MonkeMischief@lemmy.today · 3 months ago

The good thing about YOUR homelab is that YOU’RE taking notes solely for YOURSELF and only YOU know how YOU work and how YOU organize YOUR thoughts.

Normally I’d agree, in that it’s not some corporate production environment, but also I personally want to document my self hosted setup in a kind of document that can at least be accessed and understood by my closest family, if something were to happen to me.

Convincing them to archive stuff on my Nextcloud instance for example, and them losing access because I’m not around, temporarily or permanently, would spoil the whole point of the endeavor.

comrade_twisty@feddit.org · edit-2 3 months ago

I’m a realist in that regard: once I am gone, all my hard work will go to shit in 6-12 months no matter how much I train and instruct my friends and family. They just don’t care enough to put in even a little bit of work.

PullPantsUnsworn@lemmy.ml · 4 months ago

Ansible and Nix. Code is the document.

white_nrdy@programming.dev · 4 months ago

I’ve been in the process of migrating everything over to Nix. Love it so much.

What hole does Ansible fill for you? I haven’t really looked into it in the past, so just curious. I have a single Proxmox node so don’t really need horizontal scaling orchestration.

PullPantsUnsworn@lemmy.ml · 3 months ago

I don’t use NixOS for my home server mainly because of lack of MAC (SELinux or AppArmor). I use Ansible to configure AlmaLinux from package installation to firewall to systemd services.

I use NixOS for desktop and development machines.

imnotroot@lemmy.ml · edit-2 3 months ago

How do you upgrade your AlmaLinux from version X to Y? Do you install a new instance or do you upgrade it? I’m asking because I remember that Red Hat’s recommendation is to reinstall.

mathuin@lemmy.world · 4 months ago

I agree with the advice that says “Document your setup such that you could recreate it from your notes from scratch” but I’d take it another step further — consider that someone may have to do some work on your system when you are unable or unavailable. The kind of thing you’d keep with your will, or power of attorney. Just a suggestion.

irmadlad@lemmy.world · 4 months ago

…and to my family I bequeath my entire collection of Linux ISOs

mathuin@lemmy.world · 4 months ago

You jest but if I left my wife my Home Assistant setup undocumented she would pee on my grave.

irmadlad@lemmy.world · 4 months ago

LOL, well I’m single tho I’ve known my ladyfriend for over 40 years. I offered to set up a server at her house, and connect the two, but she has no interest in rifling through my lab for anything of interest in the case of my passing.

mathuin@lemmy.world · 4 months ago

I’m happily married with a kid, and we recently went through the estate planning process. When I brought up IP stuff and digital properties, their advice was pretty much “Hmm… you should pick someone who understands what you’re talking about, get their approval in advance, and then add them as your legacy contacts and document the heck out of everything”. Realistically nobody is going to want my GitHub stuff or anything like that, but I would like my kid to have access to most* of my files after I pass. I am of course excluding the kind of content that “real friends” delete while your body is still warm.

irmadlad@lemmy.world · 4 months ago

It’d be nice to donate all my equipment to some kid who is very interested. That would be something I’d be interested in.

mathuin@lemmy.world · 4 months ago

My documented plan includes that kind of donation for my amateur radio equipment, but I’m going to let my survivors handle the home lab.

osaerisxero@kbin.melroy.org · 4 months ago

I believe it is traditional to do so written in blood in the style of an apocalypse log, dealer’s choice for whose blood. Make sure it’s disjointed and nearly incomprehensible, but that everything is there.

Bonus points if you print the config files and write your documentation on them after stapling them to the walls

irmadlad@lemmy.world · edit-2 4 months ago

Document everything as if it were a step by step tutorial you will give to someone so that they can duplicate your deployment without any prior knowledge. I’ll even include urls to sites I consulted with to achieve production deployment.

ETA: I absolutely care nothing about points. Up voting and down voting used to be a way to weed out bad info. So it always leaves me wondering ‘Did I give erroneous advice? What was the reason for the down vote?’ I mean, if you down voted and said ‘I down voted you because I hate your guts’, I can deal with that.

fruitycoder@sh.itjust.works · 4 months ago

This is what I like about GitOps and infra/config as code personally.

Ideally everything is in a tofu/ansible/helm chart and GitLab pipeline/Fleet job. I add comments to those files for anything that I had to learn to make work. I follow good commit hygiene (most of the time). And bam, almost a year later I can stumble back, half asleep, into a thing I did.

howrar@lemmy.ca · 4 months ago

Do you use this for physical machines too?

fruitycoder@sh.itjust.works · 4 months ago

Yep! Metal3 for servers with BMCs, Tinkerbell for everything else.

I also have an ansible playbook that templates everything into cloud-init scripts as a bootstrap server.

About 12 nodes in total now, from new servers to freebie junk laptops.

hoppolito@mander.xyz · 3 months ago

Interesting, so Metal3 is basically kubernetes-managed baremetal nodes?

Over the last years I’ve cobbled together a nice Ansible-driven IaC setup, which provisions Incus and Docker on various machines. It’s always the ‘first mile’ that gets me struggling with completely reproducible bare-metal machines. How do I first provision them without too much manual interference?

Ansible gets me there partly, but I would still like to have e.g. the root file system running on btrfs which I’ve found hard to accomplish with just these tools when first provisioning a new machine.

fruitycoder@sh.itjust.works · 3 months ago

Yep! It uses OpenStack’s Ironic under the hood, but tracks config and stack via k8s.

For OS building I’ve been moving to Elemental, which builds OS images from container images and cloud-init scripts into SUSE Micro immutable OSs (which use btrfs for the snapshot management under the hood for updates).

wersooth@lemmy.world · edit-2 4 months ago

I have a repo for the infra files (compose files and terraform files just for playing). I store the docs in the same repo in MD files. As for the secrets, I’m using docker swarm, so I can store the needed passwords there. Otherwise Vaultwarden is my go-to, <ad> self hosted, lightweight password manager, compatible with bitwarden clients </ad>. I’m a little paranoid that if the note service got DB corruption I might lose too much info, so git is the way (personal opinion).

edit: add the related MD file next to the compose file, one folder per service, so the source and the doc are coupled in one place.
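A rough sketch of how that layout could be checked automatically. The file names (`docker-compose.yml` and friends, `README.md`) and the function name are assumptions for illustration, not something taken from the repo described above. It lists service folders that contain a compose file but no doc next to it:

```python
from pathlib import Path

# Assumed file names for illustration; adjust to your own repo layout.
COMPOSE_NAMES = ("docker-compose.yml", "docker-compose.yaml", "compose.yml", "compose.yaml")
DOC_NAME = "README.md"


def undocumented_services(repo_root):
    """Return names of service folders that hold a compose file but no doc file."""
    missing = []
    for folder in sorted(Path(repo_root).iterdir()):
        if not folder.is_dir():
            continue
        has_compose = any((folder / name).is_file() for name in COMPOSE_NAMES)
        if has_compose and not (folder / DOC_NAME).is_file():
            missing.append(folder.name)
    return missing


if __name__ == "__main__":
    # Run from the repo root; only reads the directory, changes nothing.
    for name in undocumented_services("."):
        print(f"{name}: compose file without {DOC_NAME}")
```

Something like this could run as a pre-commit hook or a pipeline step, so a new service folder without its doc gets noticed early.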

henfredemars@infosec.pub · 4 months ago

I have a simple pile of Markdown files that I edit with Obsidian. I like the simple text file format because it keeps my documentation forwards-compatible. I use OpenWRT at the heart of my network, so I keep it right there in root’s home. Every long while I back it up to my general Documents which is then synced between my high-storage devices with SyncThing.

enchantedgoldapple@sopuli.xyz · 4 months ago

Thanks for your response. I already have Joplin synced with my server as a solution for my documentation. However I meant to ask how you structure your documentation, know what and how to mention, and organise it for future reference.

pepperprepper@lemmy.world · edit-2 4 months ago

I created something similar to this. It got a lot of love during interviews later down the line. https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fi.redd.it%2Fvmd34mabi4r91.jpg&f=1&ipt=2dde77fd04d48156bc514ad4b1f090c8473f4e666ead0e16906eeed55a79aca6

irmadlad@lemmy.world · 4 months ago

Dude that is a respectable lab you have there! Much envy

enchantedgoldapple@sopuli.xyz · 3 months ago

That is a behemoth of a homelab you have set up there. My jaw would’ve dropped out if it could.

confusedpuppy@lemmy.dbzer0.com · 4 months ago

I have two systems that sort of work together.

The first system involves a bunch of text files for each task. OS installation, basic post OS installation tasks and a file for each program I add (like UFW, apparmor, ddclient, docker and so on). They basically look like scripts with comments. If I want to I can just copy/paste everything into a terminal and reach a specific state that I want to be at.

The second system is a sort of “skeleton” file tree that only contains all the files that I have added or modified.

Here's an example of what my server skeleton file tree looks like

.
├── etc
│   ├── crontabs
│   │   └── root
│   ├── ddclient
│   │   └── ddclient.conf
│   ├── doas.d
│   │   └── doas.conf
│   ├── fail2ban
│   │   ├── filter.d
│   │   │   └── alpine-sshd-key.conf
│   │   └── jail.d
│   │       └── alpine-ssh.conf
│   ├── modprobe.d
│   │   ├── backlist-extra.conf
│   │   └── disable-filesystems.conf
│   ├── network
│   │   └── interfaces
│   ├── periodic
│   │   └── 1min
│   │       └── dynamic-motd
│   ├── profile.d
│   │   └── profile.sh
│   ├── ssh
│   │   └── sshd_config
│   ├── wpa_supplicant
│   │   └── wpa_supplicant.conf
│   ├── fstab
│   ├── nanorc
│   ├── profile
│   └── sysctl.conf
├── home
│   └── pi-user
│       ├── .config
│       │   └── ash
│       │       ├── ashrc
│       │       └── profile
│       ├── .ssh
│       │   └── authorized_keys
│       ├── .sync
│       │   ├── file-system-backup
│       │   │   ├── .sync-server-fs_01_root
│       │   │   └── .sync-server-fs_02_boot
│       │   └── .sync-caddy_certs_backup
│       ├── .nanorc
│       └── .tmux.conf
├── root
│   ├── .config
│   │   └── mc
│   │       └── ini
│   ├── .local
│   │   └── share
│   │       └── mc
│   │           └── history -> /dev/null
│   ├── .ssh
│   │   └── authorized_keys
│   ├── scripts
│   │   ├── automated-backup
│   │   └── maintenance
│   ├── .ash_history -> /dev/null
│   └── .nanorc
├── srv
│   ├── caddy
│   │   ├── Caddyfile
│   │   ├── Dockerfile
│   │   └── docker-compose.yml
│   └── kiwix
│       └── docker-compose.yml
└── usr
    └── sbin
        ├── containers-down
        ├── containers-up
        ├── emountman
        ├── fs-backup-quick
        └── rtransfer

This is useful to me because I can keep track of every change I make. I even have it set up so I can use rsync to quickly chuck all the files into place after a fresh install or after adding/modifying files.

I also created and maintain a “quick install” guide so I can install a fresh OS, rsync all the modified files from my skeleton file tree into place, then run through all the commands in my quick install guide to get myself back to the same state in a minimal amount of time.
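For anyone who wants to see the skeleton-tree idea in code, here is a minimal sketch. It is a plain-Python stand-in and not the rsync setup described above; the function names and the dry-run default are assumptions. It maps every regular file in the skeleton to the same relative path under a target root and copies it there (symlinks such as `history -> /dev/null` are skipped):

```python
import shutil
from pathlib import Path


def plan_deploy(skeleton, target_root):
    """Map every regular file under the skeleton tree to its path under target_root."""
    skeleton = Path(skeleton)
    target_root = Path(target_root)
    pairs = []
    for src in sorted(skeleton.rglob("*")):
        # Skip directories and symlinks; only plain files are copied.
        if src.is_file() and not src.is_symlink():
            pairs.append((src, target_root / src.relative_to(skeleton)))
    return pairs


def deploy(skeleton, target_root, dry_run=True):
    """Copy skeleton files into place; with dry_run=True only report what would be copied."""
    for src, dst in plan_deploy(skeleton, target_root):
        print(f"{'would copy' if dry_run else 'copy'} {src} -> {dst}")
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 keeps timestamps and permission bits
```

Calling `deploy("skeleton", "/", dry_run=True)` first shows what would land where before anything is touched. Unlike rsync, this sketch does not preserve ownership or recreate symlinks.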

cecilkorik@lemmy.ca · 4 months ago

You’re on the right track. Like everything else in self-hosting you will learn and develop new strategies and scale things up to an appropriate level as you go and as your homelab grows. I think the key is to start with something immediately achievable, and iterate fast, aiming for continuous improvement.

My first idea was much like yours, very traditional documentation, with words, in a document. I quickly found the same thing you did, it’s half-baked and insufficient. There’s simply no way to make it match the actual state of the system perfectly and it is simply inadequate to use English alone to explain what I did because that ends up being too vague to be useful in a technical sense.

My next realization was that in most cases what I really wanted was to be able to know every single command I had ever run, basically without exception. So I started documenting that instead of focusing on the wording and the explanations. Then I started to feel like I wasn’t capturing every command reliably because I would get distracted trying to figure out a problem and forget to, and it was duplication of effort to copy and paste commands from the console to the document or vice versa. That turned into the idea of collecting bunches of commands together into a script, that I could potentially just run, which would at least reduce the risk of gaps and missing steps. Then I could put the commands I wanted to run right into the script, run the script, and then save it for posterity, knowing I’d accurately captured both the commands I ran and the changes I made to get it working by keeping it in version control.

But upon attempting to do so, I found that just a bunch of long lists of commands on their own isn’t terribly useful so I started to group all the lists up, attempting to find commonalities by things like server or service, and then starting to organize them better into scripts for different roles and intents that I could apply to any server or service, and over time this started to develop into quite a library of scripts. As I was doing this organizing I realized that as long as I made sure the script was functionally idempotent (doesn’t change behaviors or duplicate work when run repeatedly, it’s an important concept) I can guarantee that all my commands are properly documented and also that they have all been run – and if they haven’t, or I’m not sure, I can just run the script again as it’s supposed to always be safe to re-run no matter what state the system is in. So I started moving more and more to this strategy, until I realized that if I just organized this well enough, and made the scripts run automatically when they are changed or updated, I could not only improve my guarantees of having all these commands reliably run, but also quickly run them on many different servers and services all at once without even having to think about it.
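To make "idempotent" concrete, here is a minimal sketch in Python rather than shell. The helper name `ensure_line` is made up for illustration and is not from any tool mentioned in this thread. It appends a line to a config file only if that line is not already present, so running it once or ten times leaves the file in the same state:

```python
from pathlib import Path


def ensure_line(path, line):
    """Append `line` to the file unless it is already there. Returns True if the file changed."""
    path = Path(path)
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False  # already in the desired state, nothing to do
    if text and not text.endswith("\n"):
        text += "\n"  # make sure the new line starts on its own line
    path.write_text(text + line + "\n")
    return True
```

A naive `echo "..." >> file` would add a duplicate on every run; checking the current state first is what makes the step safe to repeat. The return value also tells you whether anything actually changed, which is handy for logging.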

There are some downsides of course, this leaves the potential of bugs in the scripts that make it not idempotent or not safe to re-run, and the only thing I can do is try to make sure they don’t happen, and if they do, identify and fix these bugs when they happen. The next step is probably to have some kind of testing process and environment (preferably automated) but now I’m really getting into the weeds. But at least I don’t really have any concerns that my system is undocumented anymore. I can quickly reference almost anything it’s doing or how it’s set up. That said, one other risk is that the system of scripts and automation becomes so complex that they start being too complex to quickly untangle, and at that point I’ll need better documentation for them. And ultimately you get into a circle of how do you validate the things your scripts are doing are actually working and doing what you expect them to do and that nothing is being missed, and usually you run back into the same ideas that doomed your documentation from the start, consistency and accuracy.
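One cheap form that testing process could take, sketched under assumptions (the `is_idempotent` helper and the demo file are hypothetical): apply a step twice and check that the second run changes nothing.

```python
import tempfile
from pathlib import Path


def is_idempotent(action, snapshot):
    """Run `action` twice and report whether the second run left the state unchanged.

    `action` is a zero-argument callable that applies a change;
    `snapshot` is a zero-argument callable returning a comparable view of the state.
    """
    action()
    after_first = snapshot()
    action()
    after_second = snapshot()
    return after_first == after_second


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conf = Path(tmp) / "demo.conf"

        def overwrite():
            conf.write_text("setting = on\n")

        def append():
            with conf.open("a") as fh:
                fh.write("setting = on\n")

        print(is_idempotent(overwrite, conf.read_text))  # True: same file after every run
        print(is_idempotent(append, conf.read_text))     # False: the file grows on each run
```

This only catches steps that visibly change the snapshot on a second run. It says nothing about whether the first run did the right thing, so it complements real checks and does not replace them.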

It also opens an attack vector, where somebody gaining access to these scripts not only gains all the most detailed knowledge of how your system is configured but also the potential to inject commands into those scripts and run them anywhere, so you have to make sure to treat these scripts and systems like the crown jewels they are. If they are compromised, you are in serious trouble.

By now I have of course realized (and you all probably have too) that I have independently re-invented infrastructure-as-code. There are tools and systems (ansible and terraform come to mind) to help you do this, and at some point I may decide to take advantage of them but personally I’m not there yet. Maybe soon. If you want to skip the intermediate steps I did, you might even be able to skip directly to that approach. But personally I think there is value in the process, it helps defining your needs and building your understanding that there really isn’t anything magical going on behind the scenes and that may help prevent these tools from turning into a black box which isn’t actually going to help you understand your system.

Do I have a perfect system? Of course not. In a lot of ways it’s probably horrific and I’m sure there are more experienced professionals out there cringing or perhaps already furiously warming up their keyboards. But I learned a lot, understand a lot more than I did when I started, and you can too. Maybe you’ll follow the same path I did, maybe you won’t. But you’ll get there.

CaptainPedantic@lemmy.world · edit-2 4 months ago

I’ve got a bunch of notes in Trilium.

I have a note for each service with the docker compose file, notes on backups, any weirdness with the setup, and when I update each service. I use Trilium as a crappy version control for the compose file.

I also have a note for the initial setup of my server (mostly setting up docker, setting up mergerfs and snapraid).

Other than that, I have one note for each device for my setup. (Wifi AP, OPNsense router, switch, etc) That I populate with random crap I might need to know later.

Gonzako@lemmy.world · 3 months ago

If you can’t remember what something does, cut it off. If you then remember it, put it back on and document it.

No_Bark@lemmy.dbzer0.com · edit-2 4 months ago

I’ve been documenting my homelab experiments, set ups, configurations, how-to’s, etc in both Trilium and Silverbullet. I use Silverbullet more as a wiki and Trilium for journal style notes. I just got into self hosting earlier this year, so I’m by no means an expert or authority on any of this.

So my Silverbullet set up contains most of my documentation on how to get things set up. I have sections for specific components of the homelab (Proxmox general set up, general networking, specific how tos for getting various VMs and LXCs set up for specific applications, specific how tos on getting docker stacks up and running, etc.)

I didn’t document shit the first two times I set up and restarted my entire homelab, but by the third time I learned. And from there I basically just wrote down what I did to get things running properly, and then reviewed the notes afterward to make sure I understood what I wrote. This is never a perfect process, so in the following attempts of resetting my server, I’ve updated sections or made things more clear so that when I’m coming at this 8 months later I can follow my guide fully and be up and running.

Some of my notes are just copy pasted directly from tutorials I originally followed to get things set up. This way I just have an easily accessible local copy.

When I troubleshoot something, I document the steps I take in Trilium using the journal feature, so I can easily track the times and dates of when I did what. This has helped me out immensely because I forget what the fuck I did the week before all the time.
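If you don’t have a notes app with a journal feature, the same habit works with a plain text file. A minimal sketch, assuming a hypothetical `log_entry` helper and journal path (this is not Trilium’s API):

```python
from datetime import datetime
from pathlib import Path


def log_entry(journal, text, now=None):
    """Append one timestamped line to a plain-text journal file and return that line."""
    now = now or datetime.now()
    line = f"{now:%Y-%m-%d %H:%M} {text}"
    with Path(journal).open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line
```

Something like `log_entry("homelab-journal.txt", "restarted caddy after cert renewal failed")` leaves a dated trail you can grep later when you can’t remember what you did last week.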

I learned all this through trial and error. You’ll figure out what needs to be documented as you go along, so don’t get too caught up trying to make sure you have a perfect documentation plan in place before deploying anything.

I’m one of those people who never really took notes on things or wrote shit down for most of my life. Mostly because I’ve been doing shit that doesn’t require extensive documentation, so it was a big learning curve.

Edit: Forgot to mention that I also have a physical paper journal that I’ve scrawled various notes in. I found it easier to take quick notes on paper while I’m in the middle of working on something, then I transcribe those notes digitally in either Silverbullet or Trilium.

frongt@lemmy.zip · 4 months ago

That’s the neat part, I don’t!

I have a docker-compose file, which is somewhat self-documenting, especially since I give everything descriptive names. Creds go in bitwarden anyway.

But then, my environment isn’t that complex, and I don’t have anything so custom that I need notes to replicate it.

BeardedGingerWonder@feddit.uk · 3 months ago

I felt like we were brothers for the entire first sentence, then you had to ruin it.