Hi Lemmy! First post, apologies if it’s not coherent :)
I have a physical home server for hosting some essential personal cloud services like smart home, phone backups, file sharing, kanban, and so on. I’m looking to re-install the platform as there are some shortcomings in the first build. I loosely followed the FUTO wiki so you may recognise some of the patterns from there.
For running this thing I have a mini-pc with 3 disks, 240GB and 2x 960GB SSDs. This is at capacity, though the chassis and motherboard would in theory fit a fourth disk with some creativity, which I’m interested to make happen at some point. I also have a Raspberry Pi in the house and a separate OPNsense box for firewall/dns blocking/VPN etc that works fine as-is.
In the current setup, I have Ubuntu Server on the 240GB disk with ext4, which hosts the services in a few VMs with QEMU and does daily snapshots of the qcow2 images onto the 960GB SSDs which are set up as a mirrored zfs pool with frequent automatic snapshots. I copy the zpool contents periodically to an external disk for offsite backup. There’s also a simple samba share set up on the pool which I thought to use for syncthing and file sharing somehow. This is basically where I’m stopping to think now if what I’m doing makes sense.
Problems I have with this:
- When the 240GB disk eventually breaks (and I got it second hand so it might be whatever), I might lose up to one day of data within the services such as vikunja, since their data is located on the VMs, which are qcow2 files on the server’s boot drive and only backed up daily during the night because it requires VM shutdown. This is not okay, I want RPO of max 1 hour for the data.
- The data is currently not encrypted at rest. The threat model here is data privacy in case of theft.
Some additional design pointers:
- Should be able to reboot remotely in good weather.
- I want to avoid any unreliable or “stupid” configurations and not have insane wear on my SSDs.
- But I do want the shiny snapshotting and data integrity features of modern filesystems for especially my phone’s photo feed.
- I wish to avoid btrfs as I have already committed to zfs elsewhere in the ecosystem.
- I may want to extend the storage capacity later with mirrored HDD bulk storage.
- I don’t want to use QEMU snapshots for reaching the RPO as it seems to require guest shutdown/hibernation to be reliable and just generally isn’t made for that. I’m really trying to make use of zfs snapshots like I already do on my desktop.
My current thoughts revolve around the following - comments most welcome.
- Ditch the 240GB SSD from the system to make space for a pair of HDDs later. So, the 960GB pair would have both boot and data, somehow. (I’m open to having a separate NAS later if this is just not a good idea)
- ZFS mirror w/ zfs-auto-snapshot + ZVOLs + ext4 guests? Does this hurt the SSDs?
- Or: ext4 mdadm raid1 + qcow2 guests running zfs w/ zfs-auto-snapshot? Does this make any sense at all?
- ZFS mirror + qcow2 + ext4 guests? This destroys the SSDs, no?
- In any case, native encryption or LUKS?
- Possibly no FDE, but dataset level encryption instead if that makes it easier?
- I plan to set up unattended reboots with the Pi as key server running something like Mandos. Passphrase would be required to boot the server only if the Pi goes down as well. So, any solution must support using a key server to boot.
- What FS should the external backup drives have? I’m currently leaning into ZFS single disk pools. Ideally they should be readable with a mac or windows machine.
- Does Proxmox make things any easier compared to Ubuntu? How?
- I do need at least one VM for home assistant in any case. The rest could pretty much all run in containers though. Should I look into this more or keep the VM layer?
I’m not afraid to do some initially complex setting up. I’m a full stack web developer, not a professional sysadmin though, so advice is welcome. I don’t want to buy tons of new shit, but I’m not severely budget limited either. I’m the only admin for this system but not the only user (family setting).
What’s the 2025 way of doing this? I’m most of all looking for inspiration as to the “why”; I can figure out ways to get it done if I see the benefits.
tldr: how to best have reliable super-frequent snapshots of a home server’s data with encryption, preferably making use of zfs.
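To make the "super-frequent snapshots" idea concrete, here is a minimal sketch of what an hourly snapshot-and-prune job could look like if run from cron or a systemd timer. It is only an illustration, not what zfs-auto-snapshot does internally: the dataset name `tank/data`, the `auto` prefix, and the two-day retention are all made-up assumptions. It only plans the `zfs snapshot` and `zfs destroy` commands; `apply` runs them.

```python
import subprocess
from datetime import datetime, timedelta, timezone

PREFIX = "auto"              # hypothetical prefix marking snapshots this job owns
STAMP = "%Y-%m-%dT%H%M"      # timestamp format embedded in the snapshot name


def snapshot_name(dataset, now):
    """Build a name such as 'tank/data@auto-2025-01-31T1400'."""
    return f"{dataset}@{PREFIX}-{now.strftime(STAMP)}"


def expired(names, now, keep=timedelta(days=2)):
    """Return the job's own snapshots that are older than `keep`."""
    old = []
    for name in names:
        snap = name.split("@", 1)[1]
        if not snap.startswith(PREFIX + "-"):
            continue  # never touch manual snapshots
        taken = datetime.strptime(snap[len(PREFIX) + 1:], STAMP)
        if now - taken.replace(tzinfo=timezone.utc) > keep:
            old.append(name)
    return old


def plan(dataset, existing, now):
    """List the zfs commands for one run: one new snapshot, then pruning."""
    cmds = [["zfs", "snapshot", snapshot_name(dataset, now)]]
    cmds += [["zfs", "destroy", name] for name in expired(existing, now)]
    return cmds


def apply(cmds):
    """Execute the planned commands (needs root or delegated zfs permissions)."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)
```

With this kind of job the worst-case data loss is roughly the interval between runs, so an hourly timer gives about a one-hour RPO for whatever lives on the dataset.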
- Don’t fret about ssd lifespan, unless you are planning on writing tb a day they will outlive your setup. I wouldn’t personally use zfs for this, unless you have a lot of memory just laying around.
- Fair about the SSD life. How would you go about achieving the frequent backups without zfs? I wouldn’t want to implement it separately for every app I use, though I’m open to it if this doesn’t work out.
- I’ll easily buy more memory if needed, the box now has 8GB and isn’t struggling in any way.
- I won’t use fs snapshots as backups, especially one as poorly supported on linux as zfs. I would go with external qcow disk snapshots and they can be pretty easily automated.
- This is what I’m doing currently, but it’s not really feasible to have the services shut down hourly for snapshots. This is indeed why I started looking towards filesystem-level snapshotting. Obviously I will have other types of backups as well, I’m simply looking to have the on-the-fly immutable snapshot capability here somehow.
- You do not need to shut down services to make snapshots, why would you?
- Uhh, from most of what I have gathered from self-hosting so far, doing that is not trivial as you’d need to flush the ram contents to disk first basically. I’m starting to realize though that the same holds equally for filesystem level snapshotting. What I’m really after is making my data live on separate pass through storage that has all the fancy filesystem level stuff so I can just relax about the VM backups.
- You are overthinking it, without flushing ram everything works fine. OS inside VM would just boot as normal and that’s it.
 
- ZFS, hands down, it doesn’t even begin to hurt the SSDs, it’s basically the best choice, just try to not fill the whole volumes or it starts thrashing like crazy.
- ZFS has encryption, but LUKS is fine too.
- I’ve run Raidz2 for well over a decade, never had data loss that wasn’t extremely my fault, and I recovered from that almost immediately from a backed up snapshot.
- Thanks! Can I ask what is your setup like? ZFS on bare metal? Do you have VMs?
- Zfs on Debian on bare metal with nfs server. Edit: and it hosts the worker vms
- Vlan for services with routed subnet
- Sriov connectx4 with 1 primary vm running freebsd and basically all my major services in their own jails. Won’t go into details, but it has like 20 jails and runs almost everything. (had full vnet jails for a while which was really cool but performance wasn’t great).
- 1 vm for external nginx and bind on Debian vm on isolated subnet/Vlan and dmz for exposed services
- 1 vm for mailinabox on dmz subnet/Vlan
- 1 Debian vm on services vlan/net for apps that don’t play well with freebsd, mostly dockers, I do not like this vm, it’s basically unclean and mostly isolated.
- Few other vms for stuff.
- It’s a Dell r730 with 2 2697 (or 2698? 20c/40t each) with 512gb. Edit: v4 so broadwell
- 12x16tb hgst h530s with 2 nvme drives and 2 Sata ssds, somewhere in there is a zlog and l2arc.
- Can’t figure out how to fit a decent GPU in there so currently it’s living on my dual Rome workstation, this system is due for an upgrade, thinking about swapping the workstation to a much lighter one and push the work to the server, while moving the storage to a dedicated system, but not there yet.
- Love freebsd though, don’t use it as my daily driver, tried a bit, it worked but there was just enough trouble to not make it work, but freebsd has moved on and so have I, so it’s worth a shot again.
- Decent I/O, but nothing to write home about, think it saturates the 10g but only just, I have gear for full 100g (I do a LOT of chip startups, and worked at a major networking chip firm a while) but it takes a lot more power, and I have PGE so I can’t justify it till I can seriously saturate it.
- Also I’m in the process of moving to Europe, built a weak network here and linked via WireGuard, but shit is expensive here and I’m not sure how to finish the move just yet, so I’m basically 50/50 including time at work in the valley.
- Nice. Thanks a lot! Similar in architecture to what I had in mind, so I’m inspired :)
- A couple more clarifications, if you will! I’m asking dumb questions as that is the way I learn :D
- If your VMs need to access the data, do you then connect it via the nfs share?
- I suppose you have separate backup schemes for the data vs. the VMs?
- Does your bare metal Debian OS indeed run on the zfs pool too or does it have a separate boot disk? If on the pool, what’s that setup like? Is there a LUKS encrypted keystore partition to use with grub, or do you use the zfs boot menu? (I assume your pool is encrypted)
- I’m trying to gauge how difficult this install is going to be if I want the OS on the zfs pool…
- I just found out about virtiofs, and I’m piecing it together now. I haven’t done actual self hosting for long, so the conventions are a bit blurry, I’m basically piecing it together by what others seem to be doing and trying to understand why. I ended up realising I needed a much higher level discussion around this than “which fs should I use”. If you know of any resources that do NOT talk about specific technologies, but rather, principles behind them, I’d gladly bookmark!
- So the changes I’m planning to my setup…
- encrypt the 2x960GB zfs pool and share it with [samba|virtiofs|nfs] from the host OS (checking later which one is the way to go)
- migrate all meaningful data (like application dbs) to reside on the pool rather than on the VM images and keep this separation of data&application layers to enable different backup schemes for them
- later / if I have the energy: try installing the host OS on the pool as well to get rid of the small SSD and make space for the HDDs.
- edit; also later: have the Pi provide a key server for unattended reboots, though if I simply leave the boot drive unencrypted and keep all the data in the pool this won’t be such an issue anyway, I can just remote in and type the passphrase for the zfs pool to get the data back online.
- Nfs, it’s good enough, and is how everyone accesses it. I’m toying with ceph or some kind of object storage, but that’s a big leap and I’m not comfortable yet
- Zfs snapshot to another machine with much less horsepower but similar storage array.
- Debian boots off like a 128gb Sata ssd or something, just something mindless that makes it more stable, I don’t want to f with Zfs root.
- My pool isn’t encrypted, don’t consider it necessary, though I’ve toyed with it in the past. Anything sensitive I keep on separate USB keys and duplicate them, and I use luks.
- I considered virtiofs, it’s not ready for what I need, it’s not meant for this use case and it causes both security and other issues. Mostly it breaks the demarcation so I can’t migrate or retarget to a different storage server cleanly.
- These are good ideas, and would work. I use zvols for most of this, in fact I think I pass through a nvme drive to freebsd for its jails.
- Docker fucks me here, the volume system is horrible. I made an lxc based system with python automation to bypass this, but it doesn’t help when everyone releases as docker.
- I have a simple boot drive for one reason: I want nothing to go wrong with booting, ever, everything after that is negotiable, but the machine absolutely has to show up.
- It has a decent ups, but as I mentioned earlier, I live in San Jose and have fucking pge, so weeks without power aren’t fucking unheard of. I’m away from home so it has to come back after the fairly regular outages. I have some leeway, but my entire infrastructure is on it, so not much.
- Aight thank you so much, confirms I’m on the right path! This clarifies a lot, I’ll keep the ext4 boot drive :)
- FYI, zfs is pretty fucking fragile, it breaks a lot, especially if you like to keep your kernel up to date. The kernel abi is just unstable and it takes months to catch up.
- Which is part of why I don’t trust zfs on root.
- Worst case you can sometimes recover with zfs-fuse.
- Right, thanks for the heads up! On the desktops I have simply installed zfs as root via the Ubuntu 24.04 installer. Then, as the option was not available in the server variant I started to think maybe that is not something that should be done :p 
 
- Boy. You asked about Proxmox. Nobody said anything.
- How does Proxmox make it easier? Have you used it? All sorts of ways. Like, it’s a full virtual infrastructure management system instead of just an OS. Proxmox loves ZFS. It does many of the things you’ve mentioned here.
- Proxmox does have its own backup system that can work with an NFS target or with their smart dedupe storage and replication server product. https://www.proxmox.com/en/products/proxmox-backup-server/overview
- You’ve got some pretty advanced ideas and perhaps have already moved beyond the Proxmox question. But if you are curious and haven’t used it, spin up a server and give it a whirl.
- I guess I’ll give it a spin. There seems to be a big community around it. I initially thought I might migrate later so keeping the host OS layer as thin as possible. Ubuntu was mainly an easy start as I was familiar with it from before and the spirit in this initiative is DIY over framework - but if there’s a widely used solution for exactly this… Yeah.
- Proxmox is Debian, so many of your ideas could translate directly across. That said, I try to mod the PVE server as little as possible.
- Proxmox makes it so easy to spin up yet another VM or LXC to handle services with its core offerings. Also google “proxmox helper scripts” to find tteck’s additional stash of ready-made LXC.
 
- A wrap-up of what I ended up doing:
- Replaced the bare metal Ubuntu with Proxmox. Cool cool. It can do the same stuff but easier / comes with a lot of hints for best practices. Guess I’m a datacenter admin now
- Wiped the 2x960GB SSD pool and re-created it with ZFS native encryption
- Made a TrueNAS Scale VM, passed through the SSD pool disks, shared the datasets with NFS and made snapshot policies
- Mounted the NFS on the Ubuntu VM running my data related services and moved the docker bind mounts to that folder
- Bought a 1Gbps Intel network card to use instead of the onboard Realtek and maxed out the host memory to 16GB for good measure
- I have achieved:
- 15min RPO for my data (as it sits on the NFS mount, which is auto-snapshotted in TrueNAS)
- Encryption at rest (ZFS native)
- I have not achieved (yet…):
- Key fetch on boot. Now if the host machine boots I have to log in to TrueNAS to key in the ZFS passphrase. I guess I will have to write some custom script for this anyway to adapt to the situation, as key fetching on boot is a paid feature in TrueNAS. Still, TrueNAS makes managing the storage a bit easier, so I want to use it for now. I disabled auto start on boot for the services VM that depends on the NFS share, so I’ll just go kick it up manually after unlocking the pool in TrueNAS.
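A rough sketch of what such a custom key-fetch script might look like on a plain OpenZFS host, for anyone attempting the same. This is not Mandos and not how TrueNAS handles keys; the URL, the dataset name, and the idea of serving the passphrase over HTTPS from the Pi are all hypothetical. It assumes the dataset was created with `keyformat=passphrase` and `keylocation=prompt`, in which case `zfs load-key` reads the passphrase from stdin.

```python
import subprocess
import urllib.request

KEY_URL = "https://pi.example.lan/keys/tank"  # hypothetical key server endpoint
DATASET = "tank/data"                         # hypothetical encrypted dataset


def fetch_key(url=KEY_URL, timeout=5.0):
    """Download the passphrase; raises OSError if the key server is unreachable."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().strip()


def unlock(dataset=DATASET, get_key=fetch_key, run=subprocess.run):
    """Try an unattended unlock; return False so the caller can fall back
    to typing the passphrase manually when the key server is down."""
    try:
        key = get_key()
    except OSError:
        return False
    # With keylocation=prompt, zfs load-key takes the passphrase on stdin.
    run(["zfs", "load-key", dataset], input=key + b"\n", check=True)
    run(["zfs", "mount", dataset], check=True)
    return True
```

Returning `False` instead of raising keeps the "passphrase only if the Pi is down too" behaviour: a boot-time unit can call `unlock()` and simply leave the dataset locked when the key server does not answer.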
- Quite happy with the setup so far. Looking to automate actual backups next, but this is starting to take shape. Building the confidence to use this for my actual phone backups, among other things.
- Oh yeah and I did enable Proxmox VM firewall for the TrueNAS, the NFS traffic goes via an internal interface. Wasn’t entirely convinced by NFS’s security posture when reading about it… At least restrict it to the physical machine 0_0 So I now need to intentionally pass a new NIC to any VM that will access the data, which is neat.
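For the "automate actual backups" step, one common pattern with ZFS is incremental `zfs send` piped into `zfs receive` on the backup pool. The sketch below only illustrates that pattern; the snapshot and pool names are hypothetical, and TrueNAS has its own replication tasks that may be the easier route in this setup. The `-w` (raw) flag sends encrypted blocks as-is, so the backup stays encrypted without loading the key on the target.

```python
import subprocess


def send_cmd(prev, new, raw=True):
    """Build 'zfs send'; `prev` is the last snapshot already on the target,
    or None for an initial full send."""
    cmd = ["zfs", "send"]
    if raw:
        cmd.append("-w")       # raw send: encrypted data stays encrypted
    if prev:
        cmd += ["-i", prev]    # incremental from the previous snapshot
    cmd.append(new)
    return cmd


def receive_cmd(target):
    """Build 'zfs receive'; -u leaves the received dataset unmounted."""
    return ["zfs", "receive", "-u", target]


def replicate(prev, new, target):
    """Pipe the send stream into receive and fail loudly on any error."""
    sender = subprocess.Popen(send_cmd(prev, new), stdout=subprocess.PIPE)
    receiver = subprocess.run(receive_cmd(target), stdin=sender.stdout)
    sender.stdout.close()
    if sender.wait() != 0 or receiver.returncode != 0:
        raise RuntimeError(f"replication of {new} to {target} failed")
```

For example, `replicate("tank/data@auto-a", "tank/data@auto-b", "backup/data")` would send only the blocks that changed between the two snapshots.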
 
- Ok so wrapping my head around this, what I think I need to be clear about is the separation between applications and data. Applications get the nightly VM snapshot way of backing up, and data will get the frequent zfs snapshots (and other backups). Kinda what I tried to do to begin with, so I will look more on how to do this separation for the applications I intend to use.
- Still unsure if samba is the way to go for linking it together on the same physical machine.
- Should I just run syncthing on the bare metal host…? Will sleep on it.
- I have been using TrueNAS for about 3 years now and couldn’t be happier. It can do all of the backup stuff for you as well. I’m not sure if you would be able to use the key server for booting though, but I believe it would check all the other boxes. I don’t currently run VMs on it (only docker), so not sure what it can do for VM backups.
- Regardless of what you end up going with, I’m curious why you are saying you need to shut down the VM to back it up? I’m not familiar with how you are running the VM so not sure if it’s a limitation of the hypervisor, but I would think as long as you can snapshot the disk, you could just back up the snap. It would be crash-consistent rather than application-consistent, but for a backup scenario that should generally be fine.
- Right, so my aversion to live backups comes initially from Louis Rossmann’s guide on the FUTO wiki where he mentions it’s non-trivial to reliably snapshot a running system. After a lot of looking elsewhere as well I haven’t gotten many hints that it would be bad advice and I want to err on the side of caution anyway. The hypervisor is QEMU/KVM so in theory it should be able to do live snapshots afaik. But I’m not familiar enough with the consistency guarantees to fully trust it. I don’t wanna wake up one day to a server crash, try to mount the backed up qcow2 in a new system, and find that it doesn’t work and I just lost data.
- It won’t matter though as I’ll just place all the important data on the zpool and back that up frequently as a simple data store. The VMs can keep doing their nightly shutdown and snapshot thing.
- I work for a medium size enterprise as a backup architect. All of our backups are crash consistent and we’ve never had an issue.
- Windows has an easy way of dealing with this in the form of VSS. As long as the application supports it, VSS can prepare the system and application for a backup, putting it in an application-consistent state before the snapshot is taken. Unfortunately, there is no equivalent for Linux. The best you can do is pre-freeze and post-thaw scripts to put the application/OS in a backup-ready state. Really though, I wouldn’t worry too much about it. Unless you are running an in-memory database, you really don’t need to worry about application consistency. If you are running an in-memory database, take database level backups (can also be done with pre-freeze/post-thaw scripts) and back up the backups.
- Just remember to test whatever solution you end up going with, and make reminders to frequently re-test your backups. You never know what might change in a year’s time, so re-testing periodically is a good way to make sure everything is still functioning properly and make sure your data is still protected. And testing needs to be more than just making sure the VM powers on. Make sure the application can start up and function properly before calling it a successful test.
- Always a good reminder to test the backups, no I would not sleep properly if I didn’t test them :p
- Aiming to keep it simple, too many moving parts in the VM snapshots / hard to figure out best practices and notice mistakes without work experience in the area, so I’ll just backup the data separately and call it a day. But thanks for the input! I don’t think any of my services have in-memory db’s.
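To illustrate the pre-freeze / post-thaw idea mentioned above, here is a small sketch of how a host-side script could wrap a snapshot for a QEMU/KVM guest. It assumes libvirt is in use and the guest runs the QEMU guest agent, so `virsh domfsfreeze` and `virsh domfsthaw` are available; the domain name and the snapshot callback are hypothetical. The point of the context manager is that the guest is always thawed, even if the snapshot step fails.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def frozen(domain, run=subprocess.run):
    """Freeze the guest's filesystems for the duration of the with-block."""
    run(["virsh", "domfsfreeze", domain], check=True)
    try:
        yield
    finally:
        # Always thaw, otherwise the guest stays blocked on disk writes.
        run(["virsh", "domfsthaw", domain], check=True)


def consistent_snapshot(domain, take_snapshot, run=subprocess.run):
    """Run `take_snapshot` (e.g. a zfs snapshot of the VM's storage)
    while the guest filesystems are quiesced."""
    with frozen(domain, run):
        take_snapshot()
```

The freeze only needs to last for the moment the snapshot is taken, which is near-instant for ZFS, so the guest should barely notice it.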
 