zfs migration and cheatsheet
While recently migrating my main computer from Linux to FreeBSD, I also started to migrate my data. It is not the first time I have migrated terabytes of data from one filesystem to another - however, the capabilities of ZFS made me rethink how I manage my data.
I've been testing zfs for a few weeks and fell in love with what this filesystem allows me to do, especially:
- easy atomic syncing of remote compressed, encrypted filesystems through the use of syncoid
The first thing that surprised me in ZFS was that everything is done with 2 commands: zfs and zpool. This is quite different from lvm + mdadm + luks + ext4 + rsync.
OpenZFS documentation is great, and there are plenty of tutorials about zfs. I will not go far into the details; I am rather taking some notes ^^
Let's get on to it
So I start by migrating data on my main computer: around 16TB of data accumulated over more than 20 years!
First things first - learning zfs basics: devices, zpools, datasets - and creating the desired partitions. Between sorting duplicates, cleaning names, removing corrupted files... I had a lot of fun.
I had the great idea of creating a few different datasets: one for private files, one for code, one for photos, music, videos, ...
After a couple of weeks fighting with dedup and detox, I had migrated my files successfully and they were cleaner than they had ever been.
Fast forward a couple of weeks: I was reinstalling a secondary computer, 1000 km from basecamp, when I realised that after Migration 1 I had forgotten to sync some work files.
I started migrating my files in a similar way, but decided to pause halfway to study zfs a little more in depth.
Here I discovered and played around a bit with zfs compression, encryption and snapshots.
That backup thingie
At this point I was still quite pissed off about not having synced my files... The thing is:
And at some point, at 4 am on a Sunday night, I connected the dots... Maybe zfs could help me...
Change of perception
First thing the next morning, I started looking into whether I could rent raw zfs space from a cloud provider. I found rsync.net - there might be others - I did not search more, I just wanted to try! So I recreated the volumes I wanted to share between computers, but this time I encrypted them (and compressed some).
In this setup, zfs snapshots allow me to upload atomic changes on encrypted volumes - without needing to mount them on the remote host!
This means I can finally store my filesystems securely on a cloud provider, without wasting bandwidth!!! thank you zfs!!! I then discovered that syncoid was the perfect tool for my use case (sanoid was just too much) - and wrote a little wrapper around syncoid (Note: read the readme.freebsd!!!).
This is a really awesome feature; the only little drawback is that it will have implications for my whole backup architecture!!! more on that in another post!
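For reference, the syncoid call at the heart of such a wrapper might look roughly like the sketch below. It is not the actual wrapper: the function name, host and dataset names are placeholders, and it assumes a syncoid version that supports --sendoptions and --recvoptions.

```shell
# Sketch only: host and dataset names are placeholders.
push_encrypted() {
    src=$1      # local encrypted dataset, e.g. data/data1
    target=$2   # ssh login on the backup host, e.g. user@backup.example.net
    dest=$3     # dataset on the remote pool, e.g. tank/data1
    # --sendoptions=w -> zfs send -w (raw: blocks leave the machine still encrypted)
    # --recvoptions=u -> zfs receive -u (do not mount on the remote host)
    syncoid --sendoptions=w --recvoptions=u "$src" "$target:$dest"
}

# Usage:
#   push_encrypted data/data1 user@backup.example.net tank/data1
```

Because the stream is sent raw, the remote side never needs the passphrase.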
There is a plethora of zfs guides and cheatsheets out there: they are very often more complete than this one, which is only a small cheatsheet of the commands I use most. Without further ado, here are a few useful commands for zfs:
The number in raidzN is the number of parity disks: raidz2 survives the loss of any two disks
zpool create data raidz2 ada0 ada1 ada2 ada3 ada4
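A quick back-of-the-envelope check of what such a pool gives you: usable space is roughly (disks - parity) x disk size. The sketch below ignores metadata and padding overhead, and the 4 TB disk size is a made-up example, not my actual hardware.

```shell
# Rough usable capacity of a raidz vdev, in the same unit as the disk size.
raidz_usable() {
    disks=$1     # total number of disks in the vdev
    parity=$2    # 1, 2 or 3 for raidz1, raidz2, raidz3
    size=$3      # size of one disk (hypothetical value)
    if [ "$disks" -le "$parity" ]; then
        echo "need more disks than the parity level" >&2
        return 1
    fi
    echo $(( (disks - parity) * size ))
}

raidz_usable 5 2 4   # five 4 TB disks in raidz2, prints 12
```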
Make sure datasets in the zpool are unmounted before destroying!
zpool destroy data
if it fails, you could try to force the operation with
zpool destroy -f data
zfs create -o encryption=on -o compression=on -o keylocation=prompt -o keyformat=passphrase -o mountpoint=/home/user/data1 data/data1
And list the snapshots
zfs list -t snapshot
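Before there is anything to list, a snapshot has to be taken with zfs snapshot dataset@name. A minimal sketch of a naming helper follows; the function name and the timestamp format are just my assumptions, any name valid for zfs will do.

```shell
# Build a snapshot name: dataset@label, label defaults to a UTC timestamp.
snapshot_name() {
    label=${2:-$(date -u +%Y-%m-%d_%H%M)}
    printf '%s@%s\n' "$1" "$label"
}

# Usage:
#   zfs snapshot "$(snapshot_name data/data1)"
#   zfs snapshot "$(snapshot_name data/data1 before-cleanup)"
#   zfs list -t snapshot -r data/data1
```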
load keys before mounting
zfs load-key data/data1
unload keys after unmounting
zfs unload-key data/data1
zfs set mountpoint=/data/drive data/olddocs
zfs mount data/data1
Make sure you are not using the dataset (terminal, file explorer, open file)
zfs umount data/data1
If it does not work, you might try the force option
zfs umount -f data/data1
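The load-key / mount and umount / unload-key steps above always go together, so they can be wrapped in two small helpers. This is only a sketch (unlock and lock are names I made up); it relies on the keystatus property, which reads "unavailable" while the key is not loaded.

```shell
# Load the key only if needed, then mount.
unlock() {
    ds=$1
    ks=$(zfs get -H -o value keystatus "$ds") || return 1
    if [ "$ks" = "unavailable" ]; then
        zfs load-key "$ds" || return 1   # prompts for the passphrase
    fi
    zfs mount "$ds"
}

# Unmount, and drop the key only if the unmount succeeded.
lock() {
    zfs umount "$1" && zfs unload-key "$1"
}

# Usage:
#   unlock data/data1
#   lock data/data1
```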
zfs destroy data/data1
If it does not work, you might try the force option
zfs destroy -f data/data1
zfs won't destroy a dataset with children or snapshots; to destroy those as well, use -r
zfs destroy -r -f data/data1
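Since a recursive destroy cannot be undone, it is worth previewing it first. zfs destroy accepts -n (dry run) and -v (verbose), which together print what would be removed without touching anything; the helper name below is my own.

```shell
# Show what a recursive destroy would remove, without destroying anything.
preview_destroy() {
    zfs destroy -n -v -r "$1"
}

# Usage:
#   preview_destroy data/data1
```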
One of the features I really like about ZFS is the possibility to store atomic snapshots on a host as a filesystem - without the host needing to know the key to the data.
I have always avoided uploading important files to a cloud unencrypted.
I also always hated having to upload a whole 2 GB encrypted LUKS volume when all I did was change 3 KB on it.
With ZFS, I store the encrypted images on the cloud, and a backup only uploads what changed: about 3 KB!!!
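Under the hood this is a raw incremental send: zfs send -w keeps the blocks encrypted, and -i sends only the difference between two snapshots. The sketch below just prints the pipeline so it can be reviewed before running; the snapshot names, host and remote dataset are placeholders, and it assumes a first full send (without -i) has already been done.

```shell
# Print the pipeline for a raw incremental send between two snapshots.
raw_incremental_cmd() {
    ds=$1       # local dataset, e.g. data/data1
    from=$2     # snapshot already present on the remote side
    to=$3       # newer local snapshot
    target=$4   # ssh login on the backup host (placeholder)
    dest=$5     # dataset on the remote pool (placeholder)
    # receive -u: do not mount on the remote host, which has no key anyway
    printf 'zfs send -w -i %s@%s %s@%s | ssh %s zfs receive -u %s\n' \
        "$ds" "$from" "$ds" "$to" "$target" "$dest"
}

# Usage:
#   raw_incremental_cmd data/data1 monday tuesday user@backup.example.net tank/data1
```

Adding -n -v to zfs send prints an estimated stream size without sending anything, which is a handy way to check how small the increment really is.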
Now... one thing that was not especially easy, was to find a provider of "raw cloud" zfs space.
One solution I envisioned was to install FreeBSD on a VPS and sync to it. The monthly price per GB was absolutely horrible.
I stumbled upon rsync.net, whose offer came out a little more expensive in total, but with 1 TB instead of 80 GB! It's a go.
I can't really tell you yet if the service is reliable, but their support is both responsive and very helpful!
As a consequence of having 1 TB available on the cloud to store encrypted volumes, I don't really need my Google Drive space anymore, and will probably end up closing my MEGA account too - so all in all I am paying less and gaining in privacy.
Of course, switching my main backup to ZFS also means that I will have to update my backup scripts accordingly. Currently, backups are done on 2 machines running Debian 10 with 2 x 500 GB in lvm/raid/luks/ext4 - I sync them with unison but have a few issues... I'm not sure I want to run ZFS on Debian, so I think I will reformat these machines and write new scripts for my backups.
I wrote a stupid little script to help me sync some filesystems between various computers... Check it on bitbucket!