zfs migration and cheatsheet

Introduction

Whilst recently migrating my main computer from Linux to FreeBSD, I also started to migrate my data. It is not the first time I have migrated terabytes of data from one filesystem to another - however, the capabilities of ZFS made me rethink how I manage my data.
The first thing that surprised me in ZFS was that everything is done with 2 commands: zfs and zpool. This is quite different from lvm + mdadm + luks + ext4 + rsync. Let's get to it.

Migration 1

So I started by migrating the data on my main computer: around 16TB accumulated over more than 20 years!
First things first - learning the zfs basics: devices, zpools, datasets - and creating the desired datasets. Between sorting duplicates, cleaning names, removing corrupted files... I had a lot of fun.
I had the great idea of creating a few different datasets: one for private files, one for code, one for photos, music, videos, ...
After a couple of weeks fighting with dedup and detox, I had migrated my files successfully and they were cleaner than they had ever been.

Migration 2

Fast forward a couple of weeks: I was reinstalling a secondary computer, 1000km from basecamp, when I realised that after Migration 1 I forgot to sync some work files.
I started migrating my files in a similar way, but decided to pause halfway to study zfs a little bit more in depth.
Here I discovered and played around a bit with zfs compression, encryption and snapshots.

That backup thingie

At this point in time I was still quite pissed off about not having synced my files... The thing is:

I don't really enjoy storing them in the clear on a public cloud storage
I have uptime issues on both sides, so I can't rely on the other end being up when I need to sync
Putting everything in luks files is fine, but it often consumes a vast amount of bandwidth just to upload small changes

And at some point, at 4 am on a Sunday night, I connected the dots... Maybe zfs could help me...

Change of perception

First thing the next morning, I started looking into whether I could rent raw zfs space from a cloud provider. I found rsync.net - there might be others - I did not search further, I just wanted to try! So I recreated the volumes I wanted to share between computers, but this time I encrypted them (and compressed some).
In this setup, zfs snapshots allow me to upload atomic changes on encrypted volumes - without the need to mount them on the remote host!
This means I can finally store my filesystems securely on a cloud provider, without wasting bandwidth!!! Thank you zfs!!! I then discovered that syncoid was the perfect tool for my use case (sanoid was just too much) - and wrote a little wrapper around syncoid (Note: read the readme.freebsd!!!).
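
To make the idea concrete, here is a minimal sketch of the kind of raw incremental send that a tool like syncoid automates. It is not the wrapper mentioned above: the function name, snapshot labels, remote host and remote dataset are all hypothetical placeholders, and it assumes the previous snapshot already exists on the remote side.

```shell
# Sketch only: every name passed in (dataset, snapshots, host, target)
# is a hypothetical placeholder, not a value from this post.
send_raw_incremental() {
  dataset=$1   # local encrypted dataset, e.g. data/data1
  prev=$2      # snapshot label already present on the remote side
  new=$3       # snapshot label to create and send
  remote=$4    # ssh destination, e.g. user@backup.example.org
  target=$5    # dataset on the remote pool

  # Freeze the current state of the dataset.
  zfs snapshot "${dataset}@${new}" || return 1

  # -w sends the blocks raw (still encrypted); -i sends only the changes
  # between the two snapshots. -u keeps the received dataset unmounted,
  # so the remote host never needs the key.
  zfs send -w -i "${dataset}@${prev}" "${dataset}@${new}" |
    ssh "$remote" zfs receive -u "$target"
}
```

It could be called as: send_raw_incremental data/data1 monday tuesday user@backup.example.org backup/data1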

This is a really awesome feature; the only little drawback is that it will have implications for my whole backup architecture!!! More on that in another post!

Cheatsheet

There is a plethora of zfs guides and cheatsheets out there: they are very often more complete than this one, which only covers the commands I use most. Without further ado, here are a few useful commands for zfs:

zpool Commands

Create a raidz pool

The number in raidz2 is the number of parity disks: here the pool keeps working with up to two failed disks.

zpool create data raidz2 ada0 ada1 ada2 ada3 ada4
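
As a rough rule of thumb, the number of disks left for data in a single raidz vdev is the total minus the parity count. The small helper below is only an illustration (the function name is made up), and it ignores padding, metadata and reserved space, so real usable capacity will be somewhat lower.

```shell
# Rough count of data-bearing disks in a single raidz vdev.
# Hypothetical helper, not a zfs command.
raidz_data_disks() {
  total=$1    # number of disks in the vdev
  parity=$2   # 1 for raidz1, 2 for raidz2, 3 for raidz3
  echo $((total - parity))
}

# The five-disk raidz2 pool above leaves three disks' worth of space for data.
raidz_data_disks 5 2
```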

List zpools

zpool list

destroy a zpool

Make sure datasets in the zpool are unmounted before destroying!

zpool destroy data

If it fails, you could try to force the operation with

zpool destroy -f data

zfs commands

Create a compressed and encrypted dataset in pool:

zfs create -o encryption=on -o compression=on -o keylocation=prompt -o keyformat=passphrase -o mountpoint=/home/user/data1 data/data1

list datasets:

zfs list

And list the snapshots

zfs list -t snapshot

load/unload keys to mount encrypted dataset:

load keys before mounting

zfs load-key data/data1

unload keys after unmounting

zfs unload-key data/data1

mount a dataset:

zfs mount data/data1

unmount a dataset:

Make sure you are not using the dataset (terminal, file explorer, open file)

zfs umount data/data1

If it does not work, you might try the force option

zfs umount -f data/data1

prevent a dataset from automounting:

Very useful when you have an unencrypted dataset inside an encrypted dataset: otherwise you need to unmount the unencrypted dataset and mount it again (zfs umount, then zfs mount) to see it once the encrypted dataset is mounted.

zfs set canmount=noauto data/data1
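
With canmount=noauto, the child is only mounted when asked, so the mount order can be made explicit. The sketch below assumes a hypothetical layout with an encrypted parent and an unencrypted child; the function and dataset names are placeholders. It uses zfs mount -l, which loads the key while mounting instead of requiring a separate zfs load-key.

```shell
# Sketch: mount an encrypted parent first, then its noauto child.
# Dataset names are passed in and are hypothetical.
mount_parent_then_child() {
  parent=$1   # encrypted dataset, e.g. data/private
  child=$2    # unencrypted child with canmount=noauto

  # -l loads the encryption key (prompting if needed) before mounting.
  zfs mount -l "$parent" || return 1

  # The child sits on top of the parent's mountpoint, so it goes second.
  zfs mount "$child"
}
```

It could be called as: mount_parent_then_child data/private data/private/public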

destroy dataset:

zfs destroy data/data1

If it does not work, you might try the force option

zfs destroy -f data/data1

zfs won't destroy a dataset that has children or snapshots; to destroy those as well, use -r

zfs destroy -r -f data/data1
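
Since a recursive destroy is hard to undo, it can be worth previewing it first. A minimal sketch, with a made-up function name, using the dry-run flag of zfs destroy:

```shell
# Sketch: list what a recursive destroy would remove, without removing it.
preview_destroy() {
  # -n: dry run, nothing is deleted
  # -v: print each dataset and snapshot that would be destroyed
  # -r: include child datasets and snapshots
  zfs destroy -n -v -r "$1"
}
```

It could be called as: preview_destroy data/data1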

I've been testing with zfs for a few weeks and literally fell in love with what this fs allows me to do, especially:

- compression
- encryption
- snapshots
- easy atomic syncing of distant compressed, encrypted, filesystems through the use of syncoid

Snapshot Warning

What works: take a snapshot - do the stupid test - if it fails, restore the snapshot.

What doesn't work: take a snapshot - work for a week - do the stupid test - if it fails, restore the snapshot. This is a BAD idea, as everything you did since the snapshot will be deleted.

OpenZFS documentation is great, and there are plenty of tutorials about zfs. I will not go far into the details, but am rather taking some notes ^^

zfs list
zfs list -t snapshot
zpool list

Create an encrypted and compressed dataset:

zfs create -o compression=on -o encryption=on -o keylocation=prompt -o keyformat=passphrase -o mountpoint=/home/jupiter/Audio data/Audiofil

It is mounted by default, but later, to access it, use:

zfs load-key
zfs mount data/drive

zfs unload-key
zfs destroy -r data/drive

Be careful: a destroyed dataset will be hard to recover!
https://serverfault.com/questions/842955/restoring-data-after-zfs-destroy

Move the data from an old dataset to a new encrypted one:

zfs set mountpoint=/data/drive data/olddocs
zfs create -o compression=on -o encryption=on -o keylocation=prompt -o keyformat=passphrase -o mountpoint=/home/jupiter/Documents data/newdocs
cp -a /data/drive/ /home/jupiter/Documents
zfs destroy data/olddocs
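
The snapshot, test, restore loop from the snapshot warning can be sketched as a small shell function. This is only an illustration: the function name and the snapshot label before-test are made up, and it relies on the snapshot being the most recent one on the dataset, which is what zfs rollback expects by default.

```shell
# Sketch: run a risky command against a dataset, roll back if it fails.
# The snapshot label "before-test" is a hypothetical placeholder.
try_with_snapshot() {
  dataset=$1
  shift                       # the remaining arguments are the command to run
  snap="${dataset}@before-test"

  # Take the snapshot right before the test, not a week earlier.
  zfs snapshot "$snap" || return 1

  if "$@"; then
    # The test passed: keep the changes and drop the snapshot.
    zfs destroy "$snap"
  else
    # The test failed: discard everything written since the snapshot.
    zfs rollback "$snap"
    return 1
  fi
}
```

It could be called as: try_with_snapshot data/data1 sh ./stupid-test.sh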

How ZFS changed my perception of backups and syncing

One of the features I really like about ZFS is the possibility to store atomic snapshots on a host as a filesystem - without the need for the host to know the key to the data.
I always avoided uploading important files to a cloud in an unencrypted way.
I also always hated having to upload a whole 2GB encrypted luks volume when all I did was change 3K on it.
With ZFS, I store the encrypted images on the cloud, and the backups only take 3K!!!
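
To check how small an incremental upload would be before sending anything, zfs send has a dry-run mode that prints an estimated stream size. A minimal sketch, with a made-up function name and hypothetical snapshot labels:

```shell
# Sketch: estimate the size of a raw incremental stream without sending it.
estimate_incremental_size() {
  dataset=$1   # e.g. data/data1
  from=$2      # older snapshot label
  to=$3        # newer snapshot label

  # -n: dry run, no stream is generated
  # -v: print the estimated size of the stream
  # -w: raw send, as used for encrypted datasets
  # -i: incremental, only the changes between the two snapshots
  zfs send -n -v -w -i "${dataset}@${from}" "${dataset}@${to}"
}
```

It could be called as: estimate_incremental_size data/data1 monday tuesday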

Now... one thing that was not especially easy was finding a provider of "raw cloud" zfs space.
One solution I envisioned was to install FreeBSD on a VPS, and then have some sync happening there. The monthly price per GB was absolutely horrible.
I stumbled upon rsync.net, whose offer came out a little bit more expensive in total, but with 1TB instead of 80GB! It's a go.

I can't really tell you yet if the service is reliable, but their support is both responsive and very helpful!
As a consequence of having 1TB available on the cloud to store encrypted volumes, I don't really need my Google Drive space anymore, and will probably end up closing my MEGA account too - so all in all I am paying less and gaining in privacy.
Of course, switching my main backup to ZFS also means that I will have to update my backup scripts accordingly. Currently, backups are done on 2 machines running Debian 10 with 2x500GB in lvm/raid/luks/ext4 - I sync them with unison but have a few issues... I'm not sure I want to push Debian onto ZFS, so I think I will format these machines and write new scripts for my backups.
I wrote a stupid little script to help me sync some filesystems between various computers... I will post it soon.