Title: My BTRFS cheatsheet
Author: Solène
Date: 29 August 2022
Tags: btrfs linux
Description:

# Introduction

I recently switched my home "NAS" (single disk!) to BTRFS. It's a
different ecosystem with many features and commands, so I had to write
a bit about it to remember the various possibilities...

BTRFS is an advanced file-system supported in Linux, somewhat
comparable to ZFS.

# Layout

A BTRFS file-system can be made of multiple disks, aggregated in
mirror or "concatenated", and it can be split into subvolumes which
may have specific settings.

Snapshots and quotas apply to subvolumes, so it's important to think
beforehand when creating BTRFS subvolumes: in most cases, one may want
dedicated subvolumes for /home and /var.
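
As a minimal sketch (the mount point /mnt and the subvolume names are
just examples), subvolumes are created and listed like this:

```
# create dedicated subvolumes on a BTRFS file-system mounted on /mnt
btrfs subvolume create /mnt/home
btrfs subvolume create /mnt/var

# list the existing subvolumes
btrfs subvolume list /mnt
```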

# Snapshots / Clones

It's possible to take an instant snapshot of a subvolume, which can be
used as a backup. Snapshots can be browsed like any other directory.

They exist in two flavors: read-only and writable. ZFS users will
recognize writable snapshots as "clones" and read-only ones as regular
ZFS snapshots.

Snapshots are an effective way to make a backup and to roll back
changes in a second.
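
Here is a sketch of both flavors; the paths and names are only
examples:

```
# writable snapshot of the subvolume mounted on /home
btrfs subvolume snapshot /home /home/.snapshots/home-20220829

# read-only snapshot (this is the kind "btrfs send" needs)
btrfs subvolume snapshot -r /home /home/.snapshots/home-20220829-ro

# delete a snapshot once it's no longer needed
btrfs subvolume delete /home/.snapshots/home-20220829
```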

# Send / Receive

A raw file-system stream can be sent / received over the network (or
anything supporting a pipe) to make incremental backups of the
differences. This is a very effective way to do incremental backups
without having to scan the entire file-system each time you run your
backup.
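
A sketch of a full send followed by an incremental one; the snapshot
names, the backup host and the target path are placeholders, and the
snapshots must be read-only to be sent:

```
# initial full copy to another machine
btrfs send /home/.snapshots/home-day1 | \
    ssh backup-host "btrfs receive /backup/home"

# later: only send the differences between day1 and day2
btrfs send -p /home/.snapshots/home-day1 /home/.snapshots/home-day2 | \
    ssh backup-host "btrfs receive /backup/home"
```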

# Deduplication

I covered deduplication with bees, but one can also use the program
"duperemove" (which works on XFS too!). They work a bit differently,
but in the end they have the same purpose: bees operates on the whole
BTRFS file-system while duperemove operates on files, so they cover
different use cases.

duperemove GitHub project page
Bees GitHub project page
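
A minimal duperemove invocation could look like this; the path and the
hash file location are examples:

```
# hash the files under /mnt/data recursively (-r) and submit the
# duplicated extents for deduplication (-d)
duperemove -dr /mnt/data

# keep the hashes in a file to speed up later runs
duperemove -dr --hashfile=/var/tmp/dupehash.db /mnt/data
```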

# Compression

BTRFS supports on-the-fly compression per subvolume, meaning the
content of each file is stored compressed and decompressed on demand.

Depending on the files, this can result in better performance because
you store less content on the disk and are less likely to be I/O
bound, but it also improves storage efficiency. This is really content
dependent: you can't really compress binary files like
pictures/videos/music any further, but if you have a lot of text and
source files, you can achieve great ratios.

From my experience, compression is always helpful for a regular user
workload, and newer algorithms are smart enough to not compress binary
data that wouldn't yield any benefit.

There is a program named compsize that reports compression statistics
for a file/directory. It's very handy to know if the compression is
beneficial and to what extent.

compsize GitHub project page
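
A sketch of enabling zstd compression through a mount option and
checking the result; the device, mount point and compression level are
examples:

```
# mount a BTRFS file-system with zstd compression, level 3
mount -o compress=zstd:3 /dev/sdb1 /home

# the same option in /etc/fstab:
# UUID=xxxx-xxxx  /home  btrfs  compress=zstd:3,noatime  0 0

# report how well a directory compresses
compsize /home
```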

# Defragmentation

Fragmentation is a real thing and not specific to Windows; it matters
a lot for mechanical hard drives but not really for SSDs.

Fragmentation happens when you create files on your file-system and
delete them: this happens very often due to cache directories, updates
and regular operations on a live file-system.

When you delete a file, this creates a "hole" of free space; after
some time, you may want to gather all these small parts of free space
into big chunks of free space. This matters for mechanical disks, as
the physical location of data is tied to the raw performance. The
defragmentation process is just physically reorganizing data to order
file chunks and free space into contiguous blocks.

Defragmentation can also be used to force compression in a subvolume,
for instance if you want to change the compression algorithm or if you
enabled compression after the files were written.

The command line is: btrfs filesystem defragment
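
For example (the path is an example, and -czstd is only needed to
force recompression):

```
# recursively defragment /home
btrfs filesystem defragment -r /home

# recursively rewrite /home and recompress its content with zstd
btrfs filesystem defragment -r -czstd /home
```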

# Scrubbing

The scrubbing feature is one of the most valuable features provided by
BTRFS and ZFS. Each file in these file-systems is associated with its
checksum in some metadata index, which means you can actually check
each file's integrity by comparing its current content with the
checksum known in the index.

Scrubbing costs a lot of I/O and CPU because you need to compute the
checksum of each file, but it's a guarantee for validating the stored
data. In case of a corrupted file, if the file-system is composed of
multiple disks (raid1 / raid5), it can be repaired from the mirrored
copies; this should work most of the time because such file corruption
is often related to the drive itself, thus other drives shouldn't be
affected.

Scrubbing can be started / paused / resumed, which is handy if you
need to run heavy I/O and don't want the scrubbing process to slow it
down. While the scrub commands can take a device or a path, the path
parameter is only used to find the related file-system; it won't just
scrub the files in that directory.

The command line is: btrfs scrub
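
For example, on a file-system mounted on /mnt/nas (an example path):

```
# start a scrub in the background
btrfs scrub start /mnt/nas

# check progress and error counters
btrfs scrub status /mnt/nas

# pause it around heavy I/O, then resume it
btrfs scrub pause /mnt/nas
btrfs scrub resume /mnt/nas
```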

# Rebalancing

When you are aggregating multiple disks into one BTRFS file-system,
some files are written to one disk and some others to another; after a
while, one disk may contain more data than the others.

The purpose of rebalancing is to redistribute data across the disks
more evenly.
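
A sketch with the usual commands; the mount point and the usage filter
value are examples:

```
# rebalance the whole file-system mounted on /mnt/nas
btrfs balance start /mnt/nas

# only rewrite data block groups that are less than 75% full,
# which is usually enough and much faster
btrfs balance start -dusage=75 /mnt/nas

# follow the progress
btrfs balance status /mnt/nas
```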

# Swap file

You can't create a swap file on a BTRFS disk without a tweak. You must
create the file in a directory carrying the special "no COW" attribute
set with "chattr +C /tmp/some_directory"; you can then move it
anywhere, as it will inherit the "no COW" flag.

If you try to use a swap file with COW enabled on it, swapon will
report a weird error, but you get more details in the dmesg output.
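
A sketch of the whole procedure; the directory and the size are
examples, and dd is used so the file has no holes:

```
# create a "no COW" directory: files created inside inherit the flag
mkdir /var/swap
chattr +C /var/swap

# allocate the swap file, restrict it, format and enable it
dd if=/dev/zero of=/var/swap/swapfile bs=1M count=4096
chmod 600 /var/swap/swapfile
mkswap /var/swap/swapfile
swapon /var/swap/swapfile
```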

# Converting

It's possible to convert an ext2/3/4 file-system into BTRFS; obviously
it must not be currently in use. The conversion can be rolled back
until you run certain operations on the new file-system, like
defragmenting or rebalancing.
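
A sketch using btrfs-convert from btrfs-progs; the device is an
example and the file-system must be clean and unmounted:

```
# check the ext4 file-system before converting it
e2fsck -f /dev/sdb1

# convert it in place to BTRFS
btrfs-convert /dev/sdb1

# roll back to ext4, only possible while the saved original image is
# still intact (before operations like defragmenting or rebalancing)
btrfs-convert -r /dev/sdb1
```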