On 2017-08-06, I wrote about that incident in our data center. I also
wrote, "fuck performance, I want data integrity".
I still think that way. However, my tool that stores checksums in
extended attributes in the file system was a failure, because it was
too slow.
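The idea, roughly, looks like this. Note that this is just a minimal
Python sketch, not my actual tool, and the attribute name
"user.checksum.sha256" is made up for illustration (Linux only):

    #!/usr/bin/env python3
    # Sketch of the xattr approach: compute a file's SHA-256 and
    # attach it to the file as an extended attribute.
    # "user.checksum.sha256" is an example name, nothing official.

    import hashlib
    import os
    import sys

    ATTR = "user.checksum.sha256"

    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def store(path):
        # The checksum travels with the file itself.
        os.setxattr(path, ATTR, sha256_of(path).encode())

    def verify(path):
        # Compare the stored checksum against the current content.
        return os.getxattr(path, ATTR).decode() == sha256_of(path)

    if __name__ == "__main__":
        for p in sys.argv[1:]:
            store(p)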
The tool itself isn't slow. After you run it, you have a filesystem
where *every* file has extended attributes. That in itself is not a
problem, either. Here's the thing: *Other* tools might get slow
because of that. rsync, for example.
I create my backups using "rsync -zaXAP ...". That "X" copies
extended attributes, and if you have a lot of them, rsync gets really,
really slow.
Isn't that a contradiction? "Fuck performance" vs. "that's too slow"?
It might seem that way, but remember that my tool was a dirty
workaround to begin with. It's not a clean filesystem design that
incorporates checksums. It's a hack. And I'm not willing to sacrifice
performance for a dirty hack.
There's an alternative: Store the checksums in regular files. Or in
one big regular file, like an SQLite database. I already have a tool
that does that (it's not published). The downside of this approach is
that you can't detect file renames, because the checksums are not directly
attached to the files. On the other hand, it allows you to scan for
files that have vanished ...
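Something along these lines, again as a Python sketch rather than the
actual tool (the schema and the database file name are invented for
illustration):

    #!/usr/bin/env python3
    # Sketch of the "one big regular file" variant: checksums live in
    # an SQLite database instead of in extended attributes.

    import hashlib
    import os
    import sqlite3
    import sys

    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def index_tree(db, root):
        seen = set()
        for dirpath, _, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                seen.add(path)
                db.execute(
                    "INSERT OR REPLACE INTO checksums (path, sha256)"
                    " VALUES (?, ?)",
                    (path, sha256_of(path)),
                )
        db.commit()
        # Since all known paths are in one place, we can also report
        # files that have vanished since the last run, which the
        # xattr approach can't do.
        return [
            row[0]
            for row in db.execute("SELECT path FROM checksums")
            if row[0] not in seen
        ]

    if __name__ == "__main__":
        con = sqlite3.connect("checksums.sqlite")
        con.execute(
            "CREATE TABLE IF NOT EXISTS checksums"
            " (path TEXT PRIMARY KEY, sha256 TEXT)"
        )
        for path in index_tree(con, sys.argv[1]):
            print("vanished:", path)

The scan at the end is the nice part: because every known path sits
in one table, vanished files fall out of an indexing run almost for
free.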
Of course, this is a workaround as well. What I want is a filesystem
that handles all this.
____________________
Oh, yes, and there's Git. A lot of my private data that I care about
is in Git. Git has checksums all over the place and that helps a lot.