On  2017-08-06, I wrote about that incident in our data center. I also
 wrote, "fuck performance, I want data integrity".

 I still think that way. However, my tool that stores  checksums  in
 extended attributes in the file system was a failure, because it was
 too slow.

 The tool itself isn't slow. After you run it, you  have  a  filesystem
 where  *every*  file  has extended attributes. That in itself is not a
 problem, either. Here's  the  thing:  *Other*  tools  might  get  slow
 because of that. rsync, for example.
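
 To illustrate the idea, here is a rough sketch of what such  a  tool
 boils down to. The attribute name "user.checksum" and the details are
 made up for this example; my actual tool differs.

     #!/usr/bin/env python3
     # Sketch: store a SHA-256 checksum of each file's content in an
     # extended attribute and complain when it no longer matches.
     # Uses the Linux-only os.getxattr()/os.setxattr().
     import hashlib
     import os
     import sys

     ATTR = "user.checksum"   # invented name, just for this sketch

     def checksum(path):
         h = hashlib.sha256()
         with open(path, "rb") as fp:
             for block in iter(lambda: fp.read(1 << 20), b""):
                 h.update(block)
         return h.hexdigest().encode()

     def verify_or_store(path):
         current = checksum(path)
         try:
             stored = os.getxattr(path, ATTR)
         except OSError:
             # No checksum yet: record one on the first run.
             os.setxattr(path, ATTR, current)
             return
         if stored != current:
             print("CHECKSUM MISMATCH:", path, file=sys.stderr)

     for root, _dirs, files in os.walk(sys.argv[1]):
         for name in files:
             verify_or_store(os.path.join(root, name))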

 I create my backups using "rsync -zaXAP ...". That "X" copies extended
 attributes and if you have a lot of them, rsync  gets  really,  really
 slow.

 Isn't  that a contradiction? "Fuck performance" vs. "that's too slow"?
 It might seem that  way,  but  remember  that  my  tool  was  a  dirty
 workaround  to  begin  with.  It's  not a clean filesystem design that
 incorporates checksums. It's a hack. And I'm not willing to  sacrifice
 performance for a dirty hack.

 There's  an  alternative:  Store the checksums in regular files. Or in
 one big regular file, like an sqlite database. I already have  a  tool
 that does that (it's not published). The downside of this approach is that
 you can't detect file renames, because the checksums are not  directly
 attached  to  the  files. On the other hand, it allows you to scan for
 files that have vanished ...
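
 A rough sketch of that approach, with an invented schema and database
 file name (again, this is not my actual tool):

     #!/usr/bin/env python3
     # Sketch: keep all checksums in one sqlite database instead of
     # attaching them to the files.
     import hashlib
     import os
     import sqlite3
     import sys

     def checksum(path):
         h = hashlib.sha256()
         with open(path, "rb") as fp:
             for block in iter(lambda: fp.read(1 << 20), b""):
                 h.update(block)
         return h.hexdigest()

     db = sqlite3.connect("checksums.sqlite")
     db.execute("CREATE TABLE IF NOT EXISTS files"
                " (path TEXT PRIMARY KEY, sha256 TEXT)")

     seen = set()
     for root, _dirs, files in os.walk(sys.argv[1]):
         for name in files:
             path = os.path.join(root, name)
             seen.add(path)
             row = db.execute("SELECT sha256 FROM files WHERE path = ?",
                              (path,)).fetchone()
             if row is None:
                 db.execute("INSERT INTO files VALUES (?, ?)",
                            (path, checksum(path)))
             elif row[0] != checksum(path):
                 print("CHECKSUM MISMATCH:", path, file=sys.stderr)

     # Everything lives in one table, so files that have vanished from
     # the filesystem are easy to spot -- the xattr approach can't do
     # that.  A rename, though, just shows up as vanished + new.
     for (path,) in db.execute("SELECT path FROM files"):
         if path not in seen:
             print("VANISHED:", path, file=sys.stderr)

     db.commit()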

 Of course, this is a workaround as well. What I want is  a  filesystem
 that handles all this.


                          ____________________


 Oh,  yes,  and there's Git. A lot of my private data that I care about
 is in Git. Git has checksums all over the place and that helps a lot.

 It does not help in the data center, though.