I'm in the process of creating tarballs of some old hard drives, each of
them several hundred GB in size. I'd like to have a checksum of each
tarball. The naive approach:
tar cf "$i".tar "$i"
sha1sum "$i".tar >"$i".tar.sha1sum
What's bad about this is that it reads the data twice: once from the
drive to create the tarball, and then the whole tarball again just to
create the checksum file.
Reading 200 GB at 80 MB/s takes about 42 minutes. Ugh.
Can't we do that in one go? Yes, we can:
tar cf - "$i" | tee >(sha1sum | sed "s/ -\$/ $i.tar/" >"$i".tar.sha1sum) >"$i".tar
The call to `sed` is a bit ugly, and the `>(...)` construct is Bash
process substitution, so it won't work in a plain POSIX shell, but it
saves a lot of time.
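
The same single-pass trick also works as a straight pipeline, without the
process substitution, if you let `tee` write the tarball directly. A variant
I haven't actually run on these drives:

tar cf - "$i" | tee "$i".tar | sha1sum | sed "s/ -\$/ $i.tar/" >"$i".tar.sha1sum

Here `tee` copies the tar stream to disk and passes it on to `sha1sum`, so
the data still gets read only once.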
Oh, damn it. Now I realize I also need an index file (a dump of `tar
tvf "$i".tar`). I could have done that by adding another process
substitution, but I've already finished a few drives ...
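
For the remaining drives, something like this should produce the index in
the same pass (an untested sketch; the `.index` suffix is just what I'd
pick for the listing file):

tar cf - "$i" | tee >(sha1sum | sed "s/ -\$/ $i.tar/" >"$i".tar.sha1sum) >(tar tvf - >"$i".tar.index) >"$i".tar

`tee` feeds the tar stream to both process substitutions, one computing the
checksum and one listing the archive contents, while stdout still goes to
the tarball itself.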
Those old drives are from former computers of mine, some dating back to
when I still used Windows. File organization in Windows is really messy:
your stuff is scattered all over `C:` and `D:` and `E:` and what not.
Maybe it's different today, I don't know, but back then, Windows (or
rather, DOS) didn't really encourage you to organize your files in a
meaningful way. That's why I'm simply dumping the entire drives. I don't
know where my stuff is.
On UNIXoid systems, you at least have most of your stuff in `$HOME`. If
you do it right, *all* your stuff is in `$HOME`. (I screwed up a few
times and put important data somewhere else like `/var/www` ... Don't do
that on personal computers.)