* * * * *

                Sorting through the mess that is my filesystem

A few weeks ago in The Weekly Meeting, Smirk made an offhand comment about
the lack of organization in his email. He doesn't bother, finding it easier
to let the computer search through his copious amounts of email for the
messages he's interested in (and once a year, the accumulated email gets
dumped into the archives). What struck me about the comment is the search
bit. Google [1] made a business around searching: first web pages, then
email [2], then the Google Toolbar [3], for searching your files locally.

That was an interesting concept. So I hacked together some code to fully
index all my files. Not the contents, no, that's a bit too much to handle.
No, what I indexed was information about the files: the names, sizes,
timestamps (creation time included), file types, all the bits about a file.
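
For the curious, here is a minimal sketch of that kind of metadata-only
index, written in Python. It is not the program described here; the names
FileRecord and index_tree are made up for illustration, and guessing the
file type from the file name via the standard mimetypes module is an
assumption about how "file type" might be determined. Note that on Unix,
st_ctime is the inode change time rather than a true creation time.

```python
import mimetypes
import os
import stat
from typing import Iterator, NamedTuple, Optional


class FileRecord(NamedTuple):
    """Hypothetical record holding metadata about one file."""
    path: str
    size: int             # size in bytes
    mtime: float          # last modification, seconds since the epoch
    ctime: float          # inode change time on Unix, creation time on Windows
    kind: Optional[str]   # MIME type guessed from the name alone, or None


def index_tree(root: str) -> Iterator[FileRecord]:
    """Yield one FileRecord per regular file under root.

    File contents are never opened; only directory entries and
    lstat() results are consulted.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)   # lstat: do not follow symlinks
            except OSError:
                continue              # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue              # skip symlinks, sockets, devices, ...
            kind, _encoding = mimetypes.guess_type(name)
            yield FileRecord(path, st.st_size, st.st_mtime, st.st_ctime, kind)
```

Something like `records = list(index_tree(os.path.expanduser("~")))` would
then build the whole index in memory, which is plenty for a few hundred
thousand files.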

It's amazing what I've found. I have 338,516 files (and that's not counting
the stuff making up the operating system—that's personal files I'm talking
about). The mean [4] file size is 104,654 bytes, but the median [5] size is
only 3,864 bytes, which tells me a few huge files are skewing the average.
Said 338,516 files are stored in 26,750 directories (or “folders” for you
Windows users out there). 55% of the files (215,000) are text files of some
sort; 86,100 are images. And all these files and directories consume 45G
(gigabytes) of disk space.
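
To see why a mean far above the median points to a few huge files, here is
a small Python sketch using the standard statistics module. The summarize
function and the sample sizes are invented for illustration; they are not
the figures above.

```python
import statistics
from typing import Dict, Iterable


def summarize(sizes: Iterable[int]) -> Dict[str, float]:
    """Return count, total, mean and median for a collection of file sizes."""
    sizes = list(sizes)
    return {
        "count": len(sizes),
        "total": sum(sizes),
        "mean": statistics.mean(sizes),
        "median": statistics.median(sizes),
    }


# Four small files and one large one (made-up numbers, in bytes).
sample = [1_000, 2_000, 3_000, 4_000, 1_000_000]
stats = summarize(sample)

# The single large file drags the mean up to 202,000 bytes,
# while the median stays at 3,000 bytes.
print(f"mean   {stats['mean']:,.0f}")
print(f"median {stats['median']:,.0f}")
```

The median only cares about the middle of the sorted list, so one
oversized file moves it hardly at all, while the mean has to absorb every
byte.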

Okay, so maybe it's only interesting to me.

But I showed the program to Smirk and P today at The Weekly Meeting. Smirk
saw the value in the program (even as clunky as it stands right now) and,
about an hour after the meeting, called me with a commercial application in
mind, based on this idea.

Not bad for something I hacked together on a whim.

[1] http://www.google.com/
[2] http://www.gmail.com/
[3] http://toolbar.google.com/
[4] http://en.wikipedia.org/wiki/Mean
[5] http://en.wikipedia.org/wiki/Median

Email author at [email protected]