* * * * *

                              Notes on a program

This is less a polished entry than a set of notes on a project I've been
given. If it seems somewhat random and hard to fathom, that's why.

Smirk is drowning in email. As such, he's looking for a filtering solution
whereby he can run a daily job that scans his email over IMAP (Internet
Message Access Protocol) and shuffles messages to different folders. The
criteria are something like “I've read it and it's older than seven days, move
it to this folder. If it's unread and older than three days, move it to this
other folder. If I've read it and replied to it, move it to yet another
folder.”
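Those three criteria map fairly directly onto IMAP's own SEARCH keys (SEEN,
UNSEEN, ANSWERED and SENTBEFORE, all from RFC 3501), so one option is to let
the server do the selecting. Below is a rough sketch of building the search
strings. It is in Python rather than PHP purely for illustration, and the
folder names and the search_criteria() helper are placeholders of mine, not
anything Smirk specified.

```python
from datetime import date, timedelta

# IMAP wants English month abbreviations regardless of locale,
# so spell them out instead of relying on strftime("%b").
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def imap_date(d):
    """Format a date the way IMAP SEARCH expects it, e.g. 20-Apr-2005."""
    return f"{d.day}-{MONTHS[d.month - 1]}-{d.year}"


def search_criteria(today):
    """Map (hypothetical) destination folders to IMAP SEARCH strings.

    SENTBEFORE compares against the Date: header; BEFORE would use the
    server's arrival date instead. Which one Smirk wants is an assumption.
    """
    week_ago = imap_date(today - timedelta(days=7))
    three_days_ago = imap_date(today - timedelta(days=3))
    return {
        "ReadArchive": f"SEEN SENTBEFORE {week_ago}",
        "Archive": f"UNSEEN SENTBEFORE {three_days_ago}",
        "Replied": "SEEN ANSWERED",
    }


if __name__ == "__main__":
    for folder, criteria in search_criteria(date.today()).items():
        print(folder, "<-", criteria)
```

Note that SENTBEFORE only has day granularity, so "older than seven days"
becomes "sent before the date seven days ago." If finer control is needed,
the dates would have to be checked on the client side instead.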

procmail [1] won't really handle that, as it's meant more for initial
delivery and filtering of email. He also rejected sieve [2] as it apparently
doesn't handle date parsing that well (or something like that). So he asked
me if I could write such a program, preferably using PHP since he knows that
language (and since I hate Perl and PHP equally, it's six of one, half a
dozen of the other; I would prefer C, but that's me).

So, the design of the program. Given some input file describing the filtering
to do on email:

> account imap://alice:[email protected]/
> {
>   mailbox INBOX
>   {
>     foreach message
>     {
>       if (header.subject =~ /[Vv][Ii1][Aa@][Gg][Rr][Aa]/)
>         moveto Trash;
>       if (status = REPLIED) moveto Replied;
>       if (header.date ~ "3 days ago" && status = UNREAD)
>         moveto Archive;
>       if (header.date ~ "7 days ago" && status != UNREAD)
>         moveto ReadArchive;
>     }
>   }
>
>   mailbox Archive
>   {
>     if (messages > 5000)
>       sendmail("Yo!  There are too many messages in the archive!");
>
>     if ((messages > 3000)
>     || (message[1].header.date >~ "6 months ago"))
>       sendmail("Yo!  Check your archive!");
>   }
> }
>

Okay, maybe nothing quite so grandiose, but some file to explain the rule
sets for moving messages from one box to another, run as a job periodically
(a cron job).
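As a first step toward reading such a file, here is a rough sketch of a
tokenizer for the condition lines in the example above. It is in Python
rather than PHP purely for illustration. The token names and the exact
operator set are my guesses from the sample rules, and it deliberately does
not handle the "account imap://..." line, which would need its own rule so
the URL isn't mistaken for a regular expression.

```python
import re

# Token kinds for rule lines such as
#   if (header.date ~ "3 days ago" && status = UNREAD) moveto Archive;
# The names and the operator list are assumptions based on the sample file.
TOKEN_SPEC = [
    ("STRING", r'"[^"]*"'),                    # "3 days ago"
    ("REGEX",  r"/(?:\\.|[^/\\])+/"),          # /[Vv][Ii1].../
    ("NUMBER", r"\d+"),
    ("OP",     r"=~|!=|>~|&&|\|\||[=~<>]"),    # longest operators first
    ("NAME",   r"[A-Za-z_][A-Za-z0-9_]*"),
    ("PUNCT",  r"[(){}\[\].;]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{kind}>{pat})" for kind, pat in TOKEN_SPEC))


def tokenize(text):
    """Split rule text into (kind, value) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize('if (status = REPLIED) moveto Replied;'):
        print(token)
```

A parser would then turn the token stream into whatever internal format the
filtering loop wants. That part depends on the final syntax, so it is left
out here.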

Obligatory PHP Documentation Links

Since Smirk wants this in PHP:

* IMAP, POP3 (Post Office Protocol 3), NNTP (Network News Transfer Protocol)
 Functions [3]: requires an external library with some funky installation
 requirements
* Regular Expression Functions (Perl-Compatible) [4]
* mailparse Functions [5]: considered experimental and no longer bundled
 with PHP
* iconv Functions [6]


We need to retrieve information via IMAP. We need to parse the email headers.
We'll need regular expressions, as well as date processing utilities (“3 days
ago,” “less than 5 hours,” etc.). We'll need to read and parse the rules file
(using whatever syntax I come up with). Oh, and I would like to translate all
the text to some intermediary character set so we can filter consistently [7],
which means using iconv, plus parsing MIME-specific (Multipurpose Internet
Mail Extensions) headers and MIME-encoded headers.
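To get a feel for two of those pieces, here is a rough sketch of decoding a
MIME-encoded header into Unicode and of turning a phrase like “3 days ago”
into an age test. It is in Python rather than PHP purely for illustration
(the real thing would presumably lean on iconv and friends). The helper
names, the accepted phrase grammar, and treating a month as 30 days are all
assumptions of mine.

```python
import re
from datetime import datetime, timedelta, timezone
from email.header import decode_header, make_header
from email.utils import parsedate_to_datetime

# Assumption: a "month" is approximated as 30 days.
UNITS = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
}
AGE_RE = re.compile(r"\s*(\d+)\s+(hour|day|week|month)s?(?:\s+ago)?\s*")


def header_to_text(raw):
    """Decode RFC 2047 encoded-words (=?charset?Q?...?=) to a Unicode string."""
    return str(make_header(decode_header(raw)))


def parse_age(phrase):
    """Turn '3 days ago' or '5 hours' into a timedelta."""
    match = AGE_RE.fullmatch(phrase)
    if match is None:
        raise ValueError(f"cannot parse age phrase: {phrase!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def older_than(date_header, phrase, now=None):
    """True if the message's Date: header is further back than the phrase."""
    sent = parsedate_to_datetime(date_header)
    if sent.tzinfo is None:          # a "-0000" zone yields a naive datetime
        sent = sent.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - sent > parse_age(phrase)


if __name__ == "__main__":
    print(header_to_text("=?ISO-8859-1?Q?Caf=E9?="))
    print(older_than("Mon, 25 Apr 2005 12:00:00 +0000", "3 days ago"))
```

Qualifiers such as “less than” are not handled by this sketch. In the rule
syntax they would more naturally be expressed by the comparison operator
than by the phrase itself.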

So the main program flow for processing each message would look something
like:

> get headers for next message
> convert to consistent character set, probably UTF-8 (Unicode Transformation Format, 8-bit)
> for each rule to check
>       check conditions of rule against message
>       if all conditions apply, apply action
>
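The loop above might be sketched as follows, again in Python rather than
PHP purely for illustration. The Message and Rule shapes are hypothetical.
The mapping of REPLIED to IMAP's \Answered flag and of UNREAD to the absence
of \Seen is my assumption, as is stopping at the first rule that matches
(once a message has been moved, later rules have nothing to act on).

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class Message:
    """Hypothetical, already-decoded view of one message."""
    headers: Dict[str, str]
    flags: Set[str] = field(default_factory=set)   # IMAP flags such as \Seen
    age_days: float = 0.0                           # precomputed from Date:


@dataclass
class Rule:
    conditions: List[Callable[[Message], bool]]
    folder: str


def first_matching_folder(message: Message, rules: List[Rule]) -> Optional[str]:
    """Return the folder of the first rule whose conditions all hold."""
    for rule in rules:
        if all(condition(message) for condition in rule.conditions):
            return rule.folder
    return None


SPAM = re.compile(r"[Vv][Ii1][Aa@][Gg][Rr][Aa]")

# The INBOX rules from the sample file, in the same order.
RULES = [
    Rule([lambda m: SPAM.search(m.headers.get("subject", "")) is not None], "Trash"),
    Rule([lambda m: "\\Answered" in m.flags], "Replied"),
    Rule([lambda m: m.age_days > 3, lambda m: "\\Seen" not in m.flags], "Archive"),
    Rule([lambda m: m.age_days > 7, lambda m: "\\Seen" in m.flags], "ReadArchive"),
]

if __name__ == "__main__":
    msg = Message({"subject": "Lunch?"}, {"\\Seen"}, age_days=8)
    print(first_matching_folder(msg, RULES))
```

Keeping the conditions as plain callables means the rule-file parser only
has to produce a list of Rule objects. The loop itself never needs to know
the file syntax.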

The hardest part appears to be getting a version of PHP with all the required
extensions installed. Next would be defining the input file and parsing that
into some internal format for processing. The rest pretty much just falls
into place.

Most of the time will be spent in building the required version of PHP, and
in playing with the various modules to figure out how they work and what
exactly one gets. I would also need to set up a play IMAP account to test the
program against (there's no way I want to run this on my email account, or on
Smirk's for that matter).

[1] http://www.procmail.org/
[2] http://www.nada.kth.se/datorer/e-post/sieve-at-nada.shtml
[3] http://us3.php.net/manual/en/print/ref.imap.php
[4] http://us3.php.net/manual/en/print/ref.pcre.php
[5] http://us3.php.net/manual/en/print/ref.mailparse.php
[6] http://us3.php.net/manual/en/print/ref.iconv.php
[7] gopher://gopher.conman.org/0Phlog:2005/04/27.2

Email author at [email protected]