Subj : FTN Packet File Structure?
To   : Jon Watson
From : mark lewis
Date : Fri Oct 15 2004 08:20 pm

JW> Wicked, thanks Sean. I've stumbled (with a little help from
JW> Mark) onto the FTS and FSC docs but I'm having trouble
JW> figuring out the message structure.

let's see what we can do about that ;)

JW> I understand that everything before the actual message body
JW> is null terminated, and I can parse that out without a
JW> problem. What I think is odd, and therefore assume I'm
JW> misunderstanding, is that the message body appears to start
JW> with the AREA: keyword.

i'm assuming that you've gotten past the PKT header? if not, it is 58 bytes
long... then you have the first message header... packed message headers are 34
bytes long and then variable lengths for the To, From and Subj lines...

so, to read the packed message header, you pump out the first 34 bytes and then
read three times for the null terminated strings... after that, everything till
the terminating null is all message body... everything...
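
a rough sketch of that read in python, in case it helps... the function and
field names are made up for illustration, and the layout of the 34 fixed bytes
(seven 16-bit little-endian words followed by a 20-byte date string) is my
reading of FTS-0001, so check it against the doc before trusting it...

```python
import struct

HEADER_LEN = 34  # fixed part of a packed message header


def read_cstring(buf, pos):
    """Return (bytes up to the next null, offset just past that null)."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end], end + 1


def read_packed_message(buf, pos=0):
    """Parse one packed message starting at pos; return (fields, next_pos)."""
    # assumed layout: type, origNode, destNode, origNet, destNet,
    # attribute, cost (16 bits each), then a 20-byte date/time string
    (msg_type, orig_node, dest_node, orig_net, dest_net,
     attr, cost, date) = struct.unpack_from("<7H20s", buf, pos)
    pos += HEADER_LEN

    # three null terminated strings: To, From, Subj
    to_name, pos = read_cstring(buf, pos)
    from_name, pos = read_cstring(buf, pos)
    subject, pos = read_cstring(buf, pos)

    # everything up to the terminating null is message body
    body, pos = read_cstring(buf, pos)

    fields = {
        "type": msg_type,
        "orig": (orig_net, orig_node),
        "dest": (dest_net, dest_node),
        "attr": attr,
        "cost": cost,
        "date": date.split(b"\x00", 1)[0],
        "to": to_name,
        "from": from_name,
        "subject": subject,
        "body": body,
    }
    return fields, pos
```

pos is wherever the PKT header ends... feed the returned offset back in to read
the next message in the packet...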

JW> What I've seen is that different messages have varying fields
JW> between the AREA: keyword and the beginning of the actual text
JW> of the message. This makes it pretty hard to separate the part
JW> of the message body I want to see, and the part I don't want
JW> to see.

and you can't go by the assumption that all control lines are up at the top of
the message body right after the header block... why? because some control
lines are placed within the message body...

the thing to do is to get past the header and then start reading everything up
to the terminating null character... stuff starting with the CTRL-A (0x01)
character and terminating with the CR or CRLF pair are the control lines that
you are wanting to throw away so do so while reading the message body for the
content...

when you get done skipping over the 0x01...CR|CRLF stuff, you should only have
the message body remaining... you'll also want to be filtering (in some cases)
the ALT-141 (0x8D) character as QWK and some bbs' use that as a line wrap
indicator... and then you have the seenby and path lines that you'll also
probably be throwing away...
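
something along these lines would do the filtering... just a sketch with
made-up names, working on the raw body bytes... dropping a leading AREA: line
and removing 0x8D outright are my assumptions about what you want, and the
0x8D part is off unless you ask for it because that byte is a real letter in
some character sets...

```python
def visible_text(body, strip_soft_cr=False):
    """Return the body bytes with control, AREA: and SEEN-BY lines dropped."""
    # treat CRLF the same as a bare CR
    raw = body.replace(b"\r\n", b"\r")
    if strip_soft_cr:
        # ALT-141 (0x8D) line wrap marker; only safe for some character sets
        raw = raw.replace(b"\x8d", b"")

    kept = []
    for index, line in enumerate(raw.split(b"\r")):
        if line.startswith(b"\x01"):
            continue  # control line, wherever it shows up in the body
        if index == 0 and line.startswith(b"AREA:"):
            continue  # assumption: hide the echomail area tag as well
        if line.startswith(b"SEEN-BY:"):
            continue  # seenby lines carry no CTRL-A in front
        kept.append(line)
    return b"\r".join(kept)
```

path lines normally arrive as 0x01 followed by PATH: so the CTRL-A test
already catches them... tear and origin lines are left alone here...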

it is a good idea to read those control lines and adapt the message body to the
proper character set by looking for and acting upon the CHRS and/or CODEPAGE
control lines to convert from them to the 8859-1 or whatever it is that is the
default in your message stuff...
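
a small sketch of the CHRS side of that... the table mapping CHRS identifiers
to python codec names is my own guess at a useful starting set and not
anything official, the cp437 fallback is likewise an assumption, and CODEPAGE
is not handled at all...

```python
# hypothetical table: CHRS identifier -> python codec name
CODECS = {
    "IBMPC": "cp437",
    "CP437": "cp437",
    "CP850": "cp850",
    "CP866": "cp866",
    "LATIN-1": "latin-1",
    "UTF-8": "utf-8",
}


def body_codec(body, default="cp437"):
    """Pick a codec from the first CHRS control line found in the body."""
    for line in body.replace(b"\r\n", b"\r").split(b"\r"):
        if line.upper().startswith(b"\x01CHRS:"):
            # the control line looks like: ^ACHRS: <identifier> <level>
            parts = line[6:].split()
            if parts:
                name = parts[0].decode("ascii", "replace").upper()
                return CODECS.get(name, default)
    return default


def decode_body(body, default="cp437"):
    """Decode raw body bytes to str, replacing anything undecodable."""
    return body.decode(body_codec(body, default), errors="replace")
```

run this over the raw body before throwing the control lines away, otherwise
the CHRS line is already gone by the time you look for it...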

JW> I *think* I've figured out (need a few more packets to
JW> confirm) that the fields that comprise the part I don't want
JW> to see are separated by a ^M^A token, whereas the message body
JW> starts with a single ^M.

can't rely on that... some stuff is CR (^M) and other stuff is CRLF (^M^J)...
the ^A stands on its own as the start of the control line indicator...

JW> Is this the way it's supposed to be? Seems kinda laisse faire
JW> (no, I can't spell that).

neither can i... hope the above helps... enjoy!

)\/(ark
* Origin: (1:3634/12)