Newsgroups: alt.image.medical,comp.protocols.dicom,sci.data.formats,alt.answers,comp.answers,sci.answers,news.answers
Message-ID: <[email protected]>
Expires: 21 Jan 2004 00:00:00 GMT
Subject: Medical Image Format FAQ, Part 6/8
From: [email protected] (David A. Clunie)
Followup-To: alt.image.medical
Reply-To: [email protected] (David A. Clunie)
Approved: [email protected]
Summary: This posting contains answers to the most Frequently Asked
        Question on alt.image.medical - how do I convert from image
        format X from vendor Y to something I can use ? In addition
        it contains information about various standard formats.
Lines: 811
Date: Sun, 21 Dec 2003 14:16:50 GMT

Archive-name: medical-image-faq/part6
Posting-Frequency: monthly
Last-modified: Sun Dec 21 09:16:50 EST 2003
Version: 4.26

4.  Host Machines

   4.1 Data General

       4.1.1 Data General Data

             4.1.1.1 Data General Integers

                     Integers are 16 bit two's complement and stored in
                     big-endian format as on Sun Sparc and opposite to the Dec
                     VAX.
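
                     As a rough sketch (the function name is hypothetical and
                     the two bytes are assumed to be in a buffer already),
                     such an integer can be read on any host by assembling it
                     with shifts rather than copying it into a short:

```cpp
#include <cstdint>

// Hypothetical helper: assemble a 16 bit two's complement integer from two
// bytes stored most significant byte first (Data General and Sun order).
// Shifts are used so the result does not depend on the host's byte order.
std::int16_t readBigEndianInt16(const unsigned char *p)
{
    std::uint16_t u = static_cast<std::uint16_t>((p[0] << 8) | p[1]);
    // Map the unsigned pattern onto the signed range explicitly.
    return (u & 0x8000u)
        ? static_cast<std::int16_t>(static_cast<int>(u) - 0x10000)
        : static_cast<std::int16_t>(u);
}
```

                     For Vax data the same idea applies with the two bytes
                     swapped.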

             4.1.1.2 Data General Floating Point

                     Single precision real values are 32 bits long, in
                     big-endian format.  The high bit is the sign bit, followed
                     by a 7 bit excess 64 exponent (power to which 16 must be
                     raised) then a 24 bit hexadecimally normalized mantissa
                     with the radix point to the left of the most significant
                     bit.  Double precision values just have another 32 bits
                     tacked on the mantissa and the same exponent format.


           Sign
          |<-->|<------ Exponent ------>|<--------- Mantissa -------->|
           ______________ ______________ ______________ ______________
          |              |              |              |              |
          |______________|______________|______________|______________|
           31          28 27          24 23          20 19          16
          |<----------------------- Mantissa ------------------------>|
           ______________ ______________ ______________ ______________
          |              |              |              |              |
          |______________|______________|______________|______________|
           15          12 11           8 7            4 3             0



                     Here is a little piece of C++ code that should run on
                     anything and convert Data General floats to whatever the
                     host's floating point format is.


               double value;
               unsigned char sign;
               Uint16 exponent;
               Uint32 mantissa;

               unsigned char buffer[4];
               instream.read((char *)buffer,4);
               if (instream) {
                       // DataGeneral is a Big Endian machine, so buffer[0]
                       // holds the sign and exponent.  Shifts and masks are
                       // used rather than a bit-field struct, because
                       // bit-field layout and byte order depend on the host.
                       sign     = buffer[0] >> 7;
                       exponent = buffer[0] & 0x7f;
                       mantissa = ((Uint32)buffer[1] << 16)
                                | ((Uint32)buffer[2] << 8)
                                |  (Uint32)buffer[3];

                       value = (double) mantissa / (1 << 24) *
                               pow (16.0, (long)(exponent) - 64);
                       value = (sign == 0) ?  value : -value;
               } else {
                       cerr << "read failed\n" << flush;
                       value=0;
               }
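
                     The double precision case can be handled the same way.
                     The following is only a sketch based on the description
                     above (same sign and exponent, 56 bit mantissa); the
                     function name is hypothetical and it has not been run
                     against real Data General data.  A host IEEE double has
                     only 53 significant bits, so the lowest mantissa bits
                     may be rounded away.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: convert a big-endian Data General double (8 bytes)
// to the host's double.  Layout assumed from the text: 1 sign bit, 7 bit
// excess 64 exponent (power of 16), then a 56 bit mantissa.
double dgDoubleToHost(const unsigned char *b)
{
    int sign     = b[0] >> 7;
    int exponent = b[0] & 0x7f;

    // Upper 24 and lower 32 mantissa bits, assembled with shifts.
    std::uint32_t hi = (std::uint32_t(b[1]) << 16)
                     | (std::uint32_t(b[2]) << 8)
                     |  std::uint32_t(b[3]);
    std::uint32_t lo = (std::uint32_t(b[4]) << 24)
                     | (std::uint32_t(b[5]) << 16)
                     | (std::uint32_t(b[6]) << 8)
                     |  std::uint32_t(b[7]);

    // mantissa / 2^56, computed in two pieces.
    double fraction = std::ldexp(double(hi), -24) + std::ldexp(double(lo), -56);

    // 16^n is the same as 2^(4n).
    double value = std::ldexp(fraction, 4 * (exponent - 64));
    return sign ? -value : value;
}
```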

       4.1.2 Data General Operating System

             4.1.2.1 Data General RDOS

                     Used on the GE CT 9800 family.  Severely primitive, but
                     then it is running on an old machine that can only map
                     64Kb of memory at a time.  It is apparently
                     multitasking.  Documentation may still be available from
                     Data General (try DG Direct) but is not supplied with the
                     scanner by GE.  If anyone knows where I can find it at a
                     reasonable price let me know.  Here is a brief command
                     summary culled from a nifty pocket book from GE for
                     SunOS/Genesis users that compares commands:


                CHATR   - file attributes
                CRAND   - create randomly organized file
                CDIR    - create directory
                DELETE  - files or directories
                DIR     - change directory
                DISK    - free space
                FILCOM  - compare files
                GDIR    - show working directory name
                GTOD    - show date and time
                LINK    - files (symbolic)
                LIST    - directory contents
                MOVE    - a file
                RENAME  - a file
                SDAT    - set date
                STOD    - set time
                SDUMP   - write files to a device
                SLOAD   - read dumped files
                SPEED   - text editor
                TYPE    - contents of file
                XFER    - copy a file

                wildcards: '-' is series, '*' is single character

             4.1.2.2 Data General AOS/VS

                     Used on the GE Signa 3X and 4X family.  Quite a nice
                     operating system with multi-tasking and hierarchical
                     directories.  Here is a brief command summary again culled
                     from a nifty pocket book from GE for SunOS/Genesis users
                     that compares commands:


                ACL         - access control list (ownership)
                BYE         - exit command process
                COPY        - a file
                CREATE      - a text file
                CREATE/DIR  - a directory
                CREATE/LINK - link files
                DELETE      - files & directories
                DIR         - display or change working directory
                DUMP        - to peripheral
                F/AS/S      - directory listing with file status
                DATE        - show or set
                HELP
                LOAD        - DUMPed files
                MOVE        - a file
                RENAME      - a file
                PATH        - show pathname of a file
                PAUSE       - the command line interpreter
                SUPERU ON   - enable superuser
                SED         - text editor
                TIME        - show or set
                TYPE        - contents of text file
                ?           - list processes running

                wildcards: '+' is series, '*' is single character


                     Other useful hints include the use of "^" to refer to the
                     next directory up (like ".." in Unix) in DIR commands.
                     Command options follow the command name without any spaces
                     and are indicated by a slash.  COPY operations specify the
                     destination name first and then the source name.  Devices
                     like the mag tape are indicated by "@", for example
                     "@MTB0" is tape drive zero.  Files on the tape can be
                     referred to as "@MTB0:nn" which is very handy.  For
                     example to read a file off a CT 9800 tape under AOS/VS:


               COPY/V/IMTRSIZE=8192 B038040101.YP @MTB0:18


                     Perhaps most importantly, there is an extensive online
                     help system ...  use the HELP command.

       4.1.3 Data General Network

             If you have a GE Signa based on a DG then you can get the
             so-called "High Speed Network" card and software from GE.  From
             memory it is pretty pricey, and there used to be a "slower"
             network interface that was cheaper, but I don't think this is
             available anymore.


             If you have a CT 9800 based on the DG S/140 and you need to get it
             connected there are a number of solutions:


             - Talk to GE about their ID/LINK II product ...  I gather this is
             a device that hooks into the SCSI cable to the hard drive (you
             need one of the Ace drives not the old Zebra drive), monitors disk
             activity and snatches pieces of the conversation to make a copy of
             the image data, stores it and makes it available via some network
             protocol.  Sound crazy ?  Perhaps, but they tell me it works and
             the price is reasonable, at least for something from GE anyway.
             Get them to throw one in next time you buy something big.

             - The do-it-yourself approach.  Talk to John Clayton
             ([email protected]) at Claflin and Clayton.  They supply a complete
             R-level solution by providing Ethernet hardware and TCP/IP
             software for 16 bit DG OS including AOS and RDOS (specifically
             including the GE CT version of RDOS).  He tells me "you can expect
             a file transfer rate of 25 kbytes/s for S/140 systems".  The
             package consists of:


             $2,850 - EC-10 ethernet controller
             $1,645 - RDOS TCP/IP software (telnet client,ftp client/server)


             I have not personally tried either of these approaches, and I am
             sure there are others (talk to Merge or DeJarnette), but I am
             getting really tired of carrying 9-track tapes around so perhaps I
             will bite the bullet soon (and upgrade to a HighSpeed Advantage
             !).

   4.2 Vax

       4.2.1 Vax Data

             4.2.1.1 Vax Integers

                     - little endian
                     - 8, 16, or 32 bits

             4.2.1.2 Vax Floating Point

                     - little endian

                     - D_float

                       - 8 bytes
                       - sign bit 15
                       - exponent bits 14-7 excess 128 binary
                       - fraction, MSB first, bits 6-0, 31-16, 47-32, 63-48
                       - normalized bit is not represented (hidden)


                     - G_float

                       - 8 bytes
                       - like D, but

                         - exponent is bits 14-4 excess 1024
                         - fraction 3-0 and 63-16


                     - F_float

                       - 4 bytes
                       - like D, smaller fraction


                     - H_float

                       - 16 bytes
                       - like G, but

                         - exponent is bits 14-0 excess 16384
                         - fraction is bits 127-16
                         - same weird order
                         - bit 112 least significant
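
                     As an illustration of the odd fraction order, here is a
                     sketch of decoding the simplest case, F_float, on any
                     host.  The function name is hypothetical.  It assumes
                     the four bytes are exactly as they sit in Vax memory,
                     and that an exponent field of zero can be treated as
                     0.0 (strictly, with the sign bit set that pattern is a
                     reserved operand rather than a number).

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: convert a 4 byte VAX F_float, as stored in memory
// on the Vax, to the host's double.
double vaxFFloatToHost(const unsigned char *b)
{
    // Bits 15-0 are the first little-endian 16 bit word, bits 31-16 the second.
    std::uint16_t w0 = std::uint16_t(b[0] | (b[1] << 8));
    std::uint16_t w1 = std::uint16_t(b[2] | (b[3] << 8));

    int sign     = (w0 >> 15) & 0x1;          // bit 15
    int exponent = (w0 >> 7) & 0xff;          // bits 14-7, excess 128

    // Fraction, MSB first: bits 6-0 then bits 31-16 (23 bits in all).
    std::uint32_t fraction = (std::uint32_t(w0 & 0x7f) << 16) | w1;

    // Simplification: zero exponent is returned as 0.0 whatever the sign.
    if (exponent == 0) return 0.0;

    // Restore the hidden bit: the mantissa is 0.1fff... binary, in [0.5, 1).
    double mantissa = std::ldexp(double(fraction | 0x800000u), -24);
    double value    = std::ldexp(mantissa, exponent - 128);
    return sign ? -value : value;
}
```

                     D_float extends the fraction with bits 47-32 and 63-48
                     in the same word-by-word manner.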



             4.2.1.3 Vax Strings

                     - 16 bits of length
                     - byte of type
                     - byte of class
                     - 32 bits of pointer
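
                     A sketch of picking such a descriptor apart from raw
                     bytes follows.  The struct and function names are
                     hypothetical, and the fields are assumed to be stored in
                     the order listed, little endian like everything else on
                     the Vax.

```cpp
#include <cstdint>

// Hypothetical host-side view of the 8 byte string descriptor.
struct VaxStringDescriptor {
    std::uint16_t length;   // 16 bits of length
    std::uint8_t  type;     // byte of type
    std::uint8_t  klass;    // byte of class ("class" is a C++ keyword)
    std::uint32_t pointer;  // 32 bits of pointer (an address on the Vax)
};

// Decode from raw little-endian bytes, independent of host byte order.
VaxStringDescriptor parseVaxStringDescriptor(const unsigned char *b)
{
    VaxStringDescriptor d;
    d.length  = std::uint16_t(b[0] | (b[1] << 8));
    d.type    = b[2];
    d.klass   = b[3];
    d.pointer =  std::uint32_t(b[4])
              | (std::uint32_t(b[5]) << 8)
              | (std::uint32_t(b[6]) << 16)
              | (std::uint32_t(b[7]) << 24);
    return d;
}
```

                     The pointer is only meaningful in the address space of
                     the Vax process that wrote it.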

       4.2.2 Vax Operating System

             4.2.2.1 Vax VMS (See also Vax VMS Tools)

                     Truly one of the world's most irritating operating
                     systems to use, especially if you are a unix fan.  Still
                     it works, has a great online help system that saves one's
                     butt almost often enough to be useful, and if you can
                     remember the directory where kermit is stored and the
                     weird command to invoke it one can get by (barely).


                     If you don't know VMS and the vendor doesn't supply the
                     manuals, get them from DEC ...  you need them bad ...
                     real bad.  If (like me) you throw them out every time you
                     move and then encounter another piece of archaic equipment,
                     you need the "vaxbook" which is available via ftp from
                     decoy.uoregon.edu, written by Joseph E St Sauver, which
                     summarizes commands, files and all sorts of application
                     specific stuff, though it is no substitute for the real
                     thing.


                     Recent VMS update: goddamn file formats !  Why can't VMS
                     behave like a real operating system and forget this file
                     format crap !  I have some Philips S5 MR images exported
                     in ACR/NEMA format and I can't get the things off the
                     host's Vax using Kermit, because though they have fixed
                     length 512 byte records, some cretinous program sets the
                     "carriage return carriage control" record attributes,
                     which causes kermit to send with all the '0A' characters
                     scrubbed out amongst other atrocities.


                     I am getting desperate and about to try using the
                     Hex/Dehex utility that came with Kermit to get the stuff
                     off and then decode the hex format !  Or perhaps even use
                     "dump" to make a textfile, transfer, and decipher that.
                     (No I don't have a C compiler for the Vax so I guess I
                     can't use uuencode unless someone wants to mail me a
                     hex'ed executable).  Any hints, or instructions as to how
                     to use FDL and Convert, to change it to a normal format
                     would be appreciated.  (Why can't they just have a "set
                     file record attribute xxx" command like all the other
                     millions of set commands ?  Grrrr.).


                     More recent VMS update: finally had an inspiration while
                     staring at hex dumps of these files - why not use the VMS
                     "DUMP" utility which produces hex dumps as a "poor man's
                     uuencode" by saving the dump to a file, transferring it as
                     an ascii file, and then decoding it at the destination ?
                     Of course there are no nifty line checksums or anything,
                     but a transfer protocol such as kermit takes care of this.


                     The DUMP output defaults to 8 32 bit long words separated
                     by a space per line displayed as hex, then an ascii string
                     (32 bytes) and then a 24 bit word hex address offset from
                     the start of the fixed length record.  All the data
                     containing lines start with a single space, whereas
                     descriptions at the start of each record begin in the
                     first column, hence the data lines can be easily selected
                     out.  By the way, the hex version of the data is listed in
                     reverse order !  VMS is so bizarre !  For example, here is
                     a fixed length 512 byte record file from a Philips S5 MRI
                     (some of the hex words elided to make the line fit on the
                     page):


Dump of file SYS$SYSROOT:[GYROSCAN]ABAALKHAIL02010201010001.ANI;1 ...
File ID (2419,301,0) End of file block 198 / Allocated 200

Virtual block number 1 (00000001), 512 (0200) bytes

 0000000C 00100008 ...  00000008 ..............................  000000
 00083932 2E36302E ...  2D524341 ACR-NEMA 1.0..  .....1994.06.29..  000020
 00600008 4D5F4553 ...  00000030 [email protected]_M..`.  000040
 494B0000 00100080 ...  00000002 ....MR..p.....Philips ........KI 000060

 00183148 00000002 ...  32200000 ..  2........63865375........H1..  0001E0
^L
Dump of file SYS$SYSROOT:[GYROSCAN]ABAALKHAIL02010201010001.ANI;1 ...
File ID (2419,301,0) End of file block 198 / Allocated 200

Virtual block number 2 (00000002), 512 (0200) bytes

 40000018 45424F52 ...  00161250 P.....AGACQ_PT_SURFACE_PROBE...@ 000000

                     And so on ...  you get the idea.  This ugly little C++
                     utility written quickly during this moment of inspiration
                     will take saved DUMP output and make it binary again:


#include <iostream>

// MainCmd.h and CCOMMAND are not standard C++ and are presumably part of
// the author's own library; remove both lines to build this stand-alone.
#include "MainCmd.h"

using namespace std;

// Convert one hex digit to its value, or -1 if it is not a hex digit.
signed char hextobin(char c) {
       signed char r;
       switch (c) {
               case '0': r=0; break;
               case '1': r=1; break;
               case '2': r=2; break;
               case '3': r=3; break;
               case '4': r=4; break;
               case '5': r=5; break;
               case '6': r=6; break;
               case '7': r=7; break;
               case '8': r=8; break;
               case '9': r=9; break;
               case 'A': case 'a': r=0xa; break;
               case 'B': case 'b': r=0xb; break;
               case 'C': case 'c': r=0xc; break;
               case 'D': case 'd': r=0xd; break;
               case 'E': case 'e': r=0xe; break;
               case 'F': case 'f': r=0xf; break;
               default: r=-1; break;
       }
       return r;
}

int main(int argc,char **argv) {
       CCOMMAND(argc,argv);

       while (1) {
               const int linemax=132; // only needs 113
               char line[linemax];
               cin.getline(line,linemax);
               if (!cin || cin.eof()) {
                       // cerr << "Bad or eof\n" << flush;
                       break;
               }
               // gcount() includes the newline that getline() discarded
               unsigned count=cin.gcount();
               // data lines start with a space; skip headers and blanks
               if (count == 0 || line[0] != ' ') continue;
               if (count != 113) {
                       cerr << "Line length " << count << "\n" << flush;
                       break;
               }
               unsigned i;
               // point just past the 8th long word (8 * " XXXXXXXX")
               char *ptr = line + 8*(1+8);
               // line is in reverse order ...
               for (i=0; i<8; ++i) {
                       unsigned j;
                       for (j=0; j<4; ++j) {
                               // 2 hex digits -> 1 byte
                               char bytelo = *--ptr;
                               char bytehi = *--ptr;
                               unsigned char byte
                                       = (hextobin(bytehi)<<4)
                                         + hextobin(bytelo);
                               cout.put(byte);
                       }
                       --ptr; // space between long words
               }
       }
       return 0;
}


                     Note that the nature of fixed length records under VMS
                     means that the last record will be padded out to 512 bytes
                     without any indication of the "real" end-of-file.  This
                     means you have to cope with trailing garbage gracefully.


                     Hot VMS/Philips news: [email protected] (Peter
                     Neelin) tells me there is an extremely useful tool for
                     fiddling binary files called FILE from DECUS.  It allows
                     you to change a file's header information without
                     modifying the content of the file.  This then permits ftp,
                     kermit, etc.  to do the right thing with Philips .ANI
                     files.  It also permits wildcards and does not make a copy
                     of the file (so it is fast).  He says also that someone
                     has told him that they succeeded in using convert to fix
                     these files, but his general experience with it is not
                     positive (it will often change the content of the file and
                     it doesn't allow wildcards, in addition to promoting the
                     use of the horrible fdl editor!).  If you are interested,
                     you can get FILE through gopher from decus.org (look for
                     the DECUS software library archives, under essential
                     tools).  The binary is provided in case you don't have a
                     compiler.  FILE, and many other useful things are also
                     available from the sites listed in Vax VMS Tools.


                     Some other useful hints:

                     - To log onto a serial terminal without executing the
                     login command file add "/NOCOM" to the username ...  this
                     way you can use the operator console login which often
                     won't require a password.

                     - There is a kermit available for the Vax under VMS (file
                     prefix "vms" in area or tape b) ...  I use the "obsolete"
                     version written in Bliss, because it comes from the
                     archives at columbia with a hex encoded executable which
                     can be uploaded just using an ordinary text capture into a
                     file.  The short Macro hex program can be uploaded the
                     same way, then assembled and used to convert the hex file
                     into the real executable.  Look in places like [SYSEXE]
                     first though to be sure Kermit is not already there.  The
                     generic C version of kermit runs under VMS (file prefix
                     "ck" in area or tape f), but not every imaging machine
                     comes with a VMS C compiler, whereas Macro is always
                     supposed to be there I gather.  There is however also a
                     hex encoded executable of the C version in the archives
                     (ckvker.hex), which I haven't tried; it is the one
                     recommended in the kermit documentation.

                     - There is apparently a zmodem for VMS but I don't know
                     where it comes from or how to get it.

                     - Serial ports are almost always defaulted to 9600 baud.

                     - "SET TERMINAL/ECHO" often isn't set.

                     - Vax/VMS ftp conventions:


                       UNIX FTP server         Vax/VMS FTP server

                       cd dir                  cd [.dir]
                       cd dir/subdir           cd [.dir.subdir]
                       cd ..                   cd [-]
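
                     If you are scripting transfers, the mapping can be
                     sketched as below.  The function name is hypothetical,
                     and only the three cases listed here are covered
                     (relative directories and a lone ".."); anything else
                     about VMS file specifications is out of scope.

```cpp
#include <string>

// Hypothetical helper: rewrite a relative Unix style directory argument in
// the bracketed form a Vax/VMS FTP server expects.
std::string unixDirToVms(const std::string &dir)
{
    if (dir == "..") return "[-]";           // up one level

    std::string out = "[";
    std::string::size_type start = 0;
    while (start <= dir.size()) {
        std::string::size_type slash = dir.find('/', start);
        if (slash == std::string::npos) slash = dir.size();
        if (slash > start) {                 // skip empty components
            out += '.';
            out += dir.substr(start, slash - start);
        }
        start = slash + 1;
    }
    out += ']';
    return out;
}
```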

             4.2.2.2 ULTRIX

             4.2.2.3 OSF

   4.3 Sun - Sun3 68000 and Sun4 Sparc

       4.3.1 Sun Data

             The sun3 and sun4 architectures use much the same formats.  Even
             though the processors are different both are big-endian and the
             float formats are IEEE.  See the Sparc Architecture Manual -
             Chapter 3 - Data Formats for more details.


             One very important difference though, is that the sun3 convention
             is not to align 32 bit and 64 bit data types on 4 and 8 byte
             boundaries respectively, whereas the sparc (sun4) architecture
             usually does, as dictated by a compile time option.  Be very
             careful when using the same header files on both architectures.
             This drove me nuts when trying to figure out why the well
             described Genesis (sun3) layout did not match the unknown
             Advantage Windows (sun4) data.  It was pretty obvious when it was
             pointed out though :).
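
             To see how much the two conventions can diverge, the following
             sketch computes field offsets for a header under a given
             maximum alignment.  The function name is hypothetical, and
             modelling sun3 as "never align to more than 2 bytes" is an
             assumption for illustration, not something taken from a
             compiler manual.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: compute the byte offset of each scalar field in a
// struct, given each field's size and the largest alignment enforced.
// maxAlign = 2 is an assumed model of the sun3 convention; maxAlign = 8
// models the usual sparc (sun4) convention.
std::vector<std::size_t> fieldOffsets(const std::vector<std::size_t> &sizes,
                                      std::size_t maxAlign)
{
    std::vector<std::size_t> offsets;
    std::size_t offset = 0;
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        // A scalar is aligned to its own size, capped at maxAlign.
        std::size_t align = sizes[i] < maxAlign ? sizes[i] : maxAlign;
        if (align == 0) align = 1;
        offset = (offset + align - 1) / align * align;  // round up
        offsets.push_back(offset);
        offset += sizes[i];
    }
    return offsets;
}
```

             For fields of 2, 4, 2 and 8 bytes this gives offsets 0, 2, 6, 8
             with a maximum alignment of 2, but 0, 4, 8, 16 with a maximum
             of 8, so the same header file describes two different layouts.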

             4.3.1.1 Sun Integers

                     Integers are 8, 16, 32, or 64 bit unsigned or signed two's
                     complement and stored in big-endian format as on Data
                     General and opposite to the Dec VAX.  Most C compilers
                     treat short as 16 bits, and int and long as 32 bits.

             4.3.1.2 Sun Floating Point

                     Formats conform to the IEEE 754-1985 Standard for Binary
                     Floating-Point Arithmetic.  Single precision real values
                     are 32 bits long, in big-endian format.  The high bit is
                     the sign bit, followed by an 8 bit excess 127 exponent
                     (power to which 2 must be raised) then a 23 bit normalized
                     mantissa with the binary point to the left of the most
                     significant bit, from which 1.0 has been subtracted.
                     Double precision values have an 11 bit excess 1023
                     exponent and a 52 bit mantissa.  Quad precision values
                     have a 15 bit excess 16383 exponent and a 112 bit
                     mantissa.


           Sign
          |<-->|<-------- Exponent -------->|<------- Mantissa ------>|
           ______________ ______________ ______________ ______________
          |              |              |              |              |
          |______________|______________|______________|______________|
           31          28 27          24 23          20 19          16
          |<----------------------- Mantissa ------------------------>|
           ______________ ______________ ______________ ______________
          |              |              |              |              |
          |______________|______________|______________|______________|
           15          12 11           8 7            4 3             0



                     Here is a little piece of C++ code that should run on
                     anything and convert Sun IEEE floats to whatever the
                     host's floating point format is.  It probably should take
                     into account a few special cases to be strictly correct:


               unsigned char buffer[4];
               instream.read((char *)buffer,4);
               if (instream) {
#ifdef USESUN4NATIVEFLOAT
                       // only valid when the host itself uses big-endian
                       // IEEE floats
                       float fvalue;
                       memcpy ((char *)(&fvalue),buffer,4);
                       value=fvalue;
#else /* USESUN4NATIVEFLOAT */
                       unsigned char sign;
                       Uint16 exponent;
                       Uint32 mantissa;

                       // Sparc is a Big Endian machine, so buffer[0] holds
                       // the sign and the top 7 bits of the exponent.  Shifts
                       // and masks are used rather than a bit-field struct,
                       // because bit-field layout depends on the host.
                       sign     = buffer[0] >> 7;
                       exponent = ((buffer[0] & 0x7f) << 1) | (buffer[1] >> 7);
                       mantissa = ((Uint32)(buffer[1] & 0x7f) << 16)
                                | ((Uint32)buffer[2] << 8)
                                |  (Uint32)buffer[3];

                       if (exponent) {
                               value = (1.0 + (double)mantissa / (1 << 23)) *
                                       pow (2.0, (long)(exponent) - 127);
                       } else {
                               if (mantissa) {
                                       value = (double)mantissa / (1 << 23) *
                                               pow (2.0, (long)(-126));
                               } else {
                                       value=0;
                               }
                       }
                       value = (sign == 0) ?  value : -value;
#endif /* USESUN4NATIVEFLOAT */
               } else {
                       cerr << "read failed\n" << flush;
                       value=0;
               }
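
                     The special cases in question are the exponent values
                     of all ones, which IEEE 754 uses for infinity (zero
                     mantissa) and NaN (non-zero mantissa).  A self-contained
                     sketch that handles them might look like this; the
                     function name is hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper: convert a big-endian IEEE 754 single (4 bytes) to
// the host's double, including infinity, NaN, zero and denormals.
double ieeeSingleToHost(const unsigned char *b)
{
    int sign     = b[0] >> 7;
    int exponent = ((b[0] & 0x7f) << 1) | (b[1] >> 7);
    std::uint32_t mantissa = (std::uint32_t(b[1] & 0x7f) << 16)
                           | (std::uint32_t(b[2]) << 8)
                           |  std::uint32_t(b[3]);
    double value;
    if (exponent == 255) {
        // All ones exponent: infinity or NaN.
        value = mantissa ? std::numeric_limits<double>::quiet_NaN()
                         : std::numeric_limits<double>::infinity();
    } else if (exponent == 0) {
        // Zero or denormal: no hidden bit, fixed exponent of -126.
        value = std::ldexp(double(mantissa), -23 - 126);
    } else {
        // Normal number: restore the hidden 1.0.
        value = std::ldexp(1.0 + std::ldexp(double(mantissa), -23),
                           exponent - 127);
    }
    return sign ? -value : value;
}
```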

             4.3.1.3 Sun Strings

                     Strings obey the usual C convention of null terminated
                     strings without a length preamble.


       4.3.2 Sun Operating System

5.  Compression Schemes

   5.1 Reversible Compression

   5.2 Irreversible Compression

       5.2.1 Perimeter Encoding

   5.3 DICOM Compression

       In DICOM, compression (both reversible and irreversible) is achieved by
               specifying a particular "transfer syntax" either during
               negotiation of the network connection (association) or in the
               media application profile for files stored on media (and
               specified in the meta information header so the reader knows
               which transfer syntax to switch to).


       The compressed data stream is actually encoded as an "encapsulated" data
               stream as defined in Part 5 of DICOM.  Uncompressed data
               (unencapsulated) is sent in DICOM as a series of raw bytes or
               words (little or big endian) in the Value field of the Pixel
               Data element (7FE0,0010).  Encapsulated data on the other hand
               is sent not as raw bytes or words but as Fragments contained in
               Items that are the Value field of Pixel Data.  The encoding of
               these Items follows the same pattern as is used to specify
               Sequences in DICOM, though the VR (Value Representation) field of
               the Pixel Data is OB not SQ.


       The encapsulated compressed data may be a single frame or it may contain
               multiple frames for those SOP Classes that allow multiframe
               images (such as XA, XRF, US and NM).  The rules in part 5
               further specify that the first Item will either be empty or
               contain a list of offsets to the beginning of the Item
               containing each frame (or the only frame for a single frame
               image).  A frame may be split into multiple fragments, but a
               fragment may not span different frames.  The reason for the
               fragments in the first place is that each fragment (each item)
               must have a fixed, known length, and unless one buffers the
               entire compressed frame before encoding it, one doesn't know in
               advance how long the frame will be.  In practice, most encoders
               do send one frame per fragment but all decoders must be prepared
               to handle the case where a frame spans fragments.  Furthermore,
               all fragments have to be of even length, and there are padding
               rules in Part 5 for the last fragment of a frame (that are
               consistent with the definition of padding in the JPEG standard).


       Part 5 contains several examples of how to fill in the various fields in
               Items of the encapsulated sequence-like value for Pixel Data, so
               these will not be repeated here.  However the overall strategy
               looks something like this for an image with two frames, the first
               split across two fragments, and an empty offset table:


               (7FE0,0010) VR=OB VL=FFFFFFFF Pixel Data
               (FFFE,E000) VR=   VL=00000000 Item (empty offset table, hence
                                             zero length)
               (FFFE,E000) VR=   VL=000004C6 Item (first fragment of first frame)
               ....  compressed byte stream here (4C6 bytes)
               (FFFE,E000) VR=   VL=0000024A Item (second fragment of first frame)
               ....  compressed byte stream here (24A bytes)
               (FFFE,E000) VR=   VL=00000628 Item (first fragment of second frame)
               ....  compressed byte stream here (628 bytes)
               (FFFE,E0DD) VR=   VL=00000000 Sequence Delimiter


               Note that the Item and Sequence Delimiter tags have no VR, that
               the Item Delimiter tag is never used, since Items are required
               to be of fixed not undefined length, and that the Sequence
               Delimiter tag is always used, since the Pixel Data is always of
               undefined length (that is FFFFFFFF) for encapsulated data.
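
               Going the other way, the writing side of this layout can be
               sketched as follows.  This is an illustration only: the
               function and type names are hypothetical, an explicit VR
               little endian encoding of the Pixel Data element is assumed,
               the offset table is always left empty, and the caller is
               expected to have already padded each fragment to even length
               according to the rules in Part 5.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Append 16 and 32 bit values in little-endian order.
static void put16(Bytes &out, std::uint16_t v)
{
    out.push_back(static_cast<unsigned char>(v & 0xff));
    out.push_back(static_cast<unsigned char>(v >> 8));
}
static void put32(Bytes &out, std::uint32_t v)
{
    put16(out, static_cast<std::uint16_t>(v & 0xffff));
    put16(out, static_cast<std::uint16_t>(v >> 16));
}

// Item and delimiter tags: 2 byte group, 2 byte element, 4 byte VL, no VR.
static void putTag(Bytes &out, std::uint16_t group, std::uint16_t element,
                   std::uint32_t vl)
{
    put16(out, group);
    put16(out, element);
    put32(out, vl);
}

// Hypothetical helper: wrap already compressed fragments as encapsulated
// Pixel Data with an empty offset table.
Bytes encapsulatePixelData(const std::vector<Bytes> &fragments)
{
    Bytes out;
    put16(out, 0x7fe0);
    put16(out, 0x0010);                       // Pixel Data tag
    out.push_back('O');
    out.push_back('B');                       // VR=OB
    put16(out, 0);                            // reserved bytes
    put32(out, 0xffffffffu);                  // undefined length
    putTag(out, 0xfffe, 0xe000, 0);           // empty offset table
    for (std::size_t i = 0; i < fragments.size(); ++i) {
        if (fragments[i].size() % 2 != 0)
            throw std::invalid_argument("fragment must be of even length");
        putTag(out, 0xfffe, 0xe000,
               static_cast<std::uint32_t>(fragments[i].size()));
        out.insert(out.end(), fragments[i].begin(), fragments[i].end());
    }
    putTag(out, 0xfffe, 0xe0dd, 0);           // Sequence Delimiter
    return out;
}
```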


               If one is trying to decode a DICOM image encoded with an
               encapsulated transfer syntax, one therefore has to get to the
               Pixel Data tag, and start parsing the sequence-like structure.
               One cannot just pass the entire Value field of Pixel Data to a
               conventional JPEG decoder for instance.  One needs to strip out
               the embedded Item tags and the trailing Sequence Delimiter.  For
               an example of how to do this see the source code from
               dicom3tools in "libsrc/include/pixeldat/unencap.h", a simplified
               version of which (without the GE bug handling) is reproduced
               here.


       size_t read(void)
       {
           // - non-pixel data is always LE, including fragment
           //   delimiters and lengths
           // - 1st item is offset table, may have zero VL
           // - other items are fragments
           // - finally sequence delimitation tag (with zero VL)
           // - each delimiter is 2 byte group, 2 byte element,
           //   4 byte VL, little endian
           // - Item tag is (0xfffe,0xe000)
           // - Seq delimiter is (0xfffe,0xe0dd)

           length=0;

           while (!lefttoreadthisfragment && !finished && !bad) {
               Uint16 group=read16();
               Uint16 element=read16();
               Uint32 vl=read32();
               if (group == 0xfffe) {
                   if (element == 0xe0dd) {
                       // Sequence Delimiter Tag
                       Assert(vl == 0);
                       finished=true;
                   } else /* if (element == 0xe000) */ {
                       // Item Tag
                       bool vlbyteorderwrong=false;
                       if (++fragmentnumber > 0) {
                           // Zero length fragments thought
                           // not to be legal
                           Assert(vl);
                           lefttoreadthisfragment=vl;
                       } else {
                           // skip the offset table
                           Assert(vl%4 == 0);
                           unsigned i=0;
                           while (vl) {
                               Uint32 offset=read32();
                               vl-=4;
                               ++i;
                           }
                       }
                   }
               } else {
                   // bad tag group in encapsulated data
                   bad=true;
               }
           }

           if (lefttoreadthisfragment && !bad) {
               length=unsigned(lefttoreadthisfragment > maxlength
                               ? maxlength : lefttoreadthisfragment);
               if (istr->read(buffer,length)) {
                   length=istr->gcount();
               } else {
                   bad=true;
                   length=0;
               }
               lefttoreadthisfragment-=length;
           }

           return length;
       }


               An application that will take a DICOM dataset and write a pure
               byte stream (having stripped off the DICOM encapsulation) is
               also in dicom3tools, "dctoraw".  One can feed the output of this
               utility straight to a JPEG decoder such as the Stanford PVRG
               utility "jpeg -d".  If any padding is present at the end of each
               frame, it should have been encoded in a manner consistent with
               JPEG padding defined in ISO 10918-1 so that the JPEG decoder
               won't fail if it encounters padding between the image frames.
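
               The stripping step can also be sketched on an in-memory
               buffer.  The following is only an illustration and is not the
               dctoraw source: the function name stripEncapsulation and the
               helpers le16 and le32 are hypothetical, and the input is
               assumed to be the Value field of Pixel Data starting at the
               first Item tag.  Unlike the excerpt above, it rejects any
               element other than the Item and Sequence Delimiter tags, and it
               does not handle the GE bugs.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

namespace {
// Little-endian readers; the caller guarantees the bytes are in range.
std::uint16_t le16(const std::vector<std::uint8_t>& b, std::size_t p)
{
    return std::uint16_t(b[p] | (b[p + 1] << 8));
}

std::uint32_t le32(const std::vector<std::uint8_t>& b, std::size_t p)
{
    return std::uint32_t(b[p]) | (std::uint32_t(b[p + 1]) << 8) |
           (std::uint32_t(b[p + 2]) << 16) | (std::uint32_t(b[p + 3]) << 24);
}
}

// Hypothetical helper: concatenates all fragments of an encapsulated
// Pixel Data Value field, dropping the offset table (first Item), the
// Item tags and the trailing Sequence Delimiter.
std::vector<std::uint8_t>
stripEncapsulation(const std::vector<std::uint8_t>& value)
{
    std::vector<std::uint8_t> out;
    std::size_t pos = 0;
    bool firstItem = true;          // the first Item is the offset table

    for (;;) {
        if (value.size() - pos < 8)
            throw std::runtime_error("truncated delimiter");
        std::uint16_t group = le16(value, pos);
        std::uint16_t element = le16(value, pos + 2);
        std::uint32_t vl = le32(value, pos + 4);
        pos += 8;

        if (group != 0xfffe)
            throw std::runtime_error("bad tag group in encapsulated data");
        if (element == 0xe0dd)      // Sequence Delimiter: done
            break;
        if (element != 0xe000)      // anything else is not an Item
            throw std::runtime_error("unexpected element");
        if (vl > value.size() - pos)
            throw std::runtime_error("truncated item");

        if (!firstItem)             // skip the offset table, keep fragments
            out.insert(out.end(), value.begin() + pos,
                       value.begin() + pos + vl);
        firstItem = false;
        pos += vl;
    }
    return out;
}
```

               For multi-frame data this simply concatenates every fragment,
               which is the "pure byte stream" described above; splitting the
               result back into frames is left to the decoder.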


               Note also that the terms "image" and "frame" are used slightly
               differently in DICOM than in JPEG, so be careful when comparing
               the two standards.


               When using images with more than one component (that is a color
               image rather than a grayscale image), take care about the color
               space.  One of the features of the ISO 10918-1 JPEG standard is
               that it specifies only a compressed bitstream, and not a file
               format.  Even if there are three components specified in the
               compressed bitstream, that does not mean they are RGB or YBR or
               whatever.  This has to be signalled outside the bitstream, and
               in DICOM this is done in Photometric Interpretation (this is
               somewhat controversial however, and one should look at recent
               proposed DICOM CPs on the matter, such as CP 143).


               In the non-DICOM world, the color space is specified in the file
               header, such as the commonly used JFIF header, or its superset,
               the SPIFF header as defined in ISO 10918-3.  Be especially
               careful not to assume during decoding that a JFIF header is
               present in the DICOM compressed bit stream ...  it is not.  If
               one wants to feed the extracted bitstream to a JPEG decoder that
               needs a JFIF header (like the IJG code), then such a header must
               be added.  Conversely, never create an encapsulated DICOM image
               with a bitstream that contains the JFIF header ...  strip it off
               first or use an encoder like Stanford PVRG JPEG that doesn't
               create JFIF headers.
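
               As a rough sketch of the "strip it off first" step: a JFIF
               header is an APP0 marker segment (FF E0) with the identifier
               "JFIF" followed by a zero byte, placed immediately after the
               SOI marker (FF D8).  The function names below are hypothetical,
               and the sketch assumes the whole bitstream is in memory and
               does not look for JFIF extension (JFXX) segments.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// True if the stream starts with SOI followed directly by a JFIF APP0 segment.
bool startsWithJfifHeader(const std::vector<std::uint8_t>& s)
{
    // SOI (2) + APP0 marker (2) + segment length (2) + "JFIF\0" (5)
    if (s.size() < 11) return false;
    if (s[0] != 0xFF || s[1] != 0xD8) return false;   // SOI
    if (s[2] != 0xFF || s[3] != 0xE0) return false;   // APP0
    return std::memcmp(&s[6], "JFIF\0", 5) == 0;
}

// Returns a copy of the stream with a leading JFIF APP0 segment removed;
// streams without one (or with a truncated one) are returned unchanged.
std::vector<std::uint8_t>
stripJfifHeader(const std::vector<std::uint8_t>& s)
{
    if (!startsWithJfifHeader(s)) return s;

    // The segment length is big-endian and counts its own two bytes
    // but not the two marker bytes.
    std::size_t seglen = (std::size_t(s[4]) << 8) | s[5];
    std::size_t next = 4 + seglen;
    if (next > s.size()) return s;

    std::vector<std::uint8_t> out;
    out.push_back(0xFF);                               // keep the SOI
    out.push_back(0xD8);
    out.insert(out.end(), s.begin() + next, s.end());
    return out;
}
```

               Once the header is gone the color space is no longer described
               inside the bitstream, so it has to be carried in Photometric
               Interpretation as discussed above.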


               Here JPEG has been discussed, but the same principle applies to
               other encapsulated data sets in DICOM, including the RLE
               compression scheme popular in Ultrasound images (which is
               equivalent to the TIFF PackBits compression scheme).  The
               compression scheme used to interpret the encapsulated bitstream
               is different, but the encapsulation mechanism using Item tags
               and fragments is identical.
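
               For reference, the PackBits byte-run scheme itself is small
               enough to sketch.  The function name decodePackBits is
               hypothetical, and the sketch only expands one run-length
               encoded byte sequence; it makes no attempt to deal with how
               DICOM arranges the RLE data within a fragment.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expands one PackBits-encoded byte sequence.
// Control byte n, read as a signed value:
//   0..127     copy the next n+1 bytes literally
//   -1..-127   repeat the next byte 1-n times
//   -128       no operation
std::vector<std::uint8_t>
decodePackBits(const std::vector<std::uint8_t>& in)
{
    std::vector<std::uint8_t> out;
    std::size_t i = 0;

    while (i < in.size()) {
        int n = in[i++];
        if (n > 127) n -= 256;              // reinterpret as signed

        if (n >= 0) {
            std::size_t count = std::size_t(n) + 1;
            if (count > in.size() - i) break;   // truncated literal run
            out.insert(out.end(), in.begin() + i, in.begin() + i + count);
            i += count;
        } else if (n != -128) {
            if (i >= in.size()) break;          // truncated replicate run
            out.insert(out.end(), std::size_t(1 - n), in[i++]);
        }
    }
    return out;
}
```

               A truncated run simply ends decoding here; a real decoder
               would probably want to report that as an error.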


               This mechanism has been widely used in the cardiac angiography
               world on the DICOM CDs that these devices make, on Ultrasound 90
               mm MODs, and on GE's more recent CT and MR scanners that use the
               CT and MR media application profile on 130 mm MODs.  Note that
               early implementations of the encapsulation mechanism and the
               JPEG lossless encoding contain some bugs which are described in
               detail in the section on GE CTI.

6.  Getting Connected

   6.1 Tapes

       Nine-track half-inch tapes were the old medium of choice for archiving
       and image exchange and many older pieces of equipment will have these.
       Unfortunately most people don't have such a drive on their workstation
       or personal computer.  There are several possibilities:


         - Use another piece of equipment that has a more modern or
          networked or serial-ported host and a nine-track drive, and use it to
          do the extraction.  I used to use a networked Signa 4X to do this to
          extract GE 9800 CT tapes.

         - Visit your MIS department, which almost certainly has an archaic
          mainframe with a tape drive.  Sometimes tough to get them to read
          formats they aren't expecting though (the hosts not the people I mean
          :) ).

         - Buy a nine-track for your workstation.  This may seem a ridiculous
          idea given that new 6250 bpi drives cost around $5,000, but one can
          often pick up a bargain primitive non-6250 or refurbished drive
          that is adequate for the job.


       The Qualstar 1054 is one such drive, that attaches to a SCSI port, and
       works with the regular SunOS SCSI tape driver, once a few tables in the
       kernel have been updated as follows, and the kernel rebuilt:


{root}% pwd
/usr/kvm/sys/scsi/targets

{root}% diff -c stdef.h.prequalstar stdef.h
*** stdef.h.prequalstar Tue Aug 30 19:32:24 1994
--- stdef.h Tue Aug 30 19:32:24 1994
***************
*** 43,48 ****
--- 43,49 ----
  #define ST_TYPE_FUJI 0x21 /* Fujitsu - (not tested) */
  #define ST_TYPE_KENNEDY 0x22 /* Kennedy */
  #define ST_TYPE_HP 0x23 /* HP */
+ #define ST_TYPE_QUALSTAR 0x24 /* Qualstar */
  #define ST_TYPE_HIC 0x26 /* Generic 1/2" Cartridge */
  #define ST_TYPE_REEL 0x27 /* Generic 1/2" Reel Tape */

{root}% diff -c st_conf.c.prequalstar st_conf.c
*** st_conf.c.prequalstar Tue Aug 30 19:32:22 1994
--- st_conf.c Tue Aug 30 19:32:22 1994
***************
*** 153,158 ****
--- 153,174 ----
   * so our best guess as to their capabilities is
   * included herein.
   */
+ /* Qualstar 1054 or 1260s scsi 9-track with 64KB buffer */
+ {
+     "Qualstar 1054/1260s 1/2\" Reel", 7, "NCR ADP-53", ST_TYPE_QUALSTAR, 10240,
+     (ST_REEL | ST_VARIABLE | ST_BSF | ST_BSR),
+     300, 300,
+     { 0x00, 0x02, 0x06, 0x03},
+     { 0, 0, 0, 0 }
+ },
+ /* Qualstar 1054 scsi 9-track with 256KB buffer */
+ {
+     "Qualstar 1054 1/2\" Reel", 10, "QUALSTAR10", ST_TYPE_QUALSTAR, 10240,
+     (ST_REEL | ST_VARIABLE | ST_BSF | ST_BSR),
+     300, 300,
+     { 0x00, 0x02, 0x06, 0x06},
+     { 0, 0, 0, 0 }
+ },
  /* Wangtek QIC-150 1/4" cartridge */
  {
       "Wangtek QIC-150", 14, "WANGTEK 5150ES", ST_TYPE_WANGTEK, 512, (ST_QIC |
       ST_AUTODEN_OVERRIDE),


       I got my Qualstar 1054 from Bill Power at Power Computer Services for
       only $750 and have successfully read GE 9800 CT and Philips S15 MR tapes
       with it so far.  See the "Sources" section for where to get one.


       Once you have such a tape drive connected to the SCSI port, you can
       either write simple programs to read files (easiest if the tape has
       variable length records) or use shell scripts and the "dd" command with
       whatever the correct block size is.  See dd(1), mt(1), and mtio(3) for
       more information.  Remember that the read(2) call reads one fixed or
       variable length record at a time, and returns 0 bytes read for a tape
       mark, and that two tape marks in a row (normally) indicate the end of
       the tape.  If you encounter short files with a series of records 80
       bytes long, chances are you are dealing with header/end markers.  This
       is what ANSI standard tapes off VAX VMS seem to look like.
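
       The record and tape mark conventions just described can be sketched as
       a small scanning loop.  This is an illustration only: TapeSummary and
       scanTape are hypothetical names, and readRecord is a stand-in for
       read(2) on the tape device (record length on success, 0 for a tape
       mark, negative on error) so that the logic can be tried without a
       drive.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Record lengths seen on the tape, grouped by file (tape mark to tape mark).
struct TapeSummary {
    std::vector<std::vector<std::size_t> > files;
    bool error = false;
};

// readRecord(buf, len) is assumed to behave like read(2) on a tape device:
// it returns the length of one record, 0 for a tape mark, or a negative
// value on error.
TapeSummary scanTape(const std::function<long(char*, std::size_t)>& readRecord,
                     std::size_t maxRecord = 65536)
{
    TapeSummary summary;
    std::vector<char> buf(maxRecord);
    std::vector<std::size_t> current;   // records of the file being read
    bool lastWasMark = false;

    for (;;) {
        long n = readRecord(buf.data(), buf.size());
        if (n < 0) {                    // read error: give up
            summary.error = true;
            break;
        }
        if (n == 0) {                   // tape mark
            if (lastWasMark) break;     // two in a row: end of tape
            summary.files.push_back(current);
            current.clear();
            lastWasMark = true;
        } else {                        // ordinary record
            current.push_back(std::size_t(n));
            lastWasMark = false;
        }
    }
    return summary;
}
```

       With a real drive, readRecord would wrap read(2) on a file descriptor
       opened on the no-rewind tape device (the device name depends on the
       system).  A file consisting only of 80 byte records is then easy to
       spot as a probable label file.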


       Anyone who has any further information about tape formats and handling,
       especially references to standards or on-line documents, please let me
       know.

   6.2 Ethernet

   6.3 Serial Ports


The next part is part7 - information sources.