Network Working Group                                      H. Nussbacher
Request for Comments: 1555                      Israeli Inter-University
Category: Informational                                  Computer Center
                                                            Y. Bourvine
                                                      Hebrew University
                                                          December 1993


           Hebrew Character Encoding for Internet Messages

Status of this Memo

  This memo provides information for the Internet community.  This memo
  does not specify an Internet standard of any kind.  Distribution of
  this memo is unlimited.

Abstract

  This document describes the encoding used in electronic mail [RFC822]
  for transferring Hebrew.  The standard devised makes use of MIME
  [RFC1521] and ISO-8859-8.

Description

  All Hebrew text when transferred via e-mail must first be translated
  into ISO-8859-8, and then encoded using either Quoted-Printable
  (preferable) or Base64, as defined in MIME.

  The following table provides the four most common Hebrew encodings:

                      PC    IBM     PC     ISO
          Hebrew                           8859-8
          letter     8-bit         7-bit   8-bit
                     Ascii  EBCDIC Ascii   Ascii
          ---------- -----  ------ -----   ------
          alef        128     41    96     224
          bet         129     42    97     225
          gimel       130     43    98     226
          dalet       131     44    99     227
          he          132     45   100     228
          vav         133     46   101     229
          zayin       134     47   102     230
          het         135     48   103     231
          tet         136     49   104     232
          yod         137     51   105     233
          kaf sofit   138     52   106     234
          kaf         139     53   107     235
          lamed       140     54   108     236



Nussbacher & Bourvine                                           [Page 1]

RFC 1555               Hebrew Character Encoding           December 1993


          mem sofit   141     55   109     237
          mem         142     56   110     238
          nun sofit   143     57   111     239
          nun         144     58   112     240
          samekh      145     59   113     241
          ayin        146     62   114     242
          pe sofit    147     63   115     243
          pe          148     64   116     244
          tsadi sofit 149     65   117     245
          tsadi       150     66   118     246
          qof         151     67   119     247
          resh        152     68   120     248
          shin        153     69   121     249
          tav         154     71   122     250

  Note: All values are in decimal ASCII except for the EBCDIC column
  which is in hexadecimal.

  ISO 8859-8 8-bit ASCII is also known as IBM Codepage 862.

  The default directionality of the text is visual.  This means that
  the Hebrew text is encoded from left to right (even though Hebrew
  text is entered right to left) and is transmitted from left to right
  via the standard MIME mechanisms.  Other methods to control
  directionality are supported and are covered in the complementary RFC
  1556, "Handling of Bi-directional Texts in MIME".

  All discussion regarding Hebrew in email, as well as discussions of
  Hebrew in other TCP/IP protocols, is discussed in the ilan-
  [email protected] list.  To subscribe send mail to [email protected]
  with one line of text as follows:

                   subscribe ilan-h firstname lastname

MIME Considerations

  Mail that is sent that contains Hebrew must contain the following
  minimum amount of MIME headers:

        MIME-Version: 1.0
        Content-type: text/plain; charset=ISO-8859-8
        Content-transfer-encoding: BASE64 | Quoted-Printable

  Users should keep their text to within 72 columns so as to allow
  email quoting via the prefixing of each line with a ">".  Users
  should also realize that not all MIME implementations handle email
  quoting properly, so quoting email that contains Hebrew text may lead
  to problems.



Nussbacher & Bourvine                                           [Page 2]

RFC 1555               Hebrew Character Encoding           December 1993


  In the future, when all email systems implement fully transparent 8-
  bit email as defined in RFC 1425 and RFC 1426 this standard will
  become partially obsolete.  The "Content-type:" field will still be
  necessary, as well as directionality (which might be implicit for
  8BIT, but is something for future discussion),  but the "Content-
  transfer-encoding" will be altered to use 8BIT rather than Base64 or
  Quoted-Printable.

Optional

  It is recommended, although not required, to support Hebrew encoding
  in mail headers as specified in RFC 1522.  Specifically, the Q-
  encoding format is to be the default method used for encoding Hebrew
  in Internet mail headers and not the B-encoding method.

Caveats

  Within Israel there are in excess of 40 Listserv lists which will now
  start using Hebrew for part of their conversations.  Normally,
  Listserv will deliver mail from a distribution list with a
  "shortened" header, one that does not include the extra MIME headers.
  This will cause the MIME encoding to be left intact and the user
  agent decoding software will not be able to interpret the mail.  Each
  user is able to customize how Listserv delivers mail.  For lists that
  contain Hebrew, users should send mail to Listserv with the following
  command:

                            set listname full

  where listname is the name of the list which the user wants full,
  unabridged headers to appear.  This will update their private entry
  and all subsequent mail from that list will be with full RFC822
  headers, including MIME headers.

  In addition, Listserv usually maintains automatic archives of all
  postings to a list.  These archives, contained in the file "listname
  LOGyymm", do not contain the MIME headers, so all encoding
  information will be lost.  This is a limitation of the Listserv
  software.












Nussbacher & Bourvine                                           [Page 3]

RFC 1555               Hebrew Character Encoding           December 1993


Example

  Below is a short example of Quoted-Printable encoded Hebrew email:

  Date:         Sun, 06 Jun 93 15:25:35 IDT
  From:         Hank Nussbacher <[email protected]>
  Subject:      Sample Hebrew mail
  To:           Hank Nussbacher <Hank@BARILVM>,
                Yehavi Bourvine <yehavi@hujivms>
  MIME-Version: 1.0
  Content-Type: Text/plain; charset=ISO-8859-8
  Content-Transfer-Encoding: QUOTED-PRINTABLE

  The end of this line contains Hebrew            .=EC=E0=F8=F9=E9 =F5=
  =F8=E0=EE =ED=E5=EC=F9


  Hank Nussbacher                                  =F8=EB=E1=F1=E5=
  =F0 =F7=F0=E4

Acknowledgements

  Many thanks to Rafi Sadowsky and Nathaniel Borenstein for all their
  help.

References

  [ISO-8859] Information Processing -- 8-bit Single-Byte Coded
             Graphic Character Sets, Part 8: Latin/Hebrew alphabet,
             ISO 8859-8, 1988.

  [RFC822]   Crocker, D., "Standard for the Format of ARPA Internet
             Text Messages", STD 11, RFC 822, UDEL, August 1982.

  [RFC1425]  Klensin, J., Freed N., Rose M., Stefferud E., and
             D. Crocker, "SMTP Service Extensions", RFC 1425,
             United Nations University, Innosoft International, Inc.,
             Dover Beach Consulting, Inc., Network Management
             Associates, Inc., The Branch Office, February 1993.

  [RFC1426]  Klensin, J., Freed N., Rose M., Stefferud E., and
             D. Crocker, "SMTP Service Extension for 8bit-MIME
             Transport", RFC 1426, United Nations University, Innosoft
             International, Inc., Dover Beach Consulting, Inc., Network
             Management Associates, Inc., The Branch Office, February
             1993.





Nussbacher & Bourvine                                           [Page 4]

RFC 1555               Hebrew Character Encoding           December 1993


  [RFC1521]  Borenstein N., and N. Freed, "MIME (Multipurpose
             Internet Mail Extensions) Part One: Mechanisms for
             Specifying and Describing the Format of Internet Message
             Bodies", Bellcore, Innosoft, September 1993.

  [RFC1522]  Moore K., "MIME Part Two: Message Header Extensions for
             Non-ASCII Text", University of Tennessee, September 1993.

Security Considerations

  Security issues are not discussed in this memo.

Authors' Addresses

  Hank Nussbacher
  Computer Center
  Tel Aviv University
  Ramat Aviv
  Israel

  Fax: +972 3 6409118
  Phone: +972 3 6408309
  EMail: [email protected]


  Yehavi Bourvine
  Computer Center
  Hebrew University
  Jerusalem
  Israel

  Phone: +972 2 585684
  Fax:   +972 2 527349
  EMail: [email protected]

















Nussbacher & Bourvine                                           [Page 5]