Network Working Group                                        D. Robinson
Request for Comments: 3875                                       K. Coar
Category: Informational                   The Apache Software Foundation
                                                           October 2004


            The Common Gateway Interface (CGI) Version 1.1

Status of this Memo

  This memo provides information for the Internet community.  It does
  not specify an Internet standard of any kind.  Distribution of this
  memo is unlimited.

Copyright Notice

  Copyright (C) The Internet Society (2004).

IESG Note

  This document is not a candidate for any level of Internet Standard.
  The IETF disclaims any knowledge of the fitness of this document for
  any purpose, and in particular notes that it has not had IETF review
  for such things as security, congestion control or inappropriate
  interaction with deployed protocols.  The RFC Editor has chosen to
  publish this document at its discretion.  Readers of this document
  should exercise caution in evaluating its value for implementation
  and deployment.

Abstract

  The Common Gateway Interface (CGI) is a simple interface for running
  external programs, software or gateways under an information server
  in a platform-independent manner.  Currently, the supported
  information servers are HTTP servers.

  The interface has been in use by the World-Wide Web (WWW) since 1993.
  This specification defines the 'current practice' parameters of the
  'CGI/1.1' interface developed and documented at the U.S. National
  Centre for Supercomputing Applications.  This document also defines
  the use of the CGI/1.1 interface on UNIX(R) and other, similar
  systems.









Robinson & Coar              Informational                      [Page 1]

RFC 3875                    CGI Version 1.1                 October 2004


Table of Contents

  1.  Introduction. . . . . . . . . . . . . . . . . . . . . . . . .   4
      1.1. Purpose  . . . . . . . . . . . . . . . . . . . . . . . .   4
      1.2. Requirements . . . . . . . . . . . . . . . . . . . . . .   4
      1.3. Specifications . . . . . . . . . . . . . . . . . . . . .   4
      1.4. Terminology  . . . . . . . . . . . . . . . . . . . . . .   5

  2.  Notational Conventions and Generic Grammar. . . . . . . . . .   5
      2.1. Augmented BNF  . . . . . . . . . . . . . . . . . . . . .   5
      2.2. Basic Rules  . . . . . . . . . . . . . . . . . . . . . .   6
      2.3. URL Encoding . . . . . . . . . . . . . . . . . . . . . .   7

  3.  Invoking the Script . . . . . . . . . . . . . . . . . . . . .   8
      3.1. Server Responsibilities  . . . . . . . . . . . . . . . .   8
      3.2. Script Selection . . . . . . . . . . . . . . . . . . . .   9
      3.3. The Script-URI . . . . . . . . . . . . . . . . . . . . .   9
      3.4. Execution  . . . . . . . . . . . . . . . . . . . . . . .  10

  4.  The CGI Request . . . . . . . . . . . . . . . . . . . . . . .  10
      4.1. Request Meta-Variables . . . . . . . . . . . . . . . . .  10
           4.1.1.  AUTH_TYPE. . . . . . . . . . . . . . . . . . . .  11
           4.1.2.  CONTENT_LENGTH . . . . . . . . . . . . . . . . .  12
           4.1.3.  CONTENT_TYPE . . . . . . . . . . . . . . . . . .  12
           4.1.4.  GATEWAY_INTERFACE. . . . . . . . . . . . . . . .  13
           4.1.5.  PATH_INFO. . . . . . . . . . . . . . . . . . . .  13
           4.1.6.  PATH_TRANSLATED. . . . . . . . . . . . . . . . .  14
           4.1.7.  QUERY_STRING . . . . . . . . . . . . . . . . . .  15
           4.1.8.  REMOTE_ADDR. . . . . . . . . . . . . . . . . . .  15
           4.1.9.  REMOTE_HOST. . . . . . . . . . . . . . . . . . .  16
           4.1.10. REMOTE_IDENT . . . . . . . . . . . . . . . . . .  16
           4.1.11. REMOTE_USER. . . . . . . . . . . . . . . . . . .  16
           4.1.12. REQUEST_METHOD . . . . . . . . . . . . . . . . .  17
           4.1.13. SCRIPT_NAME. . . . . . . . . . . . . . . . . . .  17
           4.1.14. SERVER_NAME. . . . . . . . . . . . . . . . . . .  17
           4.1.15. SERVER_PORT. . . . . . . . . . . . . . . . . . .  18
           4.1.16. SERVER_PROTOCOL. . . . . . . . . . . . . . . . .  18
           4.1.17. SERVER_SOFTWARE. . . . . . . . . . . . . . . . .  19
           4.1.18. Protocol-Specific Meta-Variables . . . . . . . .  19
      4.2. Request Message-Body . . . . . . . . . . . . . . . . . .  20
      4.3. Request Methods  . . . . . . . . . . . . . . . . . . . .  20
           4.3.1.  GET. . . . . . . . . . . . . . . . . . . . . . .  20
           4.3.2.  POST . . . . . . . . . . . . . . . . . . . . . .  21
           4.3.3.  HEAD . . . . . . . . . . . . . . . . . . . . . .  21
           4.3.4.  Protocol-Specific Methods. . . . . . . . . . . .  21
      4.4. The Script Command Line. . . . . . . . . . . . . . . . .  21





Robinson & Coar              Informational                      [Page 2]

RFC 3875                    CGI Version 1.1                 October 2004


  5.  NPH Scripts . . . . . . . . . . . . . . . . . . . . . . . . .  22
      5.1. Identification . . . . . . . . . . . . . . . . . . . . .  22
      5.2. NPH Response . . . . . . . . . . . . . . . . . . . . . .  22

  6.  CGI Response. . . . . . . . . . . . . . . . . . . . . . . . .  23
      6.1. Response Handling. . . . . . . . . . . . . . . . . . . .  23
      6.2. Response Types . . . . . . . . . . . . . . . . . . . . .  23
           6.2.1.  Document Response. . . . . . . . . . . . . . . .  23
           6.2.2.  Local Redirect Response. . . . . . . . . . . . .  24
           6.2.3.  Client Redirect Response . . . . . . . . . . . .  24
           6.2.4.  Client Redirect Response with Document . . . . .  24
      6.3. Response Header Fields . . . . . . . . . . . . . . . . .  25
           6.3.1.  Content-Type . . . . . . . . . . . . . . . . . .  25
           6.3.2.  Location . . . . . . . . . . . . . . . . . . . .  26
           6.3.3.  Status . . . . . . . . . . . . . . . . . . . . .  26
           6.3.4.  Protocol-Specific Header Fields. . . . . . . . .  27
           6.3.5.  Extension Header Fields. . . . . . . . . . . . .  27
      6.4. Response Message-Body. . . . . . . . . . . . . . . . . .  28

  7.  System Specifications . . . . . . . . . . . . . . . . . . . .  28
      7.1. AmigaDOS . . . . . . . . . . . . . . . . . . . . . . . .  28
      7.2. UNIX . . . . . . . . . . . . . . . . . . . . . . . . . .  28
      7.3. EBCDIC/POSIX . . . . . . . . . . . . . . . . . . . . . .  29

  8.  Implementation. . . . . . . . . . . . . . . . . . . . . . . .  29
      8.1. Recommendations for Servers. . . . . . . . . . . . . . .  29
      8.2. Recommendations for Scripts. . . . . . . . . . . . . . .  30

  9.  Security Considerations . . . . . . . . . . . . . . . . . . .  30
      9.1. Safe Methods . . . . . . . . . . . . . . . . . . . . . .  30
      9.2. Header Fields Containing Sensitive Information . . . . .  31
      9.3. Data Privacy . . . . . . . . . . . . . . . . . . . . . .  31
      9.4. Information Security Model . . . . . . . . . . . . . . .  31
      9.5. Script Interference with the Server. . . . . . . . . . .  31
      9.6. Data Length and Buffering Considerations . . . . . . . .  32
      9.7. Stateless Processing . . . . . . . . . . . . . . . . . .  32
      9.8. Relative Paths . . . . . . . . . . . . . . . . . . . . .  33
      9.9. Non-parsed Header Output . . . . . . . . . . . . . . . .  33

  10. Acknowledgements. . . . . . . . . . . . . . . . . . . . . . .  33

  11. References. . . . . . . . . . . . . . . . . . . . . . . . . .  33
      11.1. Normative References. . . . . . . . . . . . . . . . . .  33
      11.2. Informative References. . . . . . . . . . . . . . . . .  34

  12. Authors' Addresses. . . . . . . . . . . . . . . . . . . . . .  35

  13. Full Copyright Statement. . . . . . . . . . . . . . . . . . .  36



Robinson & Coar              Informational                      [Page 3]

RFC 3875                    CGI Version 1.1                 October 2004


1.  Introduction

1.1.  Purpose

  The Common Gateway Interface (CGI) [22] allows an HTTP [1], [4]
  server and a CGI script to share responsibility for responding to
  client requests.  The client request comprises a Uniform Resource
  Identifier (URI) [11], a request method and various ancillary
  information about the request provided by the transport protocol.

  The CGI defines the abstract parameters, known as meta-variables,
  which describe a client's request.  Together with a concrete
  programmer interface this specifies a platform-independent interface
  between the script and the HTTP server.

  The server is responsible for managing connection, data transfer,
  transport and network issues related to the client request, whereas
  the CGI script handles the application issues, such as data access
  and document processing.

1.2.  Requirements

  The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL NOT',
  'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'MAY' and 'OPTIONAL' in this
  document are to be interpreted as described in BCP 14, RFC 2119 [3].

  An implementation is not compliant if it fails to satisfy one or more
  of the 'must' requirements for the protocols it implements.  An
  implementation that satisfies all of the 'must' and all of the
  'should' requirements for its features is said to be 'unconditionally
  compliant'; one that satisfies all of the 'must' requirements but not
  all of the 'should' requirements for its features is said to be
  'conditionally compliant'.

1.3.  Specifications

  Not all of the functions and features of the CGI are defined in the
  main part of this specification.  The following phrases are used to
  describe the features that are not specified:

  'system-defined'
     The feature may differ between systems, but must be the same for
     different implementations using the same system.  A system will
     usually identify a class of operating systems.  Some systems are
     defined in section 7 of this document.  New systems may be defined
     by new specifications without revision of this document.





Robinson & Coar              Informational                      [Page 4]

RFC 3875                    CGI Version 1.1                 October 2004


  'implementation-defined'
     The behaviour of the feature may vary from implementation to
     implementation; a particular implementation must document its
     behaviour.

1.4.  Terminology

  This specification uses many terms defined in the HTTP/1.1
  specification [4]; however, the following terms are used here in a
  sense which may not accord with their definitions in that document,
  or with their common meaning.

  'meta-variable'
     A named parameter which carries information from the server to the
     script.  It is not necessarily a variable in the operating
     system's environment, although that is the most common
     implementation.

  'script'
     The software that is invoked by the server according to this
     interface.  It need not be a standalone program, but could be a
     dynamically-loaded or shared library, or even a subroutine in the
     server.  It might be a set of statements interpreted at run-time,
     as the term 'script' is frequently understood, but that is not a
     requirement and within the context of this specification the term
     has the broader definition stated.

  'server'
     The application program that invokes the script in order to
     service requests from the client.

2.  Notational Conventions and Generic Grammar

2.1.  Augmented BNF

  All of the mechanisms specified in this document are described in
  both prose and an augmented Backus-Naur Form (BNF) similar to that
  used by RFC 822 [13].  Unless stated otherwise, the elements are
  case-sensitive.  This augmented BNF contains the following
  constructs:

  name = definition
     The name of a rule and its definition are separated by the equals
     character ('=').  Whitespace is only significant in that
     continuation lines of a definition are indented.






Robinson & Coar              Informational                      [Page 5]

RFC 3875                    CGI Version 1.1                 October 2004


  "literal"
     Double quotation marks (") surround literal text, except for a
     literal quotation mark, which is surrounded by angle-brackets ('<'
     and '>').

  rule1 | rule2
     Alternative rules are separated by a vertical bar ('|').

  (rule1 rule2 rule3)
     Elements enclosed in parentheses are treated as a single element.

  *rule
     A rule preceded by an asterisk ('*') may have zero or more
     occurrences.  The full form is 'n*m rule' indicating at least n
     and at most m occurrences of the rule.  n and m are optional
     decimal values with default values of 0 and infinity respectively.

  [rule]
     An element enclosed in square brackets ('[' and ']') is optional,
     and is equivalent to '*1 rule'.

  N rule
     A rule preceded by a decimal number represents exactly N
     occurrences of the rule.  It is equivalent to 'N*N rule'.

2.2.  Basic Rules

  This specification uses a BNF-like grammar defined in terms of
  characters.  Unlike many specifications which define the bytes
  allowed by a protocol, here each literal in the grammar corresponds
  to the character it represents.  How these characters are represented
  in terms of bits and bytes within a system are either system-defined
  or specified in the particular context.  The single exception is the
  rule 'OCTET', defined below.

  The following rules are used throughout this specification to
  describe basic parsing constructs.

     alpha         = lowalpha | hialpha
     lowalpha      = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                     "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                     "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                     "y" | "z"
     hialpha       = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
                     "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
                     "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
                     "Y" | "Z"




Robinson & Coar              Informational                      [Page 6]

RFC 3875                    CGI Version 1.1                 October 2004


     digit         = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                     "8" | "9"
     alphanum      = alpha | digit
     OCTET         = <any 8-bit byte>
     CHAR          = alpha | digit | separator | "!" | "#" | "$" |
                     "%" | "&" | "'" | "*" | "+" | "-" | "." | "`" |
                     "^" | "_" | "{" | "|" | "}" | "~" | CTL
     CTL           = <any control character>
     SP            = <space character>
     HT            = <horizontal tab character>
     NL            = <newline>
     LWSP          = SP | HT | NL
     separator     = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" |
                     "\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" |
                     "}" | SP | HT
     token         = 1*<any CHAR except CTLs or separators>
     quoted-string = <"> *qdtext <">
     qdtext        = <any CHAR except <"> and CTLs but including LWSP>
     TEXT          = <any printable character>

  Note that newline (NL) need not be a single control character, but
  can be a sequence of control characters.  A system MAY define TEXT to
  be a larger set of characters than <any CHAR excluding CTLs but
  including LWSP>.

2.3.  URL Encoding

  Some variables and constructs used here are described as being
  'URL-encoded'.  This encoding is described in section 2 of RFC 2396
  [2].  In a URL-encoded string an escape sequence consists of a
  percent character ("%") followed by two hexadecimal digits, where the
  two hexadecimal digits form an octet.  An escape sequence represents
  the graphic character that has the octet as its code within the
  US-ASCII [9] coded character set, if it exists.  Currently there is
  no provision within the URI syntax to identify which character set
  non-ASCII codes represent, so CGI handles this issue on an ad-hoc
  basis.

  Note that some unsafe (reserved) characters may have different
  semantics when encoded.  The definition of which characters are
  unsafe depends on the context; see section 2 of RFC 2396 [2], updated
  by RFC 2732 [7], for an authoritative treatment.  These reserved
  characters are generally used to provide syntactic structure to the
  character string, for example as field separators.  In all cases, the
  string is first processed with regard to any reserved characters
  present, and then the resulting data can be URL-decoded by replacing
  "%" escape sequences by their character values.




Robinson & Coar              Informational                      [Page 7]

RFC 3875                    CGI Version 1.1                 October 2004


  To encode a character string, all reserved and forbidden characters
  are replaced by the corresponding "%" escape sequences.  The string
  can then be used in assembling a URI.  The reserved characters will
  vary from context to context, but will always be drawn from this set:

     reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" |
                "," | "[" | "]"

  The last two characters were added by RFC 2732 [7].  In any
  particular context, a sub-set of these characters will be reserved;
  the other characters from this set MUST NOT be encoded when a string
  is URL-encoded in that context.  Other basic rules used to describe
  URI syntax are:

     hex        = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b"
                  | "c" | "d" | "e" | "f"
     escaped    = "%" hex hex
     unreserved = alpha | digit | mark
     mark       = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

3.  Invoking the Script

3.1.  Server Responsibilities

  The server acts as an application gateway.  It receives the request
  from the client, selects a CGI script to handle the request, converts
  the client request to a CGI request, executes the script and converts
  the CGI response into a response for the client.  When processing the
  client request, it is responsible for implementing any protocol or
  transport level authentication and security.  The server MAY also
  function in a 'non-transparent' manner, modifying the request or
  response in order to provide some additional service, such as media
  type transformation or protocol reduction.

  The server MUST perform translations and protocol conversions on the
  client request data required by this specification.  Furthermore, the
  server retains its responsibility to the client to conform to the
  relevant network protocol even if the CGI script fails to conform to
  this specification.

  If the server is applying authentication to the request, then it MUST
  NOT execute the script unless the request passes all defined access
  controls.








Robinson & Coar              Informational                      [Page 8]

RFC 3875                    CGI Version 1.1                 October 2004


3.2.  Script Selection

  The server determines which CGI is script to be executed based on a
  generic-form URI supplied by the client.  This URI includes a
  hierarchical path with components separated by "/".  For any
  particular request, the server will identify all or a leading part of
  this path with an individual script, thus placing the script at a
  particular point in the path hierarchy.  The remainder of the path,
  if any, is a resource or sub-resource identifier to be interpreted by
  the script.

  Information about this split of the path is available to the script
  in the meta-variables, described below.  Support for non-hierarchical
  URI schemes is outside the scope of this specification.

3.3.  The Script-URI

  The mapping from client request URI to choice of script is defined by
  the particular server implementation and its configuration.  The
  server may allow the script to be identified with a set of several
  different URI path hierarchies, and therefore is permitted to replace
  the URI by other members of this set during processing and generation
  of the meta-variables.  The server

     1. MAY preserve the URI in the particular client request; or

     2. it MAY select a canonical URI from the set of possible values
        for each script; or

     3. it can implement any other selection of URI from the set.

  From the meta-variables thus generated, a URI, the 'Script-URI', can
  be constructed.  This MUST have the property that if the client had
  accessed this URI instead, then the script would have been executed
  with the same values for the SCRIPT_NAME, PATH_INFO and QUERY_STRING
  meta-variables.  The Script-URI has the structure of a generic URI as
  defined in section 3 of RFC 2396 [2], with the exception that object
  parameters and fragment identifiers are not permitted.  The various
  components of the Script-URI are defined by some of the
  meta-variables (see below);

     script-URI = <scheme> "://" <server-name> ":" <server-port>
                  <script-path> <extra-path> "?" <query-string>

  where <scheme> is found from SERVER_PROTOCOL, <server-name>,
  <server-port> and <query-string> are the values of the respective
  meta-variables.  The SCRIPT_NAME and PATH_INFO values, URL-encoded
  with ";", "=" and "?"  reserved, give <script-path> and <extra-path>.



Robinson & Coar              Informational                      [Page 9]

RFC 3875                    CGI Version 1.1                 October 2004


  See section 4.1.5 for more information about the PATH_INFO
  meta-variable.

  The scheme and the protocol are not identical as the scheme
  identifies the access method in addition to the application protocol.
  For example, a resource accessed using Transport Layer Security (TLS)
  [14] would have a request URI with a scheme of https when using the
  HTTP protocol [19].  CGI/1.1 provides no generic means for the script
  to reconstruct this, and therefore the Script-URI as defined includes
  the base protocol used.  However, a script MAY make use of
  scheme-specific meta-variables to better deduce the URI scheme.

  Note that this definition also allows URIs to be constructed which
  would invoke the script with any permitted values for the path-info
  or query-string, by modifying the appropriate components.

3.4.  Execution

  The script is invoked in a system-defined manner.  Unless specified
  otherwise, the file containing the script will be invoked as an
  executable program.  The server prepares the CGI request as described
  in section 4; this comprises the request meta-variables (immediately
  available to the script on execution) and request message data.  The
  request data need not be immediately available to the script; the
  script can be executed before all this data has been received by the
  server from the client.  The response from the script is returned to
  the server as described in sections 5 and 6.

  In the event of an error condition, the server can interrupt or
  terminate script execution at any time and without warning.  That
  could occur, for example, in the event of a transport failure between
  the server and the client; so the script SHOULD be prepared to handle
  abnormal termination.

4.  The CGI Request

  Information about a request comes from two different sources; the
  request meta-variables and any associated message-body.

4.1.  Request Meta-Variables

  Meta-variables contain data about the request passed from the server
  to the script, and are accessed by the script in a system-defined
  manner.  Meta-variables are identified by case-insensitive names;
  there cannot be two different variables whose names differ in case
  only.  Here they are shown using a canonical representation of
  capitals plus underscore ("_").  A particular system can define a
  different representation.



Robinson & Coar              Informational                     [Page 10]

RFC 3875                    CGI Version 1.1                 October 2004


     meta-variable-name = "AUTH_TYPE" | "CONTENT_LENGTH" |
                          "CONTENT_TYPE" | "GATEWAY_INTERFACE" |
                          "PATH_INFO" | "PATH_TRANSLATED" |
                          "QUERY_STRING" | "REMOTE_ADDR" |
                          "REMOTE_HOST" | "REMOTE_IDENT" |
                          "REMOTE_USER" | "REQUEST_METHOD" |
                          "SCRIPT_NAME" | "SERVER_NAME" |
                          "SERVER_PORT" | "SERVER_PROTOCOL" |
                          "SERVER_SOFTWARE" | scheme |
                          protocol-var-name | extension-var-name
     protocol-var-name  = ( protocol | scheme ) "_" var-name
     scheme             = alpha *( alpha | digit | "+" | "-" | "." )
     var-name           = token
     extension-var-name = token

  Meta-variables with the same name as a scheme, and names beginning
  with the name of a protocol or scheme (e.g., HTTP_ACCEPT) are also
  defined.  The number and meaning of these variables may change
  independently of this specification.  (See also section 4.1.18.)

  The server MAY set additional implementation-defined extension meta-
  variables, whose names SHOULD be prefixed with "X_".

  This specification does not distinguish between zero-length (NULL)
  values and missing values.  For example, a script cannot distinguish
  between the two requests http://host/script and http://host/script?
  as in both cases the QUERY_STRING meta-variable would be NULL.

     meta-variable-value = "" | 1*<TEXT, CHAR or tokens of value>

  An optional meta-variable may be omitted (left unset) if its value is
  NULL.  Meta-variable values MUST be considered case-sensitive except
  as noted otherwise.  The representation of the characters in the
  meta-variables is system-defined; the server MUST convert values to
  that representation.

4.1.1.  AUTH_TYPE

  The AUTH_TYPE variable identifies any mechanism used by the server to
  authenticate the user.  It contains a case-insensitive value defined
  by the client protocol or server implementation.

  For HTTP, if the client request required authentication for external
  access, then the server MUST set the value of this variable from the
  'auth-scheme' token in the request Authorization header field.






Robinson & Coar              Informational                     [Page 11]

RFC 3875                    CGI Version 1.1                 October 2004


     AUTH_TYPE      = "" | auth-scheme
     auth-scheme    = "Basic" | "Digest" | extension-auth
     extension-auth = token

  HTTP access authentication schemes are described in RFC 2617 [5].

4.1.2.  CONTENT_LENGTH

  The CONTENT_LENGTH variable contains the size of the message-body
  attached to the request, if any, in decimal number of octets.  If no
  data is attached, then NULL (or unset).

     CONTENT_LENGTH = "" | 1*digit

  The server MUST set this meta-variable if and only if the request is
  accompanied by a message-body entity.  The CONTENT_LENGTH value must
  reflect the length of the message-body after the server has removed
  any transfer-codings or content-codings.

4.1.3.  CONTENT_TYPE

  If the request includes a message-body, the CONTENT_TYPE variable is
  set to the Internet Media Type [6] of the message-body.

     CONTENT_TYPE = "" | media-type
     media-type   = type "/" subtype *( ";" parameter )
     type         = token
     subtype      = token
     parameter    = attribute "=" value
     attribute    = token
     value        = token | quoted-string

  The type, subtype and parameter attribute names are not
  case-sensitive.  Parameter values may be case sensitive.  Media types
  and their use in HTTP are described section 3.7 of the HTTP/1.1
  specification [4].

  There is no default value for this variable.  If and only if it is
  unset, then the script MAY attempt to determine the media type from
  the data received.  If the type remains unknown, then the script MAY
  choose to assume a type of application/octet-stream or it may reject
  the request with an error (as described in section 6.3.3).

  Each media-type defines a set of optional and mandatory parameters.
  This may include a charset parameter with a case-insensitive value
  defining the coded character set for the message-body.  If the





Robinson & Coar              Informational                     [Page 12]

RFC 3875                    CGI Version 1.1                 October 2004


  charset parameter is omitted, then the default value should be
  derived according to whichever of the following rules is the first to
  apply:

     1. There MAY be a system-defined default charset for some
        media-types.

     2. The default for media-types of type "text" is ISO-8859-1 [4].

     3. Any default defined in the media-type specification.

     4. The default is US-ASCII.

  The server MUST set this meta-variable if an HTTP Content-Type field
  is present in the client request header.  If the server receives a
  request with an attached entity but no Content-Type header field, it
  MAY attempt to determine the correct content type, otherwise it
  should omit this meta-variable.

4.1.4.  GATEWAY_INTERFACE

  The GATEWAY_INTERFACE variable MUST be set to the dialect of CGI
  being used by the server to communicate with the script.  Syntax:

     GATEWAY_INTERFACE = "CGI" "/" 1*digit "." 1*digit

  Note that the major and minor numbers are treated as separate
  integers and hence each may be incremented higher than a single
  digit.  Thus CGI/2.4 is a lower version than CGI/2.13 which in turn
  is lower than CGI/12.3.  Leading zeros MUST be ignored by the script
  and MUST NOT be generated by the server.

  This document defines the 1.1 version of the CGI interface.

4.1.5.  PATH_INFO

  The PATH_INFO variable specifies a path to be interpreted by the CGI
  script.  It identifies the resource or sub-resource to be returned by
  the CGI script, and is derived from the portion of the URI path
  hierarchy following the part that identifies the script itself.
  Unlike a URI path, the PATH_INFO is not URL-encoded, and cannot
  contain path-segment parameters.  A PATH_INFO of "/" represents a
  single void path segment.

     PATH_INFO = "" | ( "/" path )
     path      = lsegment *( "/" lsegment )
     lsegment  = *lchar
     lchar     = <any TEXT or CTL except "/">



Robinson & Coar              Informational                     [Page 13]

RFC 3875                    CGI Version 1.1                 October 2004


  The value is considered case-sensitive and the server MUST preserve
  the case of the path as presented in the request URI.  The server MAY
  impose restrictions and limitations on what values it permits for
  PATH_INFO, and MAY reject the request with an error if it encounters
  any values considered objectionable.  That MAY include any requests
  that would result in an encoded "/" being decoded into PATH_INFO, as
  this might represent a loss of information to the script.  Similarly,
  treatment of non US-ASCII characters in the path is system-defined.

  URL-encoded, the PATH_INFO string forms the extra-path component of
  the Script-URI (see section 3.3) which follows the SCRIPT_NAME part
  of that path.

4.1.6.  PATH_TRANSLATED

  The PATH_TRANSLATED variable is derived by taking the PATH_INFO
  value, parsing it as a local URI in its own right, and performing any
  virtual-to-physical translation appropriate to map it onto the
  server's document repository structure.  The set of characters
  permitted in the result is system-defined.

     PATH_TRANSLATED = *<any character>

  This is the file location that would be accessed by a request for

     <scheme> "://" <server-name> ":" <server-port> <extra-path>

  where <scheme> is the scheme for the original client request and
  <extra-path> is a URL-encoded version of PATH_INFO, with ";", "=" and
  "?"  reserved.  For example, a request such as the following:

     http://somehost.com/cgi-bin/somescript/this%2eis%2epath%3binfo

  would result in a PATH_INFO value of

     /this.is.the.path;info

  An internal URI is constructed from the scheme, server location and
  the URL-encoded PATH_INFO:

     http://somehost.com/this.is.the.path%3binfo

  This would then be translated to a location in the server's document
  repository, perhaps a filesystem path something like this:

     /usr/local/www/htdocs/this.is.the.path;info

  The value of PATH_TRANSLATED is the result of the translation.



Robinson & Coar              Informational                     [Page 14]

RFC 3875                    CGI Version 1.1                 October 2004


  The value is derived in this way irrespective of whether it maps to a
  valid repository location.  The server MUST preserve the case of the
  extra-path segment unless the underlying repository supports case-
  insensitive names.  If the repository is only case-aware, case-
  preserving, or case-blind with regard to document names, the server
  is not required to preserve the case of the original segment through
  the translation.

  The translation algorithm the server uses to derive PATH_TRANSLATED
  is implementation-defined; CGI scripts which use this variable may
  suffer limited portability.

  The server SHOULD set this meta-variable if the request URI includes
  a path-info component.  If PATH_INFO is NULL, then the
  PATH_TRANSLATED variable MUST be set to NULL (or unset).

4.1.7.  QUERY_STRING

  The QUERY_STRING variable contains a URL-encoded search or parameter
  string; it provides information to the CGI script to affect or refine
  the document to be returned by the script.

  The URL syntax for a search string is described in section 3 of RFC
  2396 [2].  The QUERY_STRING value is case-sensitive.

     QUERY_STRING = query-string
     query-string = *uric
     uric         = reserved | unreserved | escaped

  When parsing and decoding the query string, the details of the
  parsing, reserved characters and support for non US-ASCII characters
  depends on the context.  For example, form submission from an HTML
  document [18] uses application/x-www-form-urlencoded encoding, in
  which the characters "+", "&" and "=" are reserved, and the ISO
  8859-1 encoding may be used for non US-ASCII characters.

  The QUERY_STRING value provides the query-string part of the
  Script-URI.  (See section 3.3).

  The server MUST set this variable; if the Script-URI does not include
  a query component, the QUERY_STRING MUST be defined as an empty
  string ("").

4.1.8.  REMOTE_ADDR

  The REMOTE_ADDR variable MUST be set to the network address of the
  client sending the request to the server.




Robinson & Coar              Informational                     [Page 15]

RFC 3875                    CGI Version 1.1                 October 2004


     REMOTE_ADDR  = hostnumber
     hostnumber   = ipv4-address | ipv6-address
     ipv4-address = 1*3digit "." 1*3digit "." 1*3digit "." 1*3digit
     ipv6-address = hexpart [ ":" ipv4-address ]
     hexpart      = hexseq | ( [ hexseq ] "::" [ hexseq ] )
     hexseq       = 1*4hex *( ":" 1*4hex )

  The format of an IPv6 address is described in RFC 3513 [15].

4.1.9.  REMOTE_HOST

  The REMOTE_HOST variable contains the fully qualified domain name of
  the client sending the request to the server, if available, otherwise
  NULL.  Fully qualified domain names take the form as described in
  section 3.5 of RFC 1034 [17] and section 2.1 of RFC 1123 [12].
  Domain names are not case sensitive.

     REMOTE_HOST   = "" | hostname | hostnumber
     hostname      = *( domainlabel "." ) toplabel [ "." ]
     domainlabel   = alphanum [ *alphahypdigit alphanum ]
     toplabel      = alpha [ *alphahypdigit alphanum ]
     alphahypdigit = alphanum | "-"

  The server SHOULD set this variable.  If the hostname is not
  available for performance reasons or otherwise, the server MAY
  substitute the REMOTE_ADDR value.

4.1.10.  REMOTE_IDENT

  The REMOTE_IDENT variable MAY be used to provide identity information
  reported about the connection by an RFC 1413 [20] request to the
  remote agent, if available.  The server may choose not to support
  this feature, or not to request the data for efficiency reasons, or
  not to return available identity data.

     REMOTE_IDENT = *TEXT

  The data returned may be used for authentication purposes, but the
  level of trust reposed in it should be minimal.

4.1.11.  REMOTE_USER

  The REMOTE_USER variable provides a user identification string
  supplied by client as part of user authentication.

     REMOTE_USER = *TEXT





Robinson & Coar              Informational                     [Page 16]

RFC 3875                    CGI Version 1.1                 October 2004


  If the client request required HTTP Authentication [5] (e.g., the
  AUTH_TYPE meta-variable is set to "Basic" or "Digest"), then the
  value of the REMOTE_USER meta-variable MUST be set to the user-ID
  supplied.

4.1.12.  REQUEST_METHOD

  The REQUEST_METHOD meta-variable MUST be set to the method which
  should be used by the script to process the request, as described in
  section 4.3.

     REQUEST_METHOD   = method
     method           = "GET" | "POST" | "HEAD" | extension-method
     extension-method = "PUT" | "DELETE" | token

  The method is case sensitive.  The HTTP methods are described in
  section 5.1.1 of the HTTP/1.0 specification [1] and section 5.1.1 of
  the HTTP/1.1 specification [4].

4.1.13.  SCRIPT_NAME

  The SCRIPT_NAME variable MUST be set to a URI path (not URL-encoded)
  which could identify the CGI script (rather than the script's
  output).  The syntax is the same as for PATH_INFO (section 4.1.5)

     SCRIPT_NAME = "" | ( "/" path )

  The leading "/" is not part of the path.  It is optional if the path
  is NULL; however, the variable MUST still be set in that case.

  The SCRIPT_NAME string forms some leading part of the path component
  of the Script-URI derived in some implementation-defined manner.  No
  PATH_INFO segment (see section 4.1.5) is included in the SCRIPT_NAME
  value.

4.1.14.  SERVER_NAME

  The SERVER_NAME variable MUST be set to the name of the server host
  to which the client request is directed.  It is a case-insensitive
  hostname or network address.  It forms the host part of the
  Script-URI.

     SERVER_NAME = server-name
     server-name = hostname | ipv4-address | ( "[" ipv6-address "]" )







Robinson & Coar              Informational                     [Page 17]

RFC 3875                    CGI Version 1.1                 October 2004


  A deployed server can have more than one possible value for this
  variable, where several HTTP virtual hosts share the same IP address.
  In that case, the server would use the contents of the request's Host
  header field to select the correct virtual host.

4.1.15.  SERVER_PORT

  The SERVER_PORT variable MUST be set to the TCP/IP port number on
  which this request is received from the client.  This value is used
  in the port part of the Script-URI.

     SERVER_PORT = server-port
     server-port = 1*digit

  Note that this variable MUST be set, even if the port is the default
  port for the scheme and could otherwise be omitted from a URI.

4.1.16.  SERVER_PROTOCOL

  The SERVER_PROTOCOL variable MUST be set to the name and version of
  the application protocol used for this CGI request.  This MAY differ
  from the protocol version used by the server in its communication
  with the client.

     SERVER_PROTOCOL   = HTTP-Version | "INCLUDED" | extension-version
     HTTP-Version      = "HTTP" "/" 1*digit "." 1*digit
     extension-version = protocol [ "/" 1*digit "." 1*digit ]
     protocol          = token

  Here, 'protocol' defines the syntax of some of the information
  passing between the server and the script (the 'protocol-specific'
  features).  It is not case sensitive and is usually presented in
  upper case.  The protocol is not the same as the scheme part of the
  script URI, which defines the overall access mechanism used by the
  client to communicate with the server.  For example, a request that
  reaches the script with a protocol of "HTTP" may have used an "https"
  scheme.

  A well-known value for SERVER_PROTOCOL which the server MAY use is
  "INCLUDED", which signals that the current document is being included
  as part of a composite document, rather than being the direct target
  of the client request.  The script should treat this as an HTTP/1.0
  request.








Robinson & Coar              Informational                     [Page 18]

RFC 3875                    CGI Version 1.1                 October 2004


4.1.17.  SERVER_SOFTWARE

  The SERVER_SOFTWARE meta-variable MUST be set to the name and version
  of the information server software making the CGI request (and
  running the gateway).  It SHOULD be the same as the server
  description reported to the client, if any.

     SERVER_SOFTWARE = 1*( product | comment )
     product         = token [ "/" product-version ]
     product-version = token
     comment         = "(" *( ctext | comment ) ")"
     ctext           = <any TEXT excluding "(" and ")">

4.1.18.  Protocol-Specific Meta-Variables

  The server SHOULD set meta-variables specific to the protocol and
  scheme for the request.  Interpretation of protocol-specific
  variables depends on the protocol version in SERVER_PROTOCOL.  The
  server MAY set a meta-variable with the name of the scheme to a
  non-NULL value if the scheme is not the same as the protocol.  The
  presence of such a variable indicates to a script which scheme is
  used by the request.

  Meta-variables with names beginning with "HTTP_" contain values read
  from the client request header fields, if the protocol used is HTTP.
  The HTTP header field name is converted to upper case, has all
  occurrences of "-" replaced with "_" and has "HTTP_" prepended to
  give the meta-variable name.  The header data can be presented as
  sent by the client, or can be rewritten in ways which do not change
  its semantics.  If multiple header fields with the same field-name
  are received then the server MUST rewrite them as a single value
  having the same semantics.  Similarly, a header field that spans
  multiple lines MUST be merged onto a single line.  The server MUST,
  if necessary, change the representation of the data (for example, the
  character set) to be appropriate for a CGI meta-variable.

  The server is not required to create meta-variables for all the
  header fields that it receives.  In particular, it SHOULD remove any
  header fields carrying authentication information, such as
  'Authorization'; or that are available to the script in other
  variables, such as 'Content-Length' and 'Content-Type'.  The server
  MAY remove header fields that relate solely to client-side
  communication issues, such as 'Connection'.








Robinson & Coar              Informational                     [Page 19]

RFC 3875                    CGI Version 1.1                 October 2004


4.2.  Request Message-Body

  Request data is accessed by the script in a system-defined method;
  unless defined otherwise, this will be by reading the 'standard
  input' file descriptor or file handle.

     Request-Data   = [ request-body ] [ extension-data ]
     request-body   = <CONTENT_LENGTH>OCTET
     extension-data = *OCTET

  A request-body is supplied with the request if the CONTENT_LENGTH is
  not NULL.  The server MUST make at least that many bytes available
  for the script to read.  The server MAY signal an end-of-file
  condition after CONTENT_LENGTH bytes have been read or it MAY supply
  extension data.  Therefore, the script MUST NOT attempt to read more
  than CONTENT_LENGTH bytes, even if more data is available.  However,
  it is not obliged to read any of the data.

  For non-parsed header (NPH) scripts (section 5), the server SHOULD
  attempt to ensure that the data supplied to the script is precisely
  as supplied by the client and is unaltered by the server.

  As transfer-codings are not supported on the request-body, the server
  MUST remove any such codings from the message-body, and recalculate
  the CONTENT_LENGTH.  If this is not possible (for example, because of
  large buffering requirements), the server SHOULD reject the client
  request.  It MAY also remove content-codings from the message-body.

4.3.  Request Methods

  The Request Method, as supplied in the REQUEST_METHOD meta-variable,
  identifies the processing method to be applied by the script in
  producing a response.  The script author can choose to implement the
  methods most appropriate for the particular application.  If the
  script receives a request with a method it does not support it SHOULD
  reject it with an error (see section 6.3.3).

4.3.1.  GET

  The GET method indicates that the script should produce a document
  based on the meta-variable values.  By convention, the GET method is
  'safe' and 'idempotent' and SHOULD NOT have the significance of
  taking an action other than producing a document.

  The meaning of the GET method may be modified and refined by
  protocol-specific meta-variables.





Robinson & Coar              Informational                     [Page 20]

RFC 3875                    CGI Version 1.1                 October 2004


4.3.2.  POST

  The POST method is used to request the script perform processing and
  produce a document based on the data in the request message-body, in
  addition to meta-variable values.  A common use is form submission in
  HTML [18], intended to initiate processing by the script that has a
  permanent affect, such a change in a database.

  The script MUST check the value of the CONTENT_LENGTH variable before
  reading the attached message-body, and SHOULD check the CONTENT_TYPE
  value before processing it.

4.3.3.  HEAD

  The HEAD method requests the script to do sufficient processing to
  return the response header fields, without providing a response
  message-body.  The script MUST NOT provide a response message-body
  for a HEAD request.  If it does, then the server MUST discard the
  message-body when reading the response from the script.

4.3.4.  Protocol-Specific Methods

  The script MAY implement any protocol-specific method, such as
  HTTP/1.1 PUT and DELETE; it SHOULD check the value of SERVER_PROTOCOL
  when doing so.

  The server MAY decide that some methods are not appropriate or
  permitted for a script, and may handle the methods itself or return
  an error to the client.

4.4.  The Script Command Line

  Some systems support a method for supplying an array of strings to
  the CGI script.  This is only used in the case of an 'indexed' HTTP
  query, which is identified by a 'GET' or 'HEAD' request with a URI
  query string that does not contain any unencoded "=" characters.  For
  such a request, the server SHOULD treat the query-string as a
  search-string and parse it into words, using the rules

     search-string = search-word *( "+" search-word )
     search-word   = 1*schar
     schar         = unreserved | escaped | xreserved
     xreserved     = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "," |
                     "$"

  After parsing, each search-word is URL-decoded, optionally encoded in
  a system-defined manner and then added to the command line argument
  list.



Robinson & Coar              Informational                     [Page 21]

RFC 3875                    CGI Version 1.1                 October 2004


  If the server cannot create any part of the argument list, then the
  server MUST NOT generate any command line information.  For example,
  the number of arguments may be greater than operating system or
  server limits, or one of the words may not be representable as an
  argument.

  The script SHOULD check to see if the QUERY_STRING value contains an
  unencoded "=" character, and SHOULD NOT use the command line
  arguments if it does.

5.  NPH Scripts

5.1.  Identification

  The server MAY support NPH (Non-Parsed Header) scripts; these are
  scripts to which the server passes all responsibility for response
  processing.

  This specification provides no mechanism for an NPH script to be
  identified on the basis of its output data alone.  By convention,
  therefore, any particular script can only ever provide output of one
  type (NPH or CGI) and hence the script itself is described as an 'NPH
  script'.  A server with NPH support MUST provide an implementation-
  defined mechanism for identifying NPH scripts, perhaps based on the
  name or location of the script.

5.2.  NPH Response

  There MUST be a system-defined method for the script to send data
  back to the server or client; a script MUST always return some data.
  Unless defined otherwise, this will be the same as for conventional
  CGI scripts.

  Currently, NPH scripts are only defined for HTTP client requests.  An
  (HTTP) NPH script MUST return a complete HTTP response message,
  currently described in section 6 of the HTTP specifications [1], [4].
  The script MUST use the SERVER_PROTOCOL variable to determine the
  appropriate format for a response.  It MUST also take account of any
  generic or protocol-specific meta-variables in the request as might
  be mandated by the particular protocol specification.

  The server MUST ensure that the script output is sent to the client
  unmodified.  Note that this requires the script to use the correct
  character set (US-ASCII [9] and ISO 8859-1 [10] for HTTP) in the
  header fields.  The server SHOULD attempt to ensure that the script
  output is sent directly to the client, with minimal internal and no
  transport-visible buffering.




Robinson & Coar              Informational                     [Page 22]

RFC 3875                    CGI Version 1.1                 October 2004


  Unless the implementation defines otherwise, the script MUST NOT
  indicate in its response that the client can send further requests
  over the same connection.

6.  CGI Response

6.1.  Response Handling

  A script MUST always provide a non-empty response, and so there is a
  system-defined method for it to send this data back to the server.
  Unless defined otherwise, this will be via the 'standard output' file
  descriptor.

  The script MUST check the REQUEST_METHOD variable when processing the
  request and preparing its response.

  The server MAY implement a timeout period within which data must be
  received from the script.  If a server implementation defines such a
  timeout and receives no data from a script within the timeout period,
  the server MAY terminate the script process.

6.2.  Response Types

  The response comprises a message-header and a message-body, separated
  by a blank line.  The message-header contains one or more header
  fields.  The body may be NULL.

     generic-response = 1*header-field NL [ response-body ]

  The script MUST return one of either a document response, a local
  redirect response or a client redirect (with optional document)
  response.  In the response definitions below, the order of header
  fields in a response is not significant (despite appearing so in the
  BNF).  The header fields are defined in section 6.3.

     CGI-Response = document-response | local-redir-response |
                    client-redir-response | client-redirdoc-response

6.2.1.  Document Response

  The CGI script can return a document to the user in a document
  response, with an optional error code indicating the success status
  of the response.

     document-response = Content-Type [ Status ] *other-field NL
                         response-body





Robinson & Coar              Informational                     [Page 23]

RFC 3875                    CGI Version 1.1                 October 2004


  The script MUST return a Content-Type header field.  A Status header
  field is optional, and status 200 'OK' is assumed if it is omitted.
  The server MUST make any appropriate modifications to the script's
  output to ensure that the response to the client complies with the
  response protocol version.

6.2.2.  Local Redirect Response

  The CGI script can return a URI path and query-string
  ('local-pathquery') for a local resource in a Location header field.
  This indicates to the server that it should reprocess the request
  using the path specified.

     local-redir-response = local-Location NL

  The script MUST NOT return any other header fields or a message-body,
  and the server MUST generate the response that it would have produced
  in response to a request containing the URL

     scheme "://" server-name ":" server-port local-pathquery

6.2.3.  Client Redirect Response

  The CGI script can return an absolute URI path in a Location header
  field, to indicate to the client that it should reprocess the request
  using the URI specified.

     client-redir-response = client-Location *extension-field NL

  The script MUST not provide any other header fields, except for
  server-defined CGI extension fields.  For an HTTP client request, the
  server MUST generate a 302 'Found' HTTP response message.

6.2.4.  Client Redirect Response with Document

  The CGI script can return an absolute URI path in a Location header
  field together with an attached document, to indicate to the client
  that it should reprocess the request using the URI specified.

     client-redirdoc-response = client-Location Status Content-Type
                                *other-field NL response-body

  The Status header field MUST be supplied and MUST contain a status
  value of 302 'Found', or it MAY contain an extension-code, that is,
  another valid status code that means client redirection.  The server
  MUST make any appropriate modifications to the script's output to
  ensure that the response to the client complies with the response
  protocol version.



Robinson & Coar              Informational                     [Page 24]

RFC 3875                    CGI Version 1.1                 October 2004


6.3.  Response Header Fields

  The response header fields are either CGI or extension header fields
  to be interpreted by the server, or protocol-specific header fields
  to be included in the response returned to the client.  At least one
  CGI field MUST be supplied; each CGI field MUST NOT appear more than
  once in the response.  The response header fields have the syntax:

     header-field    = CGI-field | other-field
     CGI-field       = Content-Type | Location | Status
     other-field     = protocol-field | extension-field
     protocol-field  = generic-field
     extension-field = generic-field
     generic-field   = field-name ":" [ field-value ] NL
     field-name      = token
     field-value     = *( field-content | LWSP )
     field-content   = *( token | separator | quoted-string )

  The field-name is not case sensitive.  A NULL field value is
  equivalent to a field not being sent.  Note that each header field in
  a CGI-Response MUST be specified on a single line; CGI/1.1 does not
  support continuation lines.  Whitespace is permitted between the ":"
  and the field-value (but not between the field-name and the ":"), and
  also between tokens in the field-value.

6.3.1.  Content-Type

  The Content-Type response field sets the Internet Media Type [6] of
  the entity body.

     Content-Type = "Content-Type:" media-type NL

  If an entity body is returned, the script MUST supply a Content-Type
  field in the response.  If it fails to do so, the server SHOULD NOT
  attempt to determine the correct content type.  The value SHOULD be
  sent unmodified to the client, except for any charset parameter
  changes.

  Unless it is otherwise system-defined, the default charset assumed by
  the client for text media-types is ISO-8859-1 if the protocol is HTTP
  and US-ASCII otherwise.  Hence the script SHOULD include a charset
  parameter.  See section 3.4.1 of the HTTP/1.1 specification [4] for a
  discussion of this issue.








Robinson & Coar              Informational                     [Page 25]

RFC 3875                    CGI Version 1.1                 October 2004


6.3.2.  Location

  The Location header field is used to specify to the server that the
  script is returning a reference to a document rather than an actual
  document (see sections 6.2.3 and 6.2.4).  It is either an absolute
  URI (optionally with a fragment identifier), indicating that the
  client is to fetch the referenced document, or a local URI path
  (optionally with a query string), indicating that the server is to
  fetch the referenced document and return it to the client as the
  response.

     Location        = local-Location | client-Location
     client-Location = "Location:" fragment-URI NL
     local-Location  = "Location:" local-pathquery NL
     fragment-URI    = absoluteURI [ "#" fragment ]
     fragment        = *uric
     local-pathquery = abs-path [ "?" query-string ]
     abs-path        = "/" path-segments
     path-segments   = segment *( "/" segment )
     segment         = *pchar
     pchar           = unreserved | escaped | extra
     extra           = ":" | "@" | "&" | "=" | "+" | "$" | ","

  The syntax of an absoluteURI is incorporated into this document from
  that specified in RFC 2396 [2] and RFC 2732 [7].  A valid absoluteURI
  always starts with the name of scheme followed by ":"; scheme names
  start with a letter and continue with alphanumerics, "+", "-" or ".".
  The local URI path and query must be an absolute path, and not a
  relative path or NULL, and hence must start with a "/".

  Note that any message-body attached to the request (such as for a
  POST request) may not be available to the resource that is the target
  of the redirect.

6.3.3.  Status

  The Status header field contains a 3-digit integer result code that
  indicates the level of success of the script's attempt to handle the
  request.

     Status         = "Status:" status-code SP reason-phrase NL
     status-code    = "200" | "302" | "400" | "501" | extension-code
     extension-code = 3digit
     reason-phrase  = *TEXT

  Status code 200 'OK' indicates success, and is the default value
  assumed for a document response.  Status code 302 'Found' is used
  with a Location header field and response message-body.  Status code



Robinson & Coar              Informational                     [Page 26]

RFC 3875                    CGI Version 1.1                 October 2004


  400 'Bad Request' may be used for an unknown request format, such as
  a missing CONTENT_TYPE.  Status code 501 'Not Implemented' may be
  returned by a script if it receives an unsupported REQUEST_METHOD.

  Other valid status codes are listed in section 6.1.1 of the HTTP
  specifications [1], [4], and also the IANA HTTP Status Code Registry
  [8] and MAY be used in addition to or instead of the ones listed
  above.  The script SHOULD check the value of SERVER_PROTOCOL before
  using HTTP/1.1 status codes.  The script MAY reject with error 405
  'Method Not Allowed' HTTP/1.1 requests made using a method it does
  not support.

  Note that returning an error status code does not have to mean an
  error condition with the script itself.  For example, a script that
  is invoked as an error handler by the server should return the code
  appropriate to the server's error condition.

  The reason-phrase is a textual description of the error to be
  returned to the client for human consumption.

6.3.4.  Protocol-Specific Header Fields

  The script MAY return any other header fields that relate to the
  response message defined by the specification for the SERVER_PROTOCOL
  (HTTP/1.0 [1] or HTTP/1.1 [4]).  The server MUST translate the header
  data from the CGI header syntax to the HTTP header syntax if these
  differ.  For example, the character sequence for newline (such as
  UNIX's US-ASCII LF) used by CGI scripts may not be the same as that
  used by HTTP (US-ASCII CR followed by LF).

  The script MUST NOT return any header fields that relate to
  client-side communication issues and could affect the server's
  ability to send the response to the client.  The server MAY remove
  any such header fields returned by the client.  It SHOULD resolve any
  conflicts between header fields returned by the script and header
  fields that it would otherwise send itself.

6.3.5.  Extension Header Fields

  There may be additional implementation-defined CGI header fields,
  whose field names SHOULD begin with "X-CGI-".  The server MAY ignore
  (and delete) any unrecognised header fields with names beginning "X-
  CGI-" that are received from the script.








Robinson & Coar              Informational                     [Page 27]

RFC 3875                    CGI Version 1.1                 October 2004


6.4.  Response Message-Body

  The response message-body is an attached document to be returned to
  the client by the server.  The server MUST read all the data provided
  by the script, until the script signals the end of the message-body
  by way of an end-of-file condition.  The message-body SHOULD be sent
  unmodified to the client, except for HEAD requests or any required
  transfer-codings, content-codings or charset conversions.

     response-body = *OCTET

7.  System Specifications

7.1.  AmigaDOS

  Meta-Variables
     Meta-variables are passed to the script in identically named
     environment variables.  These are accessed by the DOS library
     routine GetVar().  The flags argument SHOULD be 0.  Case is
     ignored, but upper case is recommended for compatibility with
     case-sensitive systems.

  The current working directory
     The current working directory for the script is set to the
     directory containing the script.

  Character set
     The US-ASCII character set [9] is used for the definition of
     meta-variables, header fields and values; the newline (NL)
     sequence is LF; servers SHOULD also accept CR LF as a newline.

7.2.  UNIX

  For UNIX compatible operating systems, the following are defined:

  Meta-Variables
     Meta-variables are passed to the script in identically named
     environment variables.  These are accessed by the C library
     routine getenv() or variable environ.

  The command line
     This is accessed using the argc and argv arguments to main().  The
     words have any characters which are 'active' in the Bourne shell
     escaped with a backslash.

  The current working directory
     The current working directory for the script SHOULD be set to the
     directory containing the script.



Robinson & Coar              Informational                     [Page 28]

RFC 3875                    CGI Version 1.1                 October 2004


  Character set
     The US-ASCII character set [9], excluding NUL, is used for the
     definition of meta-variables, header fields and CHAR values; TEXT
     values use ISO-8859-1.  The PATH_TRANSLATED value can contain any
     8-bit byte except NUL.  The newline (NL) sequence is LF; servers
     should also accept CR LF as a newline.

7.3.  EBCDIC/POSIX

  For POSIX compatible operating systems using the EBCDIC character
  set, the following are defined:

  Meta-Variables
     Meta-variables are passed to the script in identically named
     environment variables.  These are accessed by the C library
     routine getenv().

  The command line
     This is accessed using the argc and argv arguments to main().  The
     words have any characters which are 'active' in the Bourne shell
     escaped with a backslash.

  The current working directory
     The current working directory for the script SHOULD be set to the
     directory containing the script.

  Character set
     The IBM1047 character set [21], excluding NUL, is used for the
     definition of meta-variables, header fields, values, TEXT strings
     and the PATH_TRANSLATED value.  The newline (NL) sequence is LF;
     servers should also accept CR LF as a newline.

  media-type charset default
     The default charset value for text (and other implementation-
     defined) media types is IBM1047.

8.  Implementation

8.1.  Recommendations for Servers

  Although the server and the CGI script need not be consistent in
  their handling of URL paths (client URLs and the PATH_INFO data,
  respectively), server authors may wish to impose consistency.  So the
  server implementation should specify its behaviour for the following
  cases:

     1. define any restrictions on allowed path segments, in particular
        whether non-terminal NULL segments are permitted;



Robinson & Coar              Informational                     [Page 29]

RFC 3875                    CGI Version 1.1                 October 2004


     2. define the behaviour for "." or ".." path segments; i.e.,
        whether they are prohibited, treated as ordinary path segments
        or interpreted in accordance with the relative URL
        specification [2];

     3. define any limits of the implementation, including limits on
        path or search string lengths, and limits on the volume of
        header fields the server will parse.

8.2.  Recommendations for Scripts

  If the script does not intend processing the PATH_INFO data, then it
  should reject the request with 404 Not Found if PATH_INFO is not
  NULL.

  If the output of a form is being processed, check that CONTENT_TYPE
  is "application/x-www-form-urlencoded" [18] or "multipart/form-data"
  [16].  If CONTENT_TYPE is blank, the script can reject the request
  with a 415 'Unsupported Media Type' error, where supported by the
  protocol.

  When parsing PATH_INFO, PATH_TRANSLATED or SCRIPT_NAME the script
  should be careful of void path segments ("//") and special path
  segments ("." and "..").  They should either be removed from the path
  before use in OS system calls, or the request should be rejected with
  404 'Not Found'.

  When returning header fields, the script should try to send the CGI
  header fields as soon as possible, and should send them before any
  HTTP header fields.  This may help reduce the server's memory
  requirements.

  Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST
  meta-variables (see sections 4.1.8 and 4.1.9) may not identify the
  ultimate source of the request.  They identify the client for the
  immediate request to the server; that client may be a proxy, gateway,
  or other intermediary acting on behalf of the actual source client.

9.  Security Considerations

9.1.  Safe Methods

  As discussed in the security considerations of the HTTP
  specifications [1], [4], the convention has been established that the
  GET and HEAD methods should be 'safe' and 'idempotent' (repeated
  requests have the same effect as a single request).  See section 9.1
  of RFC 2616 [4] for a full discussion.




Robinson & Coar              Informational                     [Page 30]

RFC 3875                    CGI Version 1.1                 October 2004


9.2.  Header Fields Containing Sensitive Information

  Some HTTP header fields may carry sensitive information which the
  server should not pass on to the script unless explicitly configured
  to do so.  For example, if the server protects the script by using
  the Basic authentication scheme, then the client will send an
  Authorization header field containing a username and password.  The
  server validates this information and so it should not pass on the
  password via the HTTP_AUTHORIZATION meta-variable without careful
  consideration.  This also applies to the Proxy-Authorization header
  field and the corresponding HTTP_PROXY_AUTHORIZATION meta-variable.

9.3.  Data Privacy

  Confidential data in a request should be placed in a message-body as
  part of a POST request, and not placed in the URI or message headers.
  On some systems, the environment used to pass meta-variables to a
  script may be visible to other scripts or users.  In addition, many
  existing servers, proxies and clients will permanently record the URI
  where it might be visible to third parties.

9.4.  Information Security Model

  For a client connection using TLS, the security model applies between
  the client and the server, and not between the client and the script.
  It is the server's responsibility to handle the TLS session, and thus
  it is the server which is authenticated to the client, not the CGI
  script.

  This specification provides no mechanism for the script to
  authenticate the server which invoked it.  There is no enforced
  integrity on the CGI request and response messages.

9.5.  Script Interference with the Server

  The most common implementation of CGI invokes the script as a child
  process using the same user and group as the server process.  It
  should therefore be ensured that the script cannot interfere with the
  server process, its configuration, documents or log files.

  If the script is executed by calling a function linked in to the
  server software (either at compile-time or run-time) then precautions
  should be taken to protect the core memory of the server, or to
  ensure that untrusted code cannot be executed.







Robinson & Coar              Informational                     [Page 31]

RFC 3875                    CGI Version 1.1                 October 2004


9.6.  Data Length and Buffering Considerations

  This specification places no limits on the length of the message-body
  presented to the script.  The script should not assume that
  statically allocated buffers of any size are sufficient to contain
  the entire submission at one time.  Use of a fixed length buffer
  without careful overflow checking may result in an attacker
  exploiting 'stack-smashing' or 'stack-overflow' vulnerabilities of
  the operating system.  The script may spool large submissions to disk
  or other buffering media, but a rapid succession of large submissions
  may result in denial of service conditions.  If the CONTENT_LENGTH of
  a message-body is larger than resource considerations allow, scripts
  should respond with an error status appropriate for the protocol
  version; potentially applicable status codes include 503 'Service
  Unavailable' (HTTP/1.0 and HTTP/1.1), 413 'Request Entity Too Large'
  (HTTP/1.1), and 414 'Request-URI Too Large' (HTTP/1.1).

  Similar considerations apply to the server's handling of the CGI
  response from the script.  There is no limit on the length of the
  header or message-body returned by the script; the server should not
  assume that statically allocated buffers of any size are sufficient
  to contain the entire response.

9.7.  Stateless Processing

  The stateless nature of the Web makes each script execution and
  resource retrieval independent of all others even when multiple
  requests constitute a single conceptual Web transaction.  Because of
  this, a script should not make any assumptions about the context of
  the user-agent submitting a request.  In particular, scripts should
  examine data obtained from the client and verify that they are valid,
  both in form and content, before allowing them to be used for
  sensitive purposes such as input to other applications, commands, or
  operating system services.  These uses include (but are not limited
  to) system call arguments, database writes, dynamically evaluated
  source code, and input to billing or other secure processes.  It is
  important that applications be protected from invalid input
  regardless of whether the invalidity is the result of user error,
  logic error, or malicious action.

  Authors of scripts involved in multi-request transactions should be
  particularly cautious about validating the state information;
  undesirable effects may result from the substitution of dangerous
  values for portions of the submission which might otherwise be
  presumed safe.  Subversion of this type occurs when alterations are
  made to data from a prior stage of the transaction that were not
  meant to be controlled by the client (e.g., hidden HTML form
  elements, cookies, embedded URLs, etc.).



Robinson & Coar              Informational                     [Page 32]

RFC 3875                    CGI Version 1.1                 October 2004


9.8.  Relative Paths

  The server should be careful of ".." path segments in the request
  URI.  These should be removed or resolved in the request URI before
  it is split into the script-path and extra-path.  Alternatively, when
  the extra-path is used to find the PATH_TRANSLATED, care should be
  taken to avoid the path resolution from providing translated paths
  outside an expected path hierarchy.

9.9.  Non-parsed Header Output

  If a script returns a non-parsed header output, to be interpreted by
  the client in its native protocol, then the script must address all
  security considerations relating to that protocol.

10.  Acknowledgements

  This work is based on the original CGI interface that arose out of
  discussions on the 'www-talk' mailing list.  In particular, Rob
  McCool, John Franks, Ari Luotonen, George Phillips and Tony Sanders
  deserve special recognition for their efforts in defining and
  implementing the early versions of this interface.

  This document has also greatly benefited from the comments and
  suggestions made Chris Adie, Dave Kristol and Mike Meyer; also David
  Morris, Jeremy Madea, Patrick McManus, Adam Donahue, Ross Patterson
  and Harald Alvestrand.

11.  References

11.1  Normative References

  [1]  Berners-Lee, T., Fielding, R. and H. Frystyk, "Hypertext
       Transfer Protocol -- HTTP/1.0", RFC 1945, May 1996.

  [2]  Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource
       Identifiers (URI) : Generic Syntax", RFC 2396, August 1998.

  [3]  Bradner, S., "Key words for use in RFCs to Indicate Requirements
       Levels", BCP 14, RFC 2119, March 1997.

  [4]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
       Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol --
       HTTP/1.1", RFC 2616, June 1999.

  [5]  Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S.,
       Leach, P., Luotonen, A., and L. Stewart, "HTTP Authentication:
       Basic and Digest Access Authentication", RFC 2617, June 1999.



Robinson & Coar              Informational                     [Page 33]

RFC 3875                    CGI Version 1.1                 October 2004


  [6]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
       Extensions (MIME) Part Two: Media Types", RFC 2046, November
       1996.

  [7]  Hinden, R., Carpenter, B., and L. Masinter, "Format for Literal
       IPv6 Addresses in URL's", RFC 2732, December 1999.

  [8]  "HTTP Status Code Registry",
       http://www.iana.org/assignments/http-status-codes, IANA.

  [9]  "Information Systems -- Coded Character Sets -- 7-bit American
       Standard Code for Information Interchange (7-Bit ASCII)", ANSI
       INCITS.4-1986 (R2002).

  [10] "Information technology -- 8-bit single-byte coded graphic
       character sets -- Part 1: Latin alphabet No. 1", ISO/IEC
       8859-1:1998.

11.2.  Informative References

  [11] Berners-Lee, T., "Universal Resource Identifiers in WWW: A
       Unifying Syntax for the Expression of Names and Addresses of
       Objects on the Network as used in the World-Wide Web", RFC 1630,
       June 1994.

  [12] Braden, R., Ed., "Requirements for Internet Hosts -- Application
       and Support", STD 3, RFC 1123, October 1989.

  [13] Crocker, D., "Standard for the Format of ARPA Internet Text
       Messages", STD 11, RFC 822, August 1982.

  [14] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", RFC
       2246, January 1999.

  [15] Hinden R. and S. Deering, "Internet Protocol Version 6 (IPv6)
       Addressing Architecture", RFC 3513, April 2003.

  [16] Masinter, L., "Returning Values from Forms:
       multipart/form-data", RFC 2388, August 1998.

  [17] Mockapetris, P., "Domain Names - Concepts and Facilities", STD
       13, RFC 1034, November 1987.

  [18] Raggett, D., Le Hors, A., and I. Jacobs, Eds., "HTML 4.01
       Specification", W3C Recommendation December 1999,
       http://www.w3.org/TR/html401/.

  [19] Rescola, E. "HTTP Over TLS", RFC 2818, May 2000.



Robinson & Coar              Informational                     [Page 34]

RFC 3875                    CGI Version 1.1                 October 2004


  [20] St. Johns, M., "Identification Protocol", RFC 1413, February
       1993.

  [21] IBM National Language Support Reference Manual Volume 2,
       SE09-8002-01, March 1990.

  [22] "The Common Gateway Interface",
       http://hoohoo.ncsa.uiuc.edu/cgi/, NCSA, University of Illinois.

12.  Authors' Addresses

  David Robinson
  The Apache Software Foundation

  EMail: [email protected]


  Ken A. L. Coar
  The Apache Software Foundation

  EMail: [email protected]






























Robinson & Coar              Informational                     [Page 35]

RFC 3875                    CGI Version 1.1                 October 2004


13.  Full Copyright Statement

  Copyright (C) The Internet Society (2004).  This document is subject
  to the rights, licenses and restrictions contained in BCP 78 and at
  www.rfc-editor.org, and except as set forth therein, the authors
  retain all their rights.

  This document and the information contained herein are provided on an
  "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
  OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
  ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
  INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
  INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
  WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

  Intellectual Property

  The IETF takes no position regarding the validity or scope of any
  Intellectual Property Rights or other rights that might be claimed to
  pertain to the implementation or use of the technology described in
  this document or the extent to which any license under such rights
  might or might not be available; nor does it represent that it has
  made any independent effort to identify any such rights.  Information
  on the ISOC's procedures with respect to rights in ISOC Documents can
  be found in BCP 78 and BCP 79.

  Copies of IPR disclosures made to the IETF Secretariat and any
  assurances of licenses to be made available, or the result of an
  attempt made to obtain a general license or permission for the use of
  such proprietary rights by implementers or users of this
  specification can be obtained from the IETF on-line IPR repository at
  http://www.ietf.org/ipr.

  The IETF invites any interested party to bring to its attention any
  copyrights, patents or patent applications, or other proprietary
  rights that may cover technology that may be required to implement
  this standard.  Please address the information to the IETF at ietf-
  [email protected].

Acknowledgement

  Funding for the RFC Editor function is currently provided by the
  Internet Society.








Robinson & Coar              Informational                     [Page 36]