Network Working Group                                         A. Bhushan
Request for Comments: 310                                        MIT-MAC
NIC: 9261                                                  April 3, 1972


           Another Look At Data And File Transfer Protocols

  Our experience with ad hoc techniques of data and file transfer over
  the ARPANET together with a better knowledge of terminal IMP (TIP)
  capabilities and Datacomputer requirements has indicated to us that
  the Data Transfer Protocol (DTP) (see ref 1) and the File Transfer
  Protocol (FTP) (see ref 2) could undergo revision.  Our effort in
  implementing DTP and FTP has revealed areas in which the protocols
  could be simplified without degrading their usefulness.

  This paper suggests some specific changes in DTP and FTP that should
  make them more useful and/or simplify implementation.  The attempt
  here is to stimulate thinking so that we may come up with a better
  protocol at the forthcoming Data and File Transfer Workshop (see ref
  3).

Experience to Date

  A number of ad hoc techniques of transmitting data and files across
  the ARPANET already exist.  Perhaps the most versatile of these
  existing methods is the TENEX "CPYNET" system.  The "CPYNET" system
  uses an ad hoc or interim file transfer protocol developed by Ray
  Tomlinson and others at BBN to transmit files among the TENEX systems
  on the ARPANET. [Private Communication with Bill Crowther, BBN.]

  In CPYNET, the using process goes through the Initial Connection
  Protocol (ICP) to server socket 7, establishing a full-duplex
  connection with an 8-bit byte size.  Control information, including
  user name, password, command (read, write, or append), file name, and
  byte size for the data connection is transmitted from the using
  process to the serving process.  The original full-duplex connection
  is then closed, and a new full-duplex connection is established using
  the original socket numbers but with possibly a different byte size.
  The file is now transmitted on this newly established connection.
  The end-of-file is indicated by closing the connection (the mode of
  transfer is thus similar to DTP "indefinite bit-stream").
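
  The CPYNET exchange described above can be read as a fixed sequence
  of steps.  The Python sketch below only models that sequence.  The
  names ControlInfo and cpynet_steps, and the idea of holding the
  control information in a record, are assumptions made for
  illustration; the actual CPYNET wire format is not given in this
  paper.

```python
from dataclasses import dataclass
from typing import List

VALID_COMMANDS = {"read", "write", "append"}


@dataclass
class ControlInfo:
    """Hypothetical record of the control information listed in the text."""
    user: str
    password: str
    command: str          # "read", "write", or "append"
    file_name: str
    data_byte_size: int   # byte size requested for the data connection


def cpynet_steps(info: ControlInfo) -> List[str]:
    """Return the ordered steps of a CPYNET-style transfer, as described."""
    if info.command not in VALID_COMMANDS:
        raise ValueError("unknown command: " + info.command)
    return [
        "ICP to server socket 7; full-duplex connection, byte size 8",
        "send control information (%s %s)" % (info.command, info.file_name),
        "close the original connection",
        "reopen on the same sockets, byte size %d" % info.data_byte_size,
        "transmit the file",
        "close the connection to indicate end-of-file",
    ]


if __name__ == "__main__":
    request = ControlInfo("user", "secret", "read", "SYSTEM.FILE", 36)
    for step in cpynet_steps(request):
        print(step)
```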

  CPYNET has been used quite extensively for transfer of TENEX system
  files.  Because data is not reformatted, and because the optimum
  connection byte size may be used for data transfer, CPYNET is quite
  efficient.  The PDP-10 (and there are quite a few on the ARPANET)
  works more efficiently with a 36-bit byte size, which minimizes
  packing and unpacking of data and increases effective I/O speed



Bhushan                                                         [Page 1]

RFC 310               Another Look At Data And FTP            April 1972


  (bit rate is 36 times the I/O word transfer rate instead of 8 times).
  The closing and reopening of connections does increase overhead, but
  this is small in TENEX when compared with the inefficiency introduced
  in data transfer using an inappropriate byte size.
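
  The effect of byte size can be made concrete with a little
  arithmetic.  The sketch below counts how many connection bytes a
  receiver must handle for a file of 36-bit words.  The file length of
  1000 words is a hypothetical figure chosen for illustration, and
  connection_bytes is a name introduced here.

```python
def connection_bytes(total_bits: int, byte_size: int) -> int:
    """Number of connection bytes needed to carry total_bits (rounded up)."""
    return -(-total_bits // byte_size)


WORD_BITS = 36    # PDP-10 word length
WORDS = 1000      # hypothetical file length, in words

if __name__ == "__main__":
    total_bits = WORDS * WORD_BITS
    for byte_size in (8, 36):
        print(byte_size, connection_bytes(total_bits, byte_size))
    # prints "8 4500" and then "36 1000"
```

  With an 8-bit byte size each word spans four and a half bytes, so
  words straddle byte boundaries and must be packed and unpacked; with
  a 36-bit byte size each byte is exactly one word.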

  Data and file transfer has been achieved at other sites by a simple
  modification of the user TELNET to enable the transfer of ASCII files
  as terminal I/O data streams within the constraints of the TELNET
  protocol.  An example of this approach is the use of the "send.file"
  and "script" features within the MIT-DMCG user-TELNET.  Send.file
  enables the PDP-10 (DMCG) user to transmit his local ASCII files to a
  receiving process such as an editor at the remote host via a TELNET
  connection.  The program allows for a variable buffer size for
  transmission, and measures the transfer rate of information bits.
  Script enables a user to receive an ASCII file from a remote host by
  essentially printing it out (the terminal output stream is directed
  to a local file).

  Our initial experience with the send.file program has affirmed
  the almost linear relationship between buffer size and transmission
  rate (inverse relationship to processing cost) until the limits
  imposed by allocates, NCP sending buffers, the IMP message size, or
  the receiving process speed, are reached.  Our experiments have
  indicated that TELNET processes in which the receiving process
  "looks" at each character are slow and expensive.  The transfer rate
  to most TELNET receiving processes ranges between 200 and 2,000 bits
  per second.  The NCP-to-NCP transmission rate, however, is an order
  of magnitude higher (2,000 to 20,000 bits per second).
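
  To show what these rates mean in practice, the sketch below computes
  the transfer time for one file at each end of the two ranges quoted
  above.  The file size of 100,000 bits is a hypothetical figure; only
  the rates come from the measurements reported here.

```python
def transfer_seconds(file_bits: int, bits_per_second: int) -> float:
    """Time to move file_bits at a steady rate, ignoring connection setup."""
    return file_bits / bits_per_second


FILE_BITS = 100_000                 # hypothetical file size
TELNET_RATES = (200, 2_000)         # bits per second, from the text
NCP_RATES = (2_000, 20_000)         # bits per second, from the text

if __name__ == "__main__":
    for label, rates in (("TELNET", TELNET_RATES), ("NCP-to-NCP", NCP_RATES)):
        slowest = transfer_seconds(FILE_BITS, rates[0])
        fastest = transfer_seconds(FILE_BITS, rates[1])
        print(label, slowest, fastest)
    # prints "TELNET 500.0 50.0" and then "NCP-to-NCP 50.0 5.0"
```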

  A variation of the above method, which avoids the character-by-
  character processing of TELNET, is to transmit files via auxiliary
  connections (other than the TELNET connections), with or without the
  use of DTP.  We are collecting data on response times, connect times
  and transfer speeds employing different transfer and buffering
  strategies.

TIP Capabilities and TIP Users

  It appears now that TIPs will not support DTP in its present form.
  The more elaborate TIPs with magnetic tape units will, however,
  support the DTP block mode (descriptor and count).  [Private
  Communication with Bill Crowther, BBN.]  It is inconvenient, at the
  very least, for a TIP user to use services based on DTP (such as
  remote job service, file transfer, mail, and Datacomputer).  The TIP
  philosophy is that "the computational load and storage should be in
  the hosts or in the terminals and not in the terminal processor."
  (See ref 4.)  To be consistent with this philosophy, the protocols
  should be simple and convenient to use from the user viewpoint.

  Ideally, TIP users would like to connect (using the initial
  connection protocol) to the advertised service socket (including
  logger socket 1) in the remote host and type their commands in a
  uniform, easy-to-use format.  Allowing the use of ASCII within DTP
  would facilitate this.  (An alternate approach is extending TELNET to
  include DTP modes, particularly the indefinite bit-stream mode.)
  Another step would be to use printable ASCII strings instead of
  numeric codes for commands and arguments in user-level protocols.
  Use of standard file system commands (with uniform interpretation and
  format) will lead towards the existence of a Network Virtual File
  System, much in the same line as Network Virtual Terminal defined in
  TELNET protocol.

  The transparent mode in DTP was specifically included to allow
  convenient use by TIPs.  Since the TIPs will not support transparent
  mode, it makes sense to do away with it.  This change would lead to a
  simpler DTP which allows transfer in block mode and in the indefinite
  bit-stream mode (the suggested default, which would be acceptable to
  all, including the TIPs, as it involves no overhead).  We can then
  make optional, or do away with, the now mandatory "modes available"
  handshake.  The using process can indicate whether it also accepts
  the block mode for transfer (either by a "modes available"
  transaction or by an argument in the command string).  The server
  should accept input in DTP mode as well as ASCII.  These fundamental
  changes in DTP will make communication with TIPs a lot easier.

  TIP users who do not have a mediating user-FTP process and a file
  system in their TIP would probably want to transfer files from input
  devices or to output devices such as a line printer, card reader or
  punch, or magnetic tape.  These devices "listen" on specific "ports"
  or sockets on a TIP.  It would be desirable to modify FTP to allow
  sending data to a specified socket in a specified mode and type.  TIP
  users would then find it convenient to obtain listings of their files
  on a high-speed line printer, input their files from a card reader,
  and keep back-ups on cards or magnetic tapes.

Datacomputer Requirements

  We have been having a continuing dialogue with CCA personnel (Dick
  Winter in particular) regarding CCA's plans for data and file
  transfer on the Datacomputer, and their specific requirements.  Dick
  Winter will be speaking on this subject at the Data and File Transfer
  Workshop.  This is an attempt to summarize the main points of our
  discussion, and their implication for data and file transfer.

  First, CCA appears quite flexible at this stage regarding the manner
  in which Datacomputer is to be used.  Even the Datalanguage (see ref
  5) is flexible and can undergo change.  However, CCA would like some
  changes in the current file transfer protocol and its envisioned use.

  Ideally, CCA would like to see a single full-duplex connection for
  transfer of all control information which is in Datalanguage.  This
  information is generated by a process, which may be a user at a
  console, or a user program.  The ability to intermix data and control
  information would be a definite advantage.  The Datacomputer would
  probably support DTP and allow use of TELNET-ASCII.

  Data may alternatively be sent to or received from a separate
  user-defined port (which may be a socket).  It would be advantageous
  if a user could instruct the Datacomputer to transfer data to or from
  a file in a remote system via FTP (assuming a server-FTP in the remote
  system).  CCA is currently not committed to this idea, but is
  considering it.

  In the CCA view, the Datacomputer represents a data management
  facility with Datalanguage as its command language.  From the
  viewpoint of the Datacomputer as an FTP server, FTP commands would be
  a subset of the Datalanguage.  It is therefore desirable that FTP
  commands be printable ASCII strings instead of numeric codes.

Remote Job Service Requirements

  Initially two separate protocols were proposed for Remote Job Service
  (RJS).  One was the NETRJS protocol (see ref 6) for remote job
  service from large Hosts and the other was the NETRJT Protocol (see
  ref 7) for remote job service from TIPs (and other mini-Hosts).  The
  current thinking, however, is to move towards a single RJS with "as
  much overlap as possible between the methods of dealing with these
  two user populations."  (See ref 8.)  Perhaps inclusion of ASCII
  within DTP would make this feasible.

  The existing proposals for DTP and FTP have been considered somewhat
  less than optimal for RJS needs.  Specific drawbacks of DTP and FTP
  have been pointed out in the handling of data structures and data
  types.  Most of these problems seem relatively easy to resolve.  It
  would involve making Network ASCII the default data type (acceptable
  to all hosts) and providing a way in FTP for proposing and rejecting
  alternative data types and data structures.

  Another inadequacy of FTP (which pertains to other applications as
  well) is in the area of error recovery.  Currently there is no way to
  "restart" transmission if an element in the transmission path fails.
  One solution suggested has involved the use of sequence numbers (see
  ref 9).  A number of other solutions exist to the problem.  These are
  discussed later in the section 'FTP Reconsidered'.

DTP Reconsidered

  The aspiration for DTP was that it would provide a uniform mechanism
  for separating information into its logical structure (records,
  files, and control), and rudimentary error control.  The evaluation
  of DTP and its modes should be on the basis of speed (real-time),
  efficiency (processing cost), reliability (error control and
  recovery), and the ease of its use.

  It is now clear that unless DTP is significantly revised, TIP and
  other mini-Host users will find it difficult to use services based
  on DTP.  Allowing the use of ASCII within DTP, and using
  defaults instead of the "modes available" handshake, would alleviate
  this problem, but compromise the DTP error control function.  Whether
  error control belongs at the DTP level or at a higher level needs
  further discussion.

  DTP, in its present form, does not provide sufficient error control
  and recovery procedures.  To make DTP more useful, either it should
  be simplified (at least from a user viewpoint), or it should be
  extended to include better error control with built-in error
  recovery, and possibly handling of data types and data structures.

  In the simplified version, DTP would only be a format procedure in
  which data could be transmitted as ASCII (no format) with escape to
  an 8-bit transparent (indefinite bit-stream) mode or in data blocks
  (descriptor and count mode).  The choice of which mode to use, and
  all error control, error recovery, and aborts would be handled by the
  higher-level protocol.
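
  The block (descriptor and count) mode of such a simplified DTP can
  be sketched as a simple framing routine.  The layout used below (a
  1-byte descriptor followed by a 2-byte count and then the data) and
  the descriptor values are hypothetical and chosen only for
  illustration; they are not the format specified for DTP in ref 1.

```python
import struct

# Hypothetical header: 1-byte descriptor, 2-byte big-endian count.
HEADER = struct.Struct(">BH")

# Hypothetical descriptor values.
DATA = 0
END_OF_FILE = 1


def pack_block(descriptor: int, data: bytes = b"") -> bytes:
    """Frame one block as header plus data."""
    if len(data) > 0xFFFF:
        raise ValueError("block too long for a 2-byte count")
    return HEADER.pack(descriptor, len(data)) + data


def unpack_blocks(stream: bytes):
    """Split a byte stream back into (descriptor, data) pairs."""
    blocks = []
    offset = 0
    while offset < len(stream):
        if offset + HEADER.size > len(stream):
            raise ValueError("truncated header")
        descriptor, count = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        data = stream[offset:offset + count]
        if len(data) != count:
            raise ValueError("truncated block")
        blocks.append((descriptor, data))
        offset += count
    return blocks


if __name__ == "__main__":
    stream = pack_block(DATA, b"first record") + pack_block(END_OF_FILE)
    print(unpack_blocks(stream))
```

  Because the count delimits each block, end-of-file can be signalled
  by a descriptor value without closing the connection, at the cost of
  the header overhead on every block.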

  The utility of the block mode in data transfer has been questioned by
  many, who suggest that it imposes a large overhead for the simple
  functions of indicating end-of-file and separating data and control
  information.  The alternative data transfer strategy is to
  use separate connections for control and data information and/or
  close and reopen connections.  This causes an overhead of a different
  sort, but has the advantage that the byte size for connection may be
  chosen to optimize data transfer.

  A drawback of present DTP is that it is geared to transfer of 8-bit
  bytes.  Perhaps a good strategy for data transfer would be to allow
  sending data in an agreed upon transfer mode.  The transfer mode
  chosen should determine the byte size for connection, the data type
  chosen, and any data structure information.  This mode may be chosen
  at the DTP level, or at the using protocol level.

FTP Reconsidered

  The aspiration for FTP was that it would facilitate file management
  and file transfer in the ARPANET Virtual File System.  FTP success
  should be evaluated by the extent of its use, convenience and
  efficiency in its use, and its suitability for other applications
  such as Datacomputer, RJS, and Mail.

  Wide use of FTP would be possible if a user could use an FTP-server
  directly without the help of a mediating DTP/FTP-User process.  This
  would require that commands be ASCII strings instead of numeric
  codes, and that ASCII characters be an acceptable input.  Such a
  change in FTP would greatly increase its acceptance at the cost of
  making the server-implementation more complex.  Combined
  implementation, however, would be simplified as the mediating FTP-
  user process (if used at all) would be simplified.
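
  A server that accepts commands as printable ASCII strings needs only
  a small parsing step.  The sketch below assumes a command line made
  of a verb followed by an optional argument; this line format and the
  spellings RETRIEVE and LIST are hypothetical and are not taken from
  any FTP specification.

```python
def parse_command(line: str):
    """Split an ASCII command line into an upper-case verb and its argument."""
    parts = line.strip().split(None, 1)
    if not parts:
        raise ValueError("empty command")
    verb = parts[0].upper()
    argument = parts[1] if len(parts) > 1 else ""
    return verb, argument


if __name__ == "__main__":
    print(parse_command("retrieve report.txt\r\n"))
    print(parse_command("LIST"))
```

  A user at a TIP could type such a line directly, with no mediating
  user-FTP process to translate it into numeric codes.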

  Efficiency of transfer is an important factor affecting the
  usefulness of FTP.  File transfer may be very expensive (in terms of
  CPU time) and slow (in real-time) if an inappropriate transfer
  strategy is used (e.g., inappropriate byte size).  Every attempt
  should be made to optimize transfer of data.  A good strategy may be
  to allow transfer of files over a separate connection or close and
  reopen connections (using perhaps a different byte size).  A problem
  with indicating end-of-file by closing the connection is that it is
  uncertain if the connection was closed because end-of-file was
  reached, or because of a failure or error condition.  Perhaps "NCP
  interrupts" could be used in addition to a "close" to indicate
  definite end-of-file condition.

  A drawback in the present FTP strategy is that it has no restart
  procedure.  One proposal for restart has involved the use of the
  sequence numbers used in DTP block mode.  Our feeling is that perhaps
  restart may best be accomplished by incorporating a command in FTP
  that allows a user to specify the place in the file at which to begin
  retransmission.  A possible solution is to use the "SPF" command
  implemented in the UCSB Simple-Minded File System (see ref 10).
  Another solution may be to have optional arguments for retrieve and
  store commands that allow selective retrieval and replacement
  (specified by bits, characters, words, lines, pages or segments).
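
  The restart idea can be illustrated with a minimal sketch in which
  the user tells the server where to begin retransmission.  It assumes
  the restart point is a byte offset from the start of the file; the
  function names are hypothetical, and this is not a description of
  the "SPF" command.

```python
def remaining_data(file_data: bytes, restart_offset: int) -> bytes:
    """Server side: return the part of the file from restart_offset onward."""
    if not 0 <= restart_offset <= len(file_data):
        raise ValueError("restart offset outside the file")
    return file_data[restart_offset:]


def resume(received_so_far: bytes, file_data: bytes) -> bytes:
    """User side: ask for the rest, starting where the last transfer stopped."""
    return received_so_far + remaining_data(file_data, len(received_so_far))


if __name__ == "__main__":
    original = b"0123456789"
    partial = original[:4]          # transfer failed after 4 bytes
    print(resume(partial, original) == original)
```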




  Another useful addition to FTP would be a protocol procedure between
  user and server to agree on data type, data structure, and mode for
  file transfer.  This would enable the user and server to reach the
  optimum file transfer strategy acceptable to both.
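
  One possible shape for such an agreement procedure is sketched
  below: the user proposes options in order of preference, and the
  first one the server supports is chosen, falling back to a default
  acceptable to all hosts.  The option names are hypothetical; the
  choice of ASCII as the default follows the suggestion made earlier
  in this paper.

```python
DEFAULT_TYPE = "ascii"   # the default suggested as acceptable to all hosts


def agree(user_preferences, server_supported, default=DEFAULT_TYPE):
    """Return the first user preference the server supports, else the default."""
    for option in user_preferences:
        if option in server_supported:
            return option
    return default


if __name__ == "__main__":
    server = {"ascii", "8-bit-binary"}
    print(agree(["36-bit-image", "8-bit-binary"], server))
    print(agree(["36-bit-image"], server))
```

  The same procedure could be applied separately to data type, data
  structure, and transfer mode.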

Concluding Remarks

  We have discussed in this paper what we see as the major problem
  areas in the present DTP and FTP specifications.  We hope this
  discussion will stimulate thinking, so that we can arrive at revised
  specifications for DTP and FTP that satisfy all the diverse needs in
  an elegant manner.

REFERENCES

     1. The Data Transfer Protocol, Bhushan, et al, NWG/RFC #264, NIC
  #7212.

     2. The File Transfer Protocol, Bhushan, et al, NWG/RFC #265, NIC
  #7213.

     3. Data and File Transfer Workshop Announcement, A. Bhushan,
  NWG/RFC #309, NIC #9260.

     4. The Terminal IMP for the ARPA Computer Network, Ornstein, et al,
  SJCC, 1972, NIC #8218.

     5. Datalanguage, Computer Corporation of America, Datacomputer
  Project, Working Paper No.3, October 29, 1971, NIC #8208.

     6. Interim NETRJS Specifications, R. T. Braden, NWG/RFC #189, NIC
  #7133.

     7. NETRJT -- Remote Job Service Protocol for TIPs, R. T. Braden,
  NWG/RFC #283, NIC #8165.

     8. RJS Protocol Meeting Notes, 25 February 1972, A. McKenzie
  (limited distribution).

     9. A Suggested Addition to File Transfer Protocol, A. McKenzie,
  NWG/RFC #281, NIC #8163.

     10. Network Specifications for UCSB's Simple-Minded File System,
  James E. White, NWG/RFC #122, NIC #5834.

       [This RFC was put into machine readable form for entry]
    [into the online RFC archives by Hélène Morin, Viagénie 10/99]