Internet Engineering Task Force (IETF)                     R. Alimi, Ed.
Request for Comments: 6392                                        Google
Category: Informational                                   A. Rahman, Ed.
ISSN: 2070-1721                         InterDigital Communications, LLC
                                                           Y. Yang, Ed.
                                                        Yale University
                                                           October 2011


                A Survey of In-Network Storage Systems

Abstract

  This document surveys deployed and experimental in-network storage
  systems and describes their applicability for the DECADE (DECoupled
  Application Data Enroute) architecture.

Status of This Memo

  This document is not an Internet Standards Track specification; it is
  published for informational purposes.

  This document is a product of the Internet Engineering Task Force
  (IETF).  It represents the consensus of the IETF community.  It has
  received public review and has been approved for publication by the
  Internet Engineering Steering Group (IESG).  Not all documents
  approved by the IESG are a candidate for any level of Internet
  Standard; see Section 2 of RFC 5741.

  Information about the current status of this document, any errata,
  and how to provide feedback on it may be obtained at
  http://www.rfc-editor.org/info/rfc6392.

Copyright Notice

  Copyright (c) 2011 IETF Trust and the persons identified as the
  document authors.  All rights reserved.

  This document is subject to BCP 78 and the IETF Trust's Legal
  Provisions Relating to IETF Documents
  (http://trustee.ietf.org/license-info) in effect on the date of
  publication of this document.  Please review these documents
  carefully, as they describe your rights and restrictions with respect
  to this document.  Code Components extracted from this document must
  include Simplified BSD License text as described in Section 4.e of
  the Trust Legal Provisions and are provided without warranty as
  described in the Simplified BSD License.




Alimi, et al.                 Informational                     [Page 1]

RFC 6392                      DECADE Survey                 October 2011


Table of Contents

  1. Introduction ....................................................3
  2. Survey Overview .................................................3
     2.1. Terminology and Concepts ...................................3
     2.2. Historical Context .........................................3
  3. In-Network Storage System Components ............................5
     3.1. Data Access Interface ......................................5
     3.2. Data Management Operations .................................5
     3.3. Data Search Capability .....................................6
     3.4. Access Control Authorization ...............................6
     3.5. Resource Control Interface .................................6
     3.6. Discovery Mechanism ........................................7
     3.7. Storage Mode ...............................................7
  4. In-Network Storage Systems ......................................7
     4.1. Amazon S3 ..................................................7
     4.2. BranchCache ................................................9
     4.3. Cache-and-Forward Architecture ............................11
     4.4. Cloud Data Management Interface ...........................12
     4.5. Content Delivery Network ..................................14
     4.6. Delay-Tolerant Network ....................................16
     4.7. Named Data Networking .....................................18
     4.8. Network of Information ....................................19
     4.9. Network Traffic Redundancy Elimination ....................22
     4.10. OceanStore ...............................................23
     4.11. P2P Cache ................................................24
     4.12. Photo Sharing ............................................26
     4.13. Usenet ...................................................28
     4.14. Web Cache ................................................29
     4.15. Observations Regarding In-Network Storage Systems ........31
  5. Storage and Other Related Protocols ............................32
     5.1. HTTP ......................................................32
     5.2. iSCSI .....................................................33
     5.3. NFS .......................................................34
     5.4. OAuth .....................................................36
     5.5. WebDAV ....................................................37
     5.6. Observations Regarding Storage and Related Protocols ......39
  6. Conclusions ....................................................40
  7. Security Considerations ........................................40
  8. Contributors ...................................................40
  9. Acknowledgments ................................................41
  10. Informative References ........................................41

1.  Introduction

  DECADE (DECoupled Application Data Enroute) is an architecture that
  provides applications with access to provider-based in-network
  storage for content distribution (hereafter referred to simply as
  "in-network storage").  With access to in-network
  storage, content distribution applications can be designed to place
  less load on network infrastructure.  As a simple example, a peer of
  a Peer-to-Peer (P2P) application may upload to other peers through
  its in-network storage, saving its usage of last-mile uplink
  bandwidth.  See [1] for further discussion.

  A major motivation for DECADE is the substantial increase in capacity
  and reduction in cost offered by storage systems.  For example, over
  the last two decades, there has been at least a 30-fold increase in
  the amount of storage that a customer can get for a given price (for
  flash memory and hard disk drives) [2] [3] [4].

  High-capacity and low-cost in-network storage devices introduce
  substantial opportunities.  One example of in-network storage is
  content caches supporting Web and P2P content.  DECADE differs from
  existing content caches whose control fully resides with the owners
  of the caching devices in that DECADE also allows applications to
  control access to their allocated in-network storage, as well as the
  resources consumed while accessing that storage (bandwidth,
  connections, storage space).  While designed in the context of P2P
  applications, DECADE may be useful to other applications as well.
  This document provides details on deployed and experimental
  in-network storage solutions, and evaluates their suitability for
  DECADE.

  The survey presented in this document is representative of the
  research in this area rather than exhaustive.  We have chosen
  typical techniques that have led to derivative works.

2.  Survey Overview

2.1.  Terminology and Concepts

  This document uses terms defined in [1].

2.2.  Historical Context

  In-network storage has been used previously in numerous scenarios to
  reduce network traffic and enable more efficient content
  distribution.  This section presents a brief history of content
  distribution techniques and illustrates how DECADE relates to past
  approaches.  Systems have been developed with particular use cases in
  mind.  Thus, this survey is not meant to point out shortcomings of
  existing solutions, but rather to indicate where certain capabilities
  required in DECADE [5] are not provided by existing systems.

  In the early stage of Internet development, most Web content was
  stored at a central server, and clients requested Web content from
  the central server.  In this architecture, the central server was
  required to provide a large amount of bandwidth.  As more and more
  users access Web content, a central server can become overloaded.
  The use of Web caches is one technique to reduce load on a central
  server.  Web caches store frequently requested content and provide
  bandwidth for serving the content to clients.

  The ongoing growth of broadband technology in the worldwide market
  has been driven by the hunger of customers for new multimedia
  services as well as Web content.  In particular, the use of audio and
  video streaming formats has become common for delivery of rich
  information to the public, both residential and business.

  Simply installing more Web caches is not enough to meet this demand
  for multimedia content.  Moving content closer
  to the consumer results in greater network efficiency, improved
  Quality of Service (QoS), and lower latency, while facilitating
  personalization of content through broadband content applications.
  Among these edge technologies, Content Delivery Networks (CDNs) are a
  representative technique.  CDNs are based on a large-scale
  distributed network of servers located closer to customers for
  efficient delivery of digital content, including various forms of
  multimedia content.

  Although CDNs are an effective means of information access and
  delivery, there are two barriers to making CDNs a more common
  service: cost and replication integrity.  Deploying a CDN with its
  associated infrastructure is expensive.  A CDN also requires
  administrative control over nodes with large storage capacity at
  geographically dispersed locations with adequate connectivity.  CDNs
  can be scalable, but due to this administrative and cost overhead,
  they are not rapidly deployable for the common user.

  The emergence and maturation of P2P has allowed improvements to many
  network applications.  P2P allows the use of client resources, such
  as CPU, memory, storage, and bandwidth, for serving content.  This
  can reduce the amount of resources required by a content provider.
  Multimedia content delivery using various P2P or peer-assisted
  frameworks has been shown to greatly reduce the dependence on CDNs
  and central content servers.  However, the popularity of P2P
  applications has resulted in increased traffic on ISP networks.  P2P
  caches (both transparent and non-transparent) have been introduced as
  a way to reduce the burden.  Though they can be effective in reducing
  traffic in certain areas of ISP networks, P2P caches have their
  shortcomings.  First, they are application-dependent and thus
  difficult to keep up to date with new and evolving P2P application
  protocols.  Second, applications may benefit from explicit control of
  in-network storage, which P2P caches do not provide.  See [1] for
  further discussion.

  DECADE aims to provide a standard protocol allowing P2P applications
  (including content providers) to make use of in-network storage to
  reduce the traffic burden on ISP networks, while enabling P2P
  applications to control access to content they have placed in
  in-network storage.

3.  In-Network Storage System Components

  Before surveying individual technologies, we describe the basic
  components of in-network storage.  For consistency and for ease of
  comparison, we use the same model to evaluate each storage technology
  in this document.

  Note that the network protocol(s) used by a given storage system are
  also an important part of the design.  We omit details of particular
  protocol choices in this document.

3.1.  Data Access Interface

  A set of operations is made available to a user for accessing data in
  the in-network storage system.  Solutions typically allow both read
  and write operations, though the mechanisms for doing so can differ
  drastically.

3.2.  Data Management Operations

  Storage systems may provide users the ability to manage stored
  content.  For example, operations such as delete and move may be
  provided to users.  In this survey, we focus on data management
  operations that are provided to users and omit those provided to
  system administrators.

3.3.  Data Search Capability

  Some storage systems may provide the capability to search or
  enumerate content that has been stored.  In this survey, we focus on
  search capabilities that are provided to users and omit those
  provided to system administrators.  An example of a search would be
  to find the list of items stored by a given user over a given period
  of time.

3.4.  Access Control Authorization

  Storage systems typically allow a user, content owner, or some other
  entity to define the access policies for the in-network storage
  system.  The in-network storage system then checks the authorization
  of a user before it stores or retrieves content.  We define three
  types of access control authorization: public-unrestricted, public-
  restricted, and private.

  "Public-unrestricted" refers to content on an in-network storage
  system that is widely available to all clients (i.e., without
  restrictions).  An example is accessing Wikipedia on the Web, or
  anonymous access to FTP sites.

  "Public-restricted" refers to content on an in-network storage system
  that is available to a restricted (though still potentially large)
  set of clients, but that does not require any confidential
  credentials from the client.  An example is content (e.g., a TV show
  episode) that is viewable only in selected countries or networks
  (i.e., through white/black lists or black-out areas).

  "Private" refers to content on an in-network storage system that is
  only made available to one or more clients presenting the required
  confidential credentials (e.g., password or key).  This content is
  not available to anyone without the proper confidential access
  credentials.

  Note that a combination of access control types may be applicable for
  a given scenario.  For example, the retrieval (read) of content from
  an in-network storage system may be public-unrestricted, but the
  storage (write) to the same system may be private.
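
  The three authorization types can be illustrated with a minimal
  sketch.  The names used here (AccessMode, Policy, is_authorized, and
  the use of a client region as the non-confidential attribute) are
  hypothetical and chosen only for illustration; they are not taken
  from any surveyed system.

```python
import hmac
from dataclasses import dataclass, field
from enum import Enum


class AccessMode(Enum):
    PUBLIC_UNRESTRICTED = "public-unrestricted"
    PUBLIC_RESTRICTED = "public-restricted"
    PRIVATE = "private"


@dataclass
class Policy:
    mode: AccessMode
    # Assumption: restriction is expressed as a set of allowed regions.
    allowed_regions: set = field(default_factory=set)
    # Assumption: the confidential credential is a shared secret.
    secret: str = ""


def is_authorized(policy, region="", credential=""):
    """Return True if a request satisfies the given policy."""
    if policy.mode is AccessMode.PUBLIC_UNRESTRICTED:
        return True
    if policy.mode is AccessMode.PUBLIC_RESTRICTED:
        # No confidential credential is needed, only membership in the
        # permitted set of clients.
        return region in policy.allowed_regions
    # PRIVATE: a confidential credential must match.
    if not policy.secret:
        return False
    return hmac.compare_digest(credential.encode(), policy.secret.encode())
```

  Under this sketch, the combined scenario described above would attach
  a public-unrestricted policy to read operations and a private policy
  to write operations on the same system.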

3.5.  Resource Control Interface

  This is the interface through which users manage the resources on
  in-network storage systems that can be used by other peers, e.g., the
  bandwidth or connections.  The storage system may also allow users to
  indicate a time for which resources are granted.

3.6.  Discovery Mechanism

  Users use the discovery mechanism to find the location of in-network
  storage, find an access interface or resource control interface, or
  find other interfaces of in-network storage.

3.7.  Storage Mode

  Storage systems may use the following modes of storage: file system,
  object-based, or block-based.

  A file system typically organizes files into a hierarchical tree
  structure.  Each level of the hierarchy normally contains zero or
  more directories, each with zero or more files.  A file system may
  also be flat or use some other organizing principle.

  We define an object-based storage mode as one that stores discrete
  chunks of data (e.g., IP datagrams or another type of aggregation
  useful to an application) without a pre-defined hierarchy or
  meta-structure.

  We define a block-based storage mode as one that stores a raw
  sequence of bytes, with a client being able to read and/or write data
  at offsets within that sequence.  Data is typically accessed in
  blocks for efficiency.  A common example for this storage mode is raw
  access to a hard disk.

  In this survey, we define "storage mode" to refer to how data is
  structured within the system, which may not be the same as how it is
  accessed by a client.  For example, a caching system may cache
  objects with hierarchical names, but may internally use an object-
  based storage mode.
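
  As a rough illustration of the difference between the block-based and
  object-based modes, the following sketch models each with a small
  in-memory class.  The class and method names are hypothetical, and
  real systems add persistence, concurrency control, and error
  handling that are omitted here.

```python
class BlockStore:
    """Block-based mode: a raw byte sequence addressed by offset."""

    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, offset, payload):
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write outside of the byte sequence")
        self.data[offset:offset + len(payload)] = payload

    def read(self, offset, length):
        if offset < 0 or offset + length > len(self.data):
            raise ValueError("read outside of the byte sequence")
        return bytes(self.data[offset:offset + length])


class ObjectStore:
    """Object-based mode: discrete chunks with no imposed hierarchy."""

    def __init__(self):
        self.objects = {}

    def put(self, name, payload):
        # The name is an opaque key.  A name such as "/videos/a/chunk1"
        # looks hierarchical to the client but is stored flat.
        self.objects[name] = bytes(payload)

    def get(self, name):
        return self.objects[name]
```

  The ObjectStore case mirrors the caching example above: names may
  look hierarchical to a client while the system keeps a flat mapping
  internally.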

4.  In-Network Storage Systems

  This section surveys in-network storage systems using the methodology
  defined above.  The survey includes some systems that are widely
  deployed today, some systems that are just being deployed, and some
  experimental systems.  The survey covers both traditional client-
  server architectures and P2P architectures.  The surveyed systems are
  listed in alphabetical order.  Also, for each system, a brief
  explanation of the relevance to DECADE is given.

4.1.  Amazon S3

  Amazon S3 (Simple Storage Service) [6] provides an online storage
  service using Web (HTTP) interfaces.  Users create buckets, and each
  bucket can contain stored objects.  Users are provided an interface
  through which they can manage their buckets.  Amazon S3 is a popular
  backend storage service for other services.  Other related storage
  services are the Blob Service provided by Windows Azure [7], Google
  Storage for Developers [8], and Dropbox [9].

4.1.1.  Applicability to DECADE

  Amazon S3 is a very widely used (deployed) example of in-network
  storage.  Amazon S3 leases the storage to third-party companies for
  disparate services.  In particular, Amazon S3 has a rich model for
  authorization (using signed queries) to integrate with a wide variety
  of use cases.  A focus for Amazon S3 is scalability.  Notable
  simplifications are the absence of a general, hierarchical
  namespace and the inability to update the contents of existing
  data.

4.1.2.  Data Access Interface

  Users can read and write objects.

4.1.3.  Data Management Operations

  Users can delete previously stored objects.

4.1.4.  Data Search Capability

  Users can list contents of buckets to find objects matching desired
  criteria.

4.1.5.  Access Control Authorization

  All methods of access control are supported for clients: public-
  unrestricted, public-restricted, and private.

  For example, access to stored objects can be restricted by an owner,
  a list of other Amazon S3 Web Service users, or all Amazon S3 Web
  Service users; or can be open to all users (anonymous access).
  Another option is for the owner to generate and sign a query (e.g., a
  query to read an object) that can be used by any user until an owner-
  defined expiration time.
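
  The idea of an owner-signed query with an expiration time can be
  sketched generically as follows.  This is not Amazon S3's actual
  signing scheme: the string-to-sign layout, the choice of HMAC-SHA256,
  and the function names are assumptions made for illustration only.

```python
import hashlib
import hmac
import time


def sign_query(secret, method, path, expires):
    """Owner side: sign a query that is valid until `expires` (epoch seconds)."""
    # Assumption: the signed string is method, path, and expiry joined by newlines.
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_query(secret, method, path, expires, signature, now=None):
    """Storage side: accept the query only if unexpired and correctly signed."""
    now = time.time() if now is None else now
    if now > expires:
        return False
    expected = sign_query(secret, method, path, expires)
    return hmac.compare_digest(expected, signature)
```

  Any user holding the signed query can present it until the expiration
  time passes; the user never needs to know the owner's secret.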

4.1.6.  Resource Control Interface

  Not provided.

4.1.7.  Discovery Mechanism

  Users are provided a well-known DNS name (either a default provided
  by Amazon S3, or one customized by a particular user).  Users
  accessing S3 storage use DNS to discover an IP address where S3
  requests can be sent.

4.1.8.  Storage Mode

  Object-based, with the extension that objects can be organized into
  user-defined buckets.

4.2.  BranchCache

  BranchCache [10] is a feature integrated into Windows (Windows 7 and
  Windows Server 2008R2) that aims to optimize enterprise branch office
  file access over WAN links.  The main goals are to reduce WAN link
  utilization and improve application responsiveness by caching and
  sharing content within a branch while still maintaining end-to-end
  security.  BranchCache allows files retrieved from the Web servers
  and file servers located in headquarters or data centers to be cached
  in remote branch offices, and shared among users in the same branch
  accessing the same content.  BranchCache operates transparently by
  instrumenting the HTTP and Server Message Block (SMB) components of
  the networking stack.  It provides two modes of operation:
  Distributed Cache and Hosted Cache.

  In both modes, a client always contacts a BranchCache-enabled content
  server first to get the content identifiers for local search.  If the
  content is cached locally, the client then retrieves the content
  within the branch.  Otherwise, the client will go back to the
  original content server to request the content.  The two modes differ
  in how the content is shared.

  In the Hosted Cache mode, a locally provisioned server acts as a
  cache for files retrieved from the servers.  After getting the
  content identifiers, the client first consults the cache for the
  desired file.  If it is not present in the cache, the client
  retrieves it from the content server and sends it to the cache for
  storage.
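
  The Hosted Cache read path described above can be sketched as
  follows.  The class names, the use of a SHA-256 digest as the content
  identifier, and the in-memory dictionaries are assumptions for
  illustration; the sketch also omits the authorization and encryption
  that BranchCache applies.

```python
import hashlib


class ContentServer:
    """Stand-in for a BranchCache-enabled content server."""

    def __init__(self, files):
        self.files = dict(files)
        self.fetches = 0  # counts full-content transfers over the WAN

    def identifier(self, name):
        # Assumption: the identifier is a digest of the content.
        return hashlib.sha256(self.files[name]).hexdigest()

    def fetch(self, name):
        self.fetches += 1
        return self.files[name]


class HostedCache:
    """Stand-in for the locally provisioned cache server."""

    def __init__(self):
        self.store = {}

    def get(self, content_id):
        return self.store.get(content_id)

    def put(self, content_id, data):
        self.store[content_id] = data


def client_read(name, server, cache):
    # The client always contacts the content server first for identifiers.
    content_id = server.identifier(name)
    data = cache.get(content_id)
    if data is None:
        # Cache miss: retrieve from the content server, then populate the cache.
        data = server.fetch(name)
        cache.put(content_id, data)
    return data
```

  In this sketch, a second client reading the same file within the
  branch is served from the Hosted Cache, and only the small identifier
  exchange crosses the WAN link.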

  In the Distributed Cache mode, a client first queries other clients
  in the same network using the Web Services Discovery multicast
  protocol [11].  As in the Hosted Cache mode, the client retrieves the
  file from the content server if it is not available locally.  After
  retrieving the file (either from another client or the content
  server), the client stores the file locally.

  The original content server always authorizes requests from clients.
  Cached content is encrypted such that clients can decrypt the data
  only using keys derived from metadata returned by the content server.
  In addition to instrumenting the networking stack at clients, content
  servers must also support BranchCache.

4.2.1.  Applicability to DECADE

  BranchCache is an example of an in-network storage system primarily
  targeted at enterprise networks.  It supports a P2P-like mode
  (Distributed Cache) as well as a client-server mode (Hosted Cache).
  Its integration into Windows is expected to give this in-network
  storage technology wide distribution.

4.2.2.  Data Access Interface

  Clients transparently retrieve (read) data from a cache (on a client
  or a Hosted Cache), since BranchCache operates by instrumenting the
  networking stack.  In the Hosted Cache mode, clients write data to
  the Hosted Cache once it is retrieved from the content server.

4.2.3.  Data Management Operations

  Not provided.

4.2.4.  Data Search Capability

  Not provided.

4.2.5.  Access Control Authorization

  The access control method for clients is private.  For example,
  transferred content is encrypted, and can only be decrypted by keys
  derived from data received from the original content server.  Though
  data may be transferred to unauthorized clients, end-to-end security
  is maintained by only allowing authorized clients to decrypt the
  data.

4.2.6.  Resource Control Interface

  The storage capacity of caches on the clients and Hosted Caches is
  configurable by system administrators.  The Hosted Cache further
  allows configuration of the maximum number of simultaneous client
  accesses.  In the Distributed Cache mode, exponential back-off and
  throttling mechanisms are used to prevent reply storms in response to
  requests for popular content.  The client also spreads data-block access
  among multiple serving clients that have the content (complete or
  partial) to improve latency and provide some load balancing.

4.2.7.  Discovery Mechanism

  The Distributed Cache mode uses multicast for discovery of other
  clients and content within a local network.  Currently, the Hosted
  Cache mode uses policy provisioning or manual configuration of the
  server used as the Hosted Cache.  In this mode, the address of the
  server may be found via DNS.

4.2.8.  Storage Mode

  Object-based.

4.3.  Cache-and-Forward Architecture

  Cache-and-Forward (CNF) [12] is an architecture for content delivery
  services for the future Internet.  In this architecture, storage can
  be exploited on nodes within the network, either directly on routers
  or deployed near the routers.  CNF is based on the concept of store-
  and-forward routers with large storage, providing for opportunistic
  delivery to occasionally disconnected mobile users and for in-network
  caching of content.  The proposed CNF protocol uses reliable hop-by-
  hop transfer of large data files between CNF routers in place of an
  end-to-end transport protocol such as TCP.

4.3.1.  Applicability to DECADE

  CNF is an example of an experimental in-network storage system that
  would require storage space on (or near) a large number of routers in
  the Internet if it were deployed.  As the name of the system implies,
  it would provide short-term caching and not long-term network
  storage.

4.3.2.  Data Access Interface

  Users implicitly store content at CNF routers by requesting files.
  End hosts read content from in-network storage by submitting queries
  for content.

4.3.3.  Data Management Operations

  Not provided.

4.3.4.  Data Search Capability

  Not provided.

4.3.5.  Access Control Authorization

  The access control method is public-restricted (to any client that is
  part of the CNF network).

4.3.6.  Resource Control Interface

  Not provided.

4.3.7.  Discovery Mechanism

  A query including a location-independent content ID is sent to the
  network and routed to a CNF router, which handles retrieval of the
  data and forwarding to the end host.

4.3.8.  Storage Mode

  Object-based, with objects representing individual files.  The
  architecture proposes to cache large files in storage within the
  network, though objects could be made to represent smaller chunks of
  larger files.

4.4.  Cloud Data Management Interface

  The Cloud Data Management Interface (CDMI) is a specification to
  access and manage cloud storage.  CDMI is specified by the Storage
  Networking Industry Association (SNIA).

  CDMI is a functional interface that applications can use to create,
  retrieve, update, and delete data elements from the cloud.  As part
  of this interface, the client will be able to discover the
  capabilities of the cloud storage offering and use this interface to
  manage containers and the data that is placed in them.  In addition,
  metadata can be set on containers and their contained data elements
  through this interface [13].

  CDMI follows a traditional client-server model, and operates over an
  HTTP interface using the Representational State Transfer (REST)
  model.  Similar to Amazon S3 buckets (see Section 4.1), users may
  create containers in which data objects may be stored.  Even though
  data objects may be accessed via a user-defined name within a
  container, it is also possible to access data objects via a storage-
  defined Object ID, which is provided in the response upon creation of
  a data object.

4.4.1.  Applicability to DECADE

  CDMI is an important initiative to standardize storage interfaces for
  cloud services, which are rapidly becoming an important type of
  storage service.  In particular, it specifies a set of operations for
  creating, reading, writing, and managing data objects at a remote
  server (or set of servers) via HTTP.

4.4.2.  Data Access Interface

  Users can read and write data objects, and also update data in
  existing data objects.  CDMI data objects are sent on the wire
  embedded as a field in a JavaScript Object Notation (JSON) object.
  The protocol also defines interfaces through which the contents of data
  objects can be read and written via simple HTTP GET/PUT operations.

4.4.3.  Data Management Operations

  Users can delete already-existing data objects.  The create operation
  also supports modes in which the created object is copied or moved
  from an existing data object.

  Data system metadata also allows users to configure policies
  regarding time-to-live, after which a data object is automatically
  deleted, as well as the redundancy with which a data object is
  stored.

4.4.4.  Data Search Capability

  Users may list the contents of containers to locate data objects
  matching any desired criteria.

4.4.5.  Access Control Authorization

  All methods of access control for clients are supported: public-
  unrestricted, public-restricted, and private.

  In particular, CDMI allows access to data objects to be protected by
  Access Control Lists (ACLs) that can allow or restrict access based
  on user name, group, administrative status, or whether a user is
  authenticated or anonymous.

4.4.6.  Resource Control Interface

  CDMI supports attributes 'cdmi_max_latency' and 'cdmi_max_throughput'
  (set at either the level of containers, or a specific data object),
  which control the level of service offered to any users accessing a
  particular data object.

4.4.7.  Discovery Mechanism

  Users are provided a well-known DNS name.  The DNS name is resolved
  to determine the IP address to which requests may be sent.

4.4.8.  Storage Mode

  Object-based, with the extension that objects can be organized into
  user-defined containers.

4.5.  Content Delivery Network

  A CDN provides services that improve performance by minimizing the
  amount of data transmitted through the network, improving
  accessibility, and maintaining correctness through content
  replication.  CDNs offer fast and reliable applications and services
  by distributing content to cache or edge servers located close to
  users.  See [14] for an additional taxonomy and survey.

  A CDN has some combination of content delivery, request routing,
  distribution, and accounting infrastructures.  The content-delivery
  infrastructure consists of a set of edge servers (also called
  surrogates) that deliver copies of content to end users.  The
  request-routing infrastructure is responsible for directing client
  requests to appropriate edge servers.  It also interacts with the
  distribution infrastructure to keep an up-to-date view of the content
  stored in the CDN caches.  The distribution infrastructure moves
  content from the origin server to the CDN edge servers and ensures
  consistency of content in the caches.  The accounting infrastructure
  maintains logs of client accesses and records the usage of the CDN
  servers.  This information is used for traffic reporting and usage-
  based billing.
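
  The interplay of these infrastructures can be sketched in a few
  lines.  The sketch below is a simplification under stated
  assumptions: request routing picks the edge server with the lowest
  estimated latency for the client's region, distribution is modeled as
  a pull from the origin on a cache miss, and accounting is a simple
  access log.  All names are hypothetical and do not describe any
  particular CDN.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeServer:
    name: str
    latency_ms: dict  # client region -> estimated latency (assumed input)
    cache: dict = field(default_factory=dict)


class Cdn:
    def __init__(self, origin, edges):
        self.origin = origin    # url -> content held by the origin server
        self.edges = edges      # content-delivery infrastructure
        self.access_log = []    # accounting infrastructure

    def route(self, region):
        # Request routing: direct the client to the closest edge server.
        return min(self.edges, key=lambda edge: edge.latency_ms[region])

    def deliver(self, region, url):
        edge = self.route(region)
        if url not in edge.cache:
            # Distribution: move content from the origin to the edge server.
            edge.cache[url] = self.origin[url]
        # Accounting: record the access for reporting and billing.
        self.access_log.append((region, url, edge.name))
        return edge.cache[url]
```

  A production CDN would typically also push content proactively,
  invalidate stale copies, and weigh server load when routing; those
  aspects are left out of this sketch.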

  In practice, a CDN typically hosts static content including images,
  video, media clips, advertisements, and other embedded objects for
  Web viewing.  A focus for CDNs is the ability to publish and deliver
  content to end users in a reliable and timely manner.  A CDN focuses
  on building its network infrastructure to provide the following
  services and functionalities: storage and management of content;
  distribution of content among surrogates; cache management; delivery
  of static, dynamic, and streaming content; backup and disaster
  recovery solutions; and monitoring, performance measurement, and
  reporting.

  Examples of existing CDNs are Akamai, Limelight, and CloudFront.

  The following description uses the term "content provider" to refer
  to the entity purchasing a CDN service, and the term "client" to
  refer to the subscriber requesting content via the CDN from the
  content provider.

4.5.1.  Applicability to DECADE

  CDNs are a very widely used (deployed) example of in-network storage
  for multimedia content.  The existence and operation of the storage
  system are totally transparent to the end user.  CDNs typically
  require a strong business relationship between the content providers
  and content distributors, and often the business relationship extends
  to the ISPs.

4.5.2.  Data Access Interface

  A CDN is typically a closed system, and generally provides only a
  read (retrieve) access interface to clients.  A CDN typically does
  not provide a write (store) access interface to clients.  The content
  provider can access network edge servers and store content on them,
  or edge servers can retrieve content from content providers.  Client
  nodes can only retrieve content from edge servers.

4.5.3.  Data Management Operations

  A content provider can manage the data distributed in different cache
  nodes, such as moving popular data objects from one cache node to
  another cache node, or deleting rarely accessed data objects in cache
  nodes.  User nodes, however, have no right to perform these
  operations.

4.5.4.  Data Search Capability

  A content provider can search or enumerate the data each cache node
  stores.  User nodes cannot perform search operations.

4.5.5.  Access Control Authorization

  All methods of access control (for reading) are supported for
  clients: public-unrestricted, public-restricted, and private.  Some
  CDN edge servers allow usage of HTTP basic authentication with the
  origin server or restrictions by IP address, or they can use a token-
  based technique to allow the origin server to apply its own
  authorization criteria.
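
  As an illustration of the token-based technique mentioned above, the
  following minimal sketch shows one way an origin server could issue
  a signed, expiring token that an edge server checks before serving
  content.  The function names, the shared secret, and the
  "expiry.signature" token layout are assumptions made for this
  example; they are not taken from any particular CDN.

```python
import hashlib
import hmac


def make_token(secret: bytes, path: str, expires: int) -> str:
    """Origin side: sign the path and expiry time (hypothetical layout)."""
    msg = f"{path}|{expires}".encode()
    sig = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    return f"{expires}.{sig}"


def check_token(secret: bytes, path: str, token: str, now: int) -> bool:
    """Edge side: accept only an unexpired token with a valid signature."""
    try:
        expires_text, sig = token.split(".", 1)
        expires = int(expires_text)
    except ValueError:
        return False  # malformed token
    if now > expires:
        return False  # token has expired
    msg = f"{path}|{expires}".encode()
    expected = hmac.new(secret, msg, hashlib.sha256).hexdigest()
    # Constant-time comparison of the presented and expected signatures.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

  In this sketch, the edge server needs only the shared secret to
  apply the origin server's decision; it does not need to contact the
  origin server for each request.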

  As mentioned previously, clients typically cannot write to the CDN.
  Writing is typically a private operation for the content providers.

4.5.6.  Resource Control Interface

  Not provided.

4.5.7.  Discovery Mechanism

  Content providers can directly find internal CDN cache nodes to store
  content, since they typically have an explicit business relationship.
  Clients can locate CDN nodes through DNS or other redirection
  mechanisms.

4.5.8.  Storage Mode

  Though the addressing of objects uses URLs that typically refer to
  objects in a hierarchical fashion, the storage mode is typically
  object-based.

4.6.  Delay-Tolerant Network

  The Delay-Tolerant Network (DTN) [15] is an evolution of an
  architecture originally designed for the Interplanetary Internet.
  The Interplanetary Internet is a communication system envisioned to
  provide Internet-like services across interplanetary distances in
  support of deep space exploration.  The DTN architecture can be
  utilized in various operational environments characterized by severe
  communication disruptions, disconnections, and high delays (e.g., a
  month-long loss of connectivity between two planetary networks
  because of high solar radiation due to sun spots).  The DTN
  architecture is thus suitable for environments including deep space
  networks, sensor-based networks, certain satellite networks, and
  underwater acoustic networks.

  A key aspect of the DTN is a store-and-forward overlay layer called
  the "Bundle Protocol" or "Bundle Layer", which exists between the
  transport and application layers [16].  The Bundle Layer forms a
  logical overlay that employs persistent storage to help combat long-
  term network interruptions by providing a store-and-forward service.
  While traditional IP networks are also based on store-and-forward
  principles, the time a packet is kept in "storage" at a traditional
  IP router is typically on the order of milliseconds (or less).  In
  contrast, the DTN architecture assumes that most Bundle
  Layer nodes will use some form of persistent storage (e.g., hard
  disk, flash memory, etc.) for DTN packets because of the nature of
  the DTN environment.

4.6.1.  Applicability to DECADE

  The DTN is an example of an experimental in-network storage system
  that would require fundamental changes to the Internet protocols.

4.6.2.  Data Access Interface

  Users implicitly cause content to be stored (until successfully
  forwarded) at Bundle Layer nodes by initiating/terminating any
  transaction that traverses the DTN.

4.6.3.  Data Management Operations

  Users can implicitly cause deletion of content stored at Bundle Layer
  nodes via a "time-to-live" type of parameter that the user can
  control (for transactions originating from the user).
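
  The implicit storage and deletion behavior described in
  Sections 4.6.2 and 4.6.3 can be sketched as follows.  This is a toy
  model, not the Bundle Protocol itself: the class names, the
  "expires_at" field, and the use of an in-memory list in place of
  persistent storage are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    dest: str          # destination endpoint identifier
    payload: bytes
    expires_at: float  # creation time plus the sender-chosen lifetime


@dataclass
class BundleNode:
    """Hypothetical Bundle Layer node; a list stands in for disk storage."""
    stored: list = field(default_factory=list)

    def receive(self, bundle: Bundle) -> None:
        # Keep the bundle until a link toward its destination is available.
        self.stored.append(bundle)

    def expire(self, now: float) -> int:
        # Drop bundles whose lifetime has elapsed; return how many were dropped.
        before = len(self.stored)
        self.stored = [b for b in self.stored if b.expires_at > now]
        return before - len(self.stored)

    def forward(self, reachable: set, now: float) -> list:
        # Forward (and release) bundles whose destination is reachable now.
        self.expire(now)
        sent = [b for b in self.stored if b.dest in reachable]
        self.stored = [b for b in self.stored if b.dest not in reachable]
        return sent
```

  The user never issues an explicit store or delete request; storage
  follows from sending a bundle, and deletion follows from successful
  forwarding or from the lifetime the user selected.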

4.6.4.  Data Search Capability

  Not provided.

4.6.5.  Access Control Authorization

  The access control method is public-restricted (to any client that is
  part of the DTN) or private.

4.6.6.  Resource Control Interface

  Not provided.

4.6.7.  Discovery Mechanism

  A Uniform Resource Identifier (URI) approach is used as the basis of
  the addressing scheme for DTN transactions (and subsequent store-and-
  forward routing through the DTN network).

4.6.8.  Storage Mode

  Object-based.  DTN applications send data to the Bundle Layer, which
  then breaks the data into segments.  These segments are then routed
  through the DTN network, and stored in Bundle Layer nodes as required
  (before being forwarded).

4.7.  Named Data Networking

  Named Data Networking (NDN) [17] is a research initiative that
  proposes to move to a new model of addressing and routing for the
  Internet.  NDN uses "named data"-based routing and forwarding, to
  replace the current IP-address-based model.  NDN also uses name-based
  data caching in the routers.

  Each NDN Data packet will be assigned a content name and will be
  cryptographically signed.  Data delivery is driven by the requesting
  end.  Routers disseminate name-based prefix announcements by using
  routing protocols such as Intermediate System to Intermediate System
  (IS-IS) or the Border Gateway Protocol (BGP).  The requester will
  send out an "Interest" packet, which identifies the name of the data
  that it wants.  Routers that receive this Interest packet will
  remember the interface it came from and will then forward it based
  on name-based routing.  Once an Interest packet reaches a node that
  has the desired data, a named Data packet is sent back, which
  carries both the name and content of the data, along with a digital
  signature of the producer.  This named Data packet is then forwarded
  back to the original requester on the reverse path of the Interest
  packet [18].

  A key aspect of NDN is that routers have the capability to cache the
  named data.  If a request for the same data (i.e., same name) comes
  to the router, then the NDN router will forward the named data stored
  locally to fulfill the request.  The proponents of NDN believe that
  the network can be designed naturally, matching data delivery
  characteristics instead of communication between endpoints, because
  data delivery has become the primary use of the network.
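
  The Interest/Data exchange and router caching described above can
  be sketched as follows.  This is a simplified model under stated
  assumptions: the class and method names are hypothetical, signature
  verification is omitted, cached data never expires, and a second
  Interest for a name that is already pending is recorded but not
  forwarded again.

```python
class NdnRouter:
    """Toy NDN forwarder with a content store and pending Interest table."""

    def __init__(self, upstream):
        self.content_store = {}   # name -> cached data
        self.pending = {}         # name -> interfaces awaiting the data
        self.upstream = upstream  # callable(name): forwards an Interest on

    def on_interest(self, name, in_face):
        if name in self.content_store:
            # Cache hit: the request is fulfilled from local storage.
            return self.content_store[name]
        first = name not in self.pending
        # Remember the interface the Interest came from.
        self.pending.setdefault(name, set()).add(in_face)
        if first:
            self.upstream(name)
        return None

    def on_data(self, name, data):
        # Cache the named data, then report which interfaces should
        # receive it (the reverse path of the Interest packets).
        self.content_store[name] = data
        return sorted(self.pending.pop(name, set()))
```

  A later Interest for the same name is then answered from the
  content store without contacting the producer.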

4.7.1.  Applicability to DECADE

  NDN is an example of an experimental in-network storage system that
  would require storage space on a large number of routers in the
  Internet.  Named Data packets would be kept in storage in the NDN
  routers and provided to new requesters of the same data.

4.7.2.  Data Access Interface

  Users implicitly store content at NDN routers by requesting content
  (the named Data packets) from the network.  Subsequent requests by
  different users for the same content will cause the named Data
  packets to be read from the NDN routers' in-network storage.

4.7.3.  Data Management Operations

  Users do not have the direct ability to delete content stored in the
  NDN routers.  However, there will be some type of time-to-live
  parameter associated with the named Data packets, though this has not
  yet been specified.

4.7.4.  Data Search Capability

  Not provided.

4.7.5.  Access Control Authorization

  All methods of access control for clients are supported: public-
  unrestricted, public-restricted, and private.

  The basic security mechanism in NDN is for the sender to digitally
  sign the content (the named Data packets) that it sends.  It is
  envisioned that a complete access control system can be built on top
  of NDN, though this has not yet been specified.

4.7.6.  Resource Control Interface

  Not provided.

4.7.7.  Discovery Mechanism

  Names are used as the basis of the addressing and discovery scheme
  for NDN (and subsequent store-and-forward routing through the NDN
  network).  NDN names are assumed to be hierarchical and to be able to
  be deterministically constructed.  This is still an active area of
  research.

4.7.8.  Storage Mode

  Object-based.  NDN sends named Data packets through the network.
  These Data packets are routed through the NDN network and stored in
  NDN routers.

4.8.  Network of Information

  Similar to NDN (see Section 4.7), Network of Information (NetInf)
  [19] is another information-centric approach in which the named data
  objects are the basic component of the networking architecture.
  NetInf is thus moving away from today's host-centric networking
  architecture where the nodes in the network are the primary objects.
  In today's network, the information objects are named relative to the
  hosts they are stored on (e.g.,
  http://www.example.com/information-object.txt).

  The NetInf naming and security framework builds the foundation for an
  information-centric security model that integrates security deeply
  into the architecture.  In this model, trust is based on the
  information itself.  Information objects (IOs) are given a unique
  name with cryptographic properties.  Together with additional
  metadata, the name can be used to verify the data integrity as well
  as several other security properties, such as self-certification,
  name persistency, and owner authentication and identification.  The
  approach also gives some benefits over the security model in today's
  host-centric networks, as it minimizes the need for trust in the
  infrastructure, including the hosts providing the data, the channel,
  or the resolution service.

  In NetInf, the information objects are published into the network.
  They are registered with a Name Resolution Service (NRS).  The NRS is
  also used to register network locators that can be used to retrieve
  data objects that represent the published IOs.  When a receiver wants
  to retrieve an IO, the request for the IO is resolved by the NRS into
  a set of locators.  These locators are then used to retrieve a copy
  of the data object from the "best" available source(s).  NetInf can
  use any type of underlying transport network.  The locators can thus
  be a heterogeneous set (e.g., IPv4, IPv6, or Medium Access Control
  (MAC) addresses).

  NetInf will make extensive use of caching of information objects in
  the network and will provide network functionality that is similar to
  what overlay solutions such as CDNs and P2P distribution networks
  (e.g., BitTorrent) provide today.

4.8.1.  Applicability to DECADE

  NetInf is an example of an experimental information-centric network
  architecture that will require storage space for storage and caching
  of information objects on a large number of NetInf nodes in the
  Internet.

4.8.2.  Data Access Interface

  Users will publish IOs with specific IDs into the network.  This is
  done by the client sending a register message to the NRS stating that
  the IO with the specific ID is available.  When another user wishes
  to retrieve the IO, they will use the given ID to make a request for
  the IO.  The ID is then resolved by the NRS, and the IO is delivered
  from a nearby in-network storage location.
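
  The register and resolve steps above can be sketched as follows.
  The sketch assumes, purely for illustration, that an ID is the
  SHA-256 digest of the object's content, so that a retrieved copy can
  be checked against its name; the actual NetInf naming scheme is not
  specified here, and all class and function names are hypothetical.

```python
import hashlib


def make_id(data: bytes) -> str:
    # Assumption for this sketch: the ID is the SHA-256 digest of the content.
    return hashlib.sha256(data).hexdigest()


class NameResolutionService:
    """Toy NRS mapping an ID to the locators registered for it."""

    def __init__(self):
        self._locators = {}  # ID -> list of locators, in registration order

    def register(self, io_id: str, locator: str) -> None:
        locs = self._locators.setdefault(io_id, [])
        if locator not in locs:
            locs.append(locator)

    def resolve(self, io_id: str) -> list:
        return list(self._locators.get(io_id, []))


def retrieve(nrs, io_id, fetch):
    """Try each locator; accept the first copy whose digest matches the ID."""
    for locator in nrs.resolve(io_id):
        data = fetch(locator)  # fetch is a caller-supplied transport hook
        if data is not None and make_id(data) == io_id:
            return data
    return None
```

  Because the check depends only on the name and the data, a copy can
  be accepted from any storage location without trusting that
  location.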

4.8.3.  Data Management Operations

  Users do not have the direct ability to delete content stored in the
  NetInf nodes.  However, there can be some type of time-to-live
  parameter associated with the information objects, though this has
  not yet been specified.

4.8.4.  Data Search Capability

  Not provided.

4.8.5.  Access Control Authorization

  All methods of access control for clients are supported: public-
  unrestricted, public-restricted, and private.  The basic security
  mechanism in NetInf is for the publisher to digitally sign the
  content of the information object that it publishes.  It is
  envisioned that a complete access control system can be built on top
  of NetInf, though this has not yet been specified.

4.8.6.  Resource Control Interface

  Not provided.

4.8.7.  Discovery Mechanism

  NetInf IDs are used for naming and accessing information objects.
  The IDs are resolved by the NRS into locators that are used for
  routing and transport of data through the transport networks.  This
  is still an active area of research.

4.8.8.  Storage Mode

  Object-based.  From an application perspective, NetInf can be used
  for publishing entire files or chunks of files.  NetInf is agnostic
  to the application perspective and treats everything as information
  objects.

4.9.  Network Traffic Redundancy Elimination

  Redundancy Elimination (RE) is used for identifying and removing
  repeated content from network transfers.  This technique has been
  proposed to improve network performance in many types of networks,
  such as ISP backbones and enterprise access links.  One example of an
  RE proposal is SmartRE [20], proposed by Anand et al., which focuses
  on network-wide RE.  In packet-level RE, forwarding elements are
  equipped with additional storage that can be used to cache data from
  forwarded packets.  Upstream routers may replace packet data with a
  fingerprint that tells a downstream router how to decode and
  reconstruct the packet based on cached data.
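
  The encode and decode steps of packet-level RE can be sketched as
  follows.  This is a deliberately simplified model and not the
  SmartRE design: it fingerprints whole payloads with SHA-256 rather
  than sub-packet regions, uses unbounded caches, and assumes that
  packets arrive at the decoder in order and without loss.  All names
  are hypothetical.

```python
import hashlib


def fingerprint(payload: bytes) -> bytes:
    return hashlib.sha256(payload).digest()


class ReEncoder:
    """Upstream element: replaces already-seen payloads with a fingerprint."""

    def __init__(self):
        self.seen = set()

    def encode(self, payload: bytes):
        fp = fingerprint(payload)
        if fp in self.seen:
            return ("ref", fp)       # send the short fingerprint only
        self.seen.add(fp)
        return ("raw", payload)      # first occurrence: send the data


class ReDecoder:
    """Downstream element: caches payloads and expands fingerprints."""

    def __init__(self):
        self.cache = {}

    def decode(self, packet):
        kind, body = packet
        if kind == "raw":
            self.cache[fingerprint(body)] = body
            return body
        return self.cache[body]      # reconstruct from cached data
```

  In this sketch, a repeated 1000-byte payload crosses the link as a
  32-byte fingerprint, and the downstream element restores the
  original bytes before forwarding.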

4.9.1.  Applicability to DECADE

  RE is an example of an experimental in-network storage system that
  would require a large amount of associated packet processing at
  routers if it were ever deployed.

4.9.2.  Data Access Interface

  RE is typically transparent to the user.  Writing into storage is
  done by transferring data that has not already been cached.  Storage
  is read when users transmit data identical to previously transmitted
  data.

4.9.3.  Data Management Operations

  Not provided.

4.9.4.  Data Search Capability

  Not provided.

4.9.5.  Access Control Authorization

  The access control method is public-restricted (to any client that is
  part of the RE network).  Note that the content provider still
  retains control over which peers receive the requested data.  The
  returned data is "compressed" as it is transferred within the
  network.

4.9.6.  Resource Control Interface

  Not provided.  The content provider still retains control over the
  rate at which packets are sent to a peer.  The packet size within the
  network may be reduced.

4.9.7.  Discovery Mechanism

  No discovery mechanism is necessary.  Routers can use RE without the
  users' knowledge.

4.9.8.  Storage Mode

  Object-based, with "objects" being data from packets transmitted
  within the network.

4.10.  OceanStore

  OceanStore [21] is a storage platform developed at the University of
  California, Berkeley, that provides globally distributed storage.
  OceanStore implements a model where multiple storage providers can
  pool resources together.  Thus, a major focus is on resiliency, self-
  organization, and self-maintenance.

  The protocol is resilient to some storage nodes being compromised by
  utilizing Byzantine agreement and erasure codes to store data at
  primary replicas.

4.10.1.  Applicability to DECADE

  OceanStore is an example of an experimental in-network storage system
  that provides a high degree of network resilience to failure
  scenarios.

4.10.2.  Data Access Interface

  Users may read and write objects.

4.10.3.  Data Management Operations

  Objects may be replaced by newer versions, and multiple versions of
  an object may be maintained.

4.10.4.  Data Search Capability

  Not provided.

4.10.5.  Access Control Authorization

  Provided, but specifics for clients are unclear from the available
  references.

4.10.6.  Resource Control Interface

  Not provided.

4.10.7.  Discovery Mechanism

  Users require an entry point into the system in the form of one
  storage node that is part of OceanStore.  If a hostname is provided,
  the address of a storage node may be determined via DNS.

4.10.8.  Storage Mode

  Object-based.

4.11.  P2P Cache

  Caching of P2P traffic is a useful approach to reduce P2P network
  traffic, because objects in P2P systems are mostly immutable and the
  traffic is highly repetitive.  In addition, P2P caches do not
  require changes to P2P protocols and can be deployed transparently
  to clients.

  P2P caches operate similarly to Web caches (Section 4.14) in that
  they temporarily store frequently requested content.  Requests for
  content already stored in the cache can be served from local storage
  instead of requiring the data to be transmitted over expensive
  network links.
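
  The caching behavior described above can be sketched as a small
  chunk cache.  The least-recently-used eviction policy, the
  (content identifier, chunk index) key, and the "fetch_from_peer"
  hook are assumptions chosen for this example; deployed P2P caches
  may use other policies and keys.

```python
from collections import OrderedDict


class ChunkCache:
    """Toy P2P chunk cache with least-recently-used eviction."""

    def __init__(self, capacity, fetch_from_peer):
        self.capacity = capacity
        # Hypothetical hook: callable(content_id, index) -> bytes
        self.fetch_from_peer = fetch_from_peer
        self.chunks = OrderedDict()

    def get(self, content_id, index):
        """Return (data, served_from_cache)."""
        key = (content_id, index)
        if key in self.chunks:
            self.chunks.move_to_end(key)  # mark as recently used
            return self.chunks[key], True
        # Miss: fetch over the (expensive) network link, then cache.
        data = self.fetch_from_peer(content_id, index)
        self.chunks[key] = data
        if len(self.chunks) > self.capacity:
            self.chunks.popitem(last=False)  # evict the oldest entry
        return data, False
```

  Only a miss results in a call to the remote peer; repeated requests
  for the same chunk are served from local storage.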

  Two types of P2P caches exist: transparent P2P caches and
  non-transparent P2P caches.

  For a transparent cache, once a P2P cache is established, the network
  will transparently redirect P2P traffic to the cache, which either
  serves the file directly or passes the request on to a remote P2P
  user and simultaneously caches that data.  Transparency is typically
  implemented using Deep Packet Inspection (DPI).  DPI products
  identify and pass P2P packets to the P2P caching system so it can
  cache and accelerate the traffic.

  A non-transparent cache appears as a super peer; it explicitly peers
  with other P2P clients.

  To enable operation with existing P2P software, P2P caches directly
  support P2P application protocols.  A large number of P2P protocols
  are used by P2P software and hence are supported by caches, leading
  to higher complexity.  Additionally, these protocols evolve over
  time, and new protocols are introduced.

4.11.1.  Applicability to DECADE

  A P2P cache is an example of in-network storage for P2P systems.
  However, unlike DECADE, the existence and operation of the storage
  system are totally transparent to the end user.

4.11.2.  Transparent P2P Caches

4.11.2.1.  Data Access Interface

  The data access interface allows P2P content to be cached (stored)
  and supplied (retrieved) locally such that network traffic is
  reduced.  The interface is transparent to P2P users, who implicitly
  use it (in the form of their native P2P application protocol) to
  store or retrieve content.

4.11.2.2.  Data Management Operations

  Not provided.

4.11.2.3.  Data Search Capability

  Not provided.

4.11.2.4.  Access Control Authorization

  The access control method is typically public-restricted (to any
  client that is part of the P2P channel or swarm).

4.11.2.5.  Resource Control Interface

  Not provided.

4.11.2.6.  Discovery Mechanism

  The use of DPI means that no discovery mechanism is provided to P2P
  users; it is transparent to P2P users.  Since DPI is used to
  recognize P2P applications' private protocols, P2P cache
  implementations must be updated as new applications are added and
  existing protocols evolve.

4.11.2.7.  Storage Mode

  Object-based.  Chunks (typically, the unit of transfer among P2P
  clients) of content are stored in the cache.

4.11.3.  Non-Transparent P2P Caches

4.11.3.1.  Data Access Interface

  The data access interface allows P2P content to be cached (stored)
  and supplied (retrieved) locally such that network traffic is
  reduced.  P2P users implicitly store and retrieve from the cache
  using the P2P application's native protocol.

4.11.3.2.  Data Management Operations

  Not provided.

4.11.3.3.  Data Search Capability

  Not provided.

4.11.3.4.  Access Control Authorization

  The access control method is typically public-restricted (to any
  client that is part of the P2P channel or swarm).

4.11.3.5.  Resource Control Interface

  Not provided.

4.11.3.6.  Discovery Mechanism

  A P2P cache node behaves as if it were a normal peer in order to join
  the P2P overlay network.  Other P2P users can find such a cache node
  through an overlay routing mechanism and can interact with it as if
  it were a normal neighbor node.

4.11.3.7.  Storage Mode

  Object-based.  Chunks (typically, the unit of transfer among P2P
  clients) of content are stored in the cache.

4.12.  Photo Sharing

  There are a growing number of popular online photo-sharing (storing)
  systems.  For example, the Kodak Gallery system [22] serves over
  60 million users and stores billions of images [23].  Other well-
  known examples of photo-sharing systems include Flickr [24] and
  ImageShack [25].  There are also a number of popular blogging
  services, such as Tumblr [26], that specialize in sharing large
  numbers of photos as well as other multimedia content (e.g., video,
  text, audio, etc.) as part of their service.  All of these in-network
  storage systems utilize both free and paid subscription models.

  Most photo-sharing systems are based on a traditional client-server
  architecture.  However, a minority of systems also offer a P2P mode
  of operation.  The client-server architecture is typically based on
  HTTP, with a browser client and a Web server.

4.12.1.  Applicability to DECADE

  Photo sharing is a very widely used (deployed) example of in-network
  storage where the end user has direct visibility and extensive
  control of the system.  The typical end-user interface is through an
  HTTP-based Web browser.

4.12.2.  Data Access Interface

  Users can read (view) and write (store) photos.

4.12.3.  Data Management Operations

  Users can delete previously stored photos.

4.12.4.  Data Search Capability

  Users can tag photos and/or organize them using sophisticated Web
  photo album generators.  Users can then search for objects (photos)
  matching desired criteria.

4.12.5.  Access Control Authorization

  The access control method for clients is typically either private or
  public-unrestricted.  For example, writing (storing) to a photo blog
  is typically private to the owner of the account.  However, all other
  clients can view (read) the contents of the blog (i.e., public-
  unrestricted).  Some photo-sharing Websites provide private access to
  read photos to allow sharing with a limited set of friends.

4.12.6.  Resource Control Interface

  Not provided.

4.12.7.  Discovery Mechanism

  Clients usually log on manually to a central Web page for the service
  and enter the appropriate information to access the desired
  information.  The address to which the client connects is usually
  determined by DNS using the hostname from the provided URL.

4.12.8.  Storage Mode

  File system (file-based).  Photos are usually stored as files.  They
  can then be organized into meta-structures (e.g., albums, galleries,
  etc.) using sophisticated Web photo album generators.

4.13.  Usenet

  Usenet is a distributed Internet-based discussion (message) system.
  The Usenet messages are arranged as a set of "newsgroups" that are
  classified hierarchically by subject.  Usenet information is
  distributed and stored among a large conglomeration of servers that
  store and forward messages to one another in so-called news feeds.
  Individual users may read messages from, and post messages to, a
  local news server typically operated by an ISP.  This local server
  communicates with other servers and exchanges articles with them.  In
  this fashion, the message is copied from server to server and
  eventually reaches every server in the network [27].
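
  The server-to-server propagation described above can be sketched as
  a simple flood in which each server stores an article once and
  offers it to its peers.  The class and function names are
  hypothetical, and the sketch does not model the actual news transfer
  protocol; it only illustrates how a message identifier stops an
  article from circulating indefinitely.

```python
class NewsServer:
    """Toy news server that stores articles and floods them to peers."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.articles = {}  # message_id -> (newsgroup, text)

    def offer(self, message_id, newsgroup, text):
        if message_id in self.articles:
            return  # already stored; do not propagate again
        self.articles[message_id] = (newsgroup, text)
        for peer in self.peers:
            peer.offer(message_id, newsgroup, text)


def link(a, b):
    """Create a bidirectional news feed between two servers."""
    a.peers.append(b)
    b.peers.append(a)
```

  With three servers linked in a ring, a message posted to one server
  is stored on all three, and the duplicate check ends the flood.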

  Traditional Usenet as described above operates as a P2P network
  between the servers, and in a client-server architecture between the
  user and their local news server.  The user requires a Usenet client
  to be installed on their computer and a Usenet server account
  (through their ISP).  However, with the rise of Web browsers, the
  Usenet architecture is evolving to be Web-based.  The most popular
  example of this is Google Groups, where Google hosts all the
  newsgroups and client access is via a standard HTTP-based Web
  browser [28].

4.13.1.  Applicability to DECADE

  Usenet is a historically very important and widely used (deployed)
  example of in-network storage in the Internet.  The use of this
  system is rapidly declining, but efforts have been made to preserve
  the stored content for historical purposes.

4.13.2.  Data Access Interface

  Users can read and post (store) messages.

4.13.3.  Data Management Operations

  Users sometimes have limited ability to delete messages that they
  previously posted.

4.13.4.  Data Search Capability

  Traditionally, users could manually search through the newsgroups, as
  they are classified hierarchically by subject.  In the newer Web-
  based systems, there is also an automatic search capability based on
  keyword matches.

4.13.5.  Access Control Authorization

  The access control method is either public-unrestricted or private
  (to client members of that newsgroup).

4.13.6.  Resource Control Interface

  Not provided.

4.13.7.  Discovery Mechanism

  Clients usually log on manually to their Usenet accounts.  DNS may be
  used to resolve hostnames to their corresponding addresses.

4.13.8.  Storage Mode

  File system.  Messages are usually stored as files that are then
  organized hierarchically by subject into newsgroups.

4.14.  Web Cache

  Web caches [29] have been widely deployed by many ISPs since the late
  1990s to reduce bandwidth consumption and Web access latency.  A Web
  cache stores Web documents (e.g., HTML pages, images) between server
  and client to reduce bandwidth usage, server load, and perceived
  lag.  A Web cache server is typically shared by many
  clients, and stores copies of documents passing through it;
  subsequent requests may be satisfied from the cache if certain
  conditions are met.

  Another form of cache is a client-side cache, typically implemented
  in Web browsers.  A client-side cache can keep a local copy of all
  pages recently displayed by a browser, and when the user returns to
  one of these Web pages, the local cached copy is reused.

  A related protocol that lets P2P applications use Web caches is HPTP
  (HTTP-based Peer to Peer) [30].  It proposes sharing chunks of P2P
  files/streams using HTTP with cache-control headers.

4.14.1.  Applicability to DECADE

  Web caching is a very widely used (deployed) example of in-network
  storage for the key Internet application of Web browsing.  The
  existence and operation of the storage system are transparent to the
  end user in most cases.  The content caching time is controlled by
  time-to-live parameters associated with the original content.  The
  principle of Web caching is to speed up Web page reading by using
  (the same) content previously requested by another user to service a
  new user.

4.14.2.  Data Access Interface

  Users explicitly read from a Web cache by making requests, but they
  cannot explicitly write data into it.  Data is implicitly stored in
  the Web cache by requesting content that is not already cached and
  meets policy restrictions of the cache provider.

4.14.3.  Data Management Operations

  Not provided.

4.14.4.  Data Search Capability

  Not provided.

4.14.5.  Access Control Authorization

  The access control method for clients is public-unrestricted.  It is
  important to note that if content is authenticated or encrypted
  (e.g., HTTPS, Secure Sockets Layer (SSL)), it will not be cached.
  Also, if the content is flagged as private (vs. public) at the HTTP
  level by the origin server, it will not be cached.

4.14.6.  Resource Control Interface

  Not provided.

4.14.7.  Discovery Mechanism

  Web caches can be transparently deployed between a Web server and Web
  clients, employing DPI for discovery.  Alternatively, Web caches
  could be explicitly discovered by clients using techniques such as
  DNS or manual configuration.

4.14.8.  Storage Mode

  Object-based.  Web content is keyed within the cache by HTTP Request
  fields, such as Method, URI, and Headers.
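
  A minimal sketch of such a cache is shown below.  It keys entries by
  method, URI, and a configurable set of request headers, honors a
  time-to-live, and declines to store content flagged as private (see
  Section 4.14.5).  The class name, the choice of "accept-encoding" as
  the default keyed header, and the restriction to GET requests are
  assumptions made for this example; header names are assumed to be
  lowercase.

```python
class WebCache:
    """Toy shared Web cache keyed by request fields."""

    def __init__(self, vary=("accept-encoding",)):
        self.vary = vary    # request headers that are part of the key
        self.entries = {}   # key -> (body, expires_at)

    def _key(self, method, uri, headers):
        selected = tuple(headers.get(h, "") for h in self.vary)
        return (method, uri, selected)

    def get(self, method, uri, headers, now):
        entry = self.entries.get(self._key(method, uri, headers))
        if entry is None or entry[1] <= now:
            return None     # miss, or the time-to-live has elapsed
        return entry[0]

    def put(self, method, uri, headers, body, ttl, private, now):
        # Store only public GET responses that have a positive lifetime.
        if method != "GET" or private or ttl <= 0:
            return False
        self.entries[self._key(method, uri, headers)] = (body, now + ttl)
        return True
```

  Two requests for the same URI that differ in a keyed header are
  treated as distinct objects, which is why the storage mode is
  described as object-based rather than file-based.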

4.15.  Observations Regarding In-Network Storage Systems

  The following observations about the surveyed in-network storage
  systems are made in the context of DECADE as defined by [1].

  The majority of the surveyed systems were designed for client-server
  architectures and do not support P2P.  However, there are some
  important exceptions, especially for some of the newer technologies
  such as BranchCache and P2P cache, that do support a P2P mode of
  operation.

  The P2P cache systems are interesting, since they do not require
  changes to the P2P applications themselves.  However, this is also a
  limitation in that they are required to support each application
  protocol.

  Many of the surveyed systems were designed for caching as opposed to
  long-term network storage.  Thus, during DECADE protocol design, it
  should be carefully considered whether a caching mode should be
  supported in addition to a long-term network storage mode.  There is
  typically a trade-off between providing a caching mode and long-term
  (and usually also reliable) storage with regard to some performance
  metrics.  Note that [1] identifies issues with classical caching from
  a DECADE perspective, such as the fact that P2P caches typically do
  not allow users to explicitly control content stored in the cache.

  Certain components of the surveyed systems are outside of the scope
  of DECADE.  For example, a protocol used for searching across
  multiple DECADE servers is out of scope.  However, applications may
  still be able to implement such functionality if DECADE exposes the
  appropriate primitives.  This has the benefit of keeping the core
  in-network storage systems simple, while permitting diverse
  applications to design mechanisms that meet their own requirements.

  Today, most in-network storage systems follow some variant of the
  authorization model of public-unrestricted, public-restricted, and
  private.  For DECADE, we may need to evolve the authorization model
  to support resource owner (e.g., end user) authorization in
  addition to network authorization.







5.  Storage and Other Related Protocols

  This section surveys existing storage and other related protocols and
  comments on the use of these protocols to satisfy DECADE's use
  cases.  The surveyed protocols are listed alphabetically.

5.1.  HTTP

  HTTP [31] is a key protocol for the World Wide Web.  It is a
  stateless client-server protocol that allows applications to be
  designed using the REST model.  HTTP is often associated with
  downloading (reading) content from Web servers to Web browsers, but
  it also has support for uploading (writing) content to Web servers.
  It has been used as the underlying protocol for other protocols, such
  as Web Distributed Authoring and Versioning (WebDAV).

  HTTP is used in some of the most popular in-network storage systems
  surveyed previously, including CDNs, photo sharing, and Web cache.
  Usage of HTTP by a storage protocol implies that no extra software is
  required in the client (i.e., Web-based client), as all standard Web
  browsers are based on HTTP.

5.1.1.  Data Access Interface

  Basic read and write operations are supported (using HTTP GET, PUT,
  and POST methods).
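
  The write-then-read pattern can be sketched with the Python standard
  library as follows.  The in-memory `STORE`, the `StorageHandler`
  class, and the `roundtrip` helper are assumptions made for this
  example; the sketch covers only PUT and GET against a throwaway
  server on the loopback interface and is not a storage service.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

STORE = {}  # path -> bytes; stands in for the server's storage


class StorageHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        # Write: store the request body under the request path.
        length = int(self.headers.get("Content-Length", 0))
        created = self.path not in STORE
        STORE[self.path] = self.rfile.read(length)
        self.send_response(201 if created else 204)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # Read: return the stored body, or 404 if nothing was written.
        body = STORE.get(self.path)
        self.send_response(200 if body is not None else 404)
        self.send_header("Content-Length", str(len(body or b"")))
        self.end_headers()
        self.wfile.write(body or b"")

    def log_message(self, *args):
        pass  # keep the example quiet


def _request(port, method, path, body=None):
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path, body=body)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def roundtrip(path, data):
    """PUT `data` to `path` on a local test server, then GET it back."""
    server = HTTPServer(("127.0.0.1", 0), StorageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        put_status, _ = _request(server.server_port, "PUT", path, data)
        get_status, fetched = _request(server.server_port, "GET", path)
        return put_status, get_status, fetched
    finally:
        server.shutdown()
        server.server_close()
```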

5.1.2.  Data Management Operations

  Not provided.

5.1.3.  Data Search Capability

  Not provided.

5.1.4.  Access Control Authorization

  All methods of access control for clients are supported: public-
  unrestricted, public-restricted, and private.

  The majority of Web pages are public-unrestricted in terms of reading
  but do not allow any uploading of content.  In-network storage
  systems that use HTTP range from private or public-unrestricted for
  photo sharing (described in Section 4.12.5) to public-unrestricted
  for Web caching (described in Section 4.14.5).






5.1.5.  Resource Control Interface

  Not provided.

5.1.6.  Discovery Mechanism

  Manual configuration is typically used.  Clients address HTTP
  servers by providing a hostname, which is resolved to an address
  using DNS.

5.1.7.  Storage Mode

  HTTP is a protocol; it thus does not define a storage mode.  However,
  a non-collection resource can typically be thought of as a "file".
  These files may be organized into collections, which typically map
  onto the HTTP path hierarchy, creating the illusion of a file system.

5.1.8.  Comments

  HTTP is based on a client-server architecture and thus is not
  directly applicable to the DECADE focus on P2P.  Also, HTTP offers
  only a rudimentary toolset for storage operations compared to some of
  the other storage protocols.

5.2.  iSCSI

  Small Computer System Interface (SCSI) is a set of protocols enabling
  communication with storage devices such as disk drives and tapes;
  Internet SCSI (iSCSI) [32] is a protocol enabling SCSI commands to be
  sent over TCP.  As in SCSI, iSCSI allows an Initiator to send
  commands to a Target.  These commands operate on the device level as
  opposed to individual data objects stored on the device.

5.2.1.  Data Access Interface

  Read and write commands indicate which data is to be read or written
  by specifying the offset (using Logical Block Addressing) into the
  storage device.  The size of data to be read or written is an
  additional parameter in the command.
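
  For illustration, the Python sketch below packs a SCSI READ(10)
  command descriptor block, which carries a 32-bit logical block
  address and a 16-bit transfer length counted in blocks.  The helper
  names and the 512-byte default block size are assumptions for this
  example; the iSCSI PDU that would carry the command is not shown.

```python
import struct

READ_10 = 0x28  # SCSI READ(10) operation code


def read10_cdb(lba: int, num_blocks: int) -> bytes:
    """Build a 10-byte READ(10) command descriptor block."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("READ(10) carries a 32-bit LBA")
    if not 0 <= num_blocks <= 0xFFFF:
        raise ValueError("READ(10) carries a 16-bit transfer length")
    # opcode, flags, LBA (big-endian), group number, length, control
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, num_blocks, 0)


def byte_range(lba: int, num_blocks: int, block_size: int = 512):
    """Translate a block-addressed request into a byte range on the device."""
    start = lba * block_size
    return start, start + num_blocks * block_size
```

  With these assumptions, reading 8 blocks at LBA 2048 covers bytes
  1048576 up to (but not including) 1052672 of the device.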

5.2.2.  Data Management Operations

  Since commands operate at the device level, management operations
  differ from those of traditional file systems.  Management commands
  for SCSI/iSCSI include explicit device control commands, such as
  starting, stopping, and formatting the device.





5.2.3.  Data Search Capability

  SCSI/iSCSI does not provide the ability to search for particular data
  within a device.  Note that such capabilities can be implemented
  outside of iSCSI.

5.2.4.  Access Control Authorization

  With respect to access to devices, the access control method is
  private.  iSCSI uses the Challenge Handshake Authentication Protocol
  (CHAP) [33] to authenticate Initiators and Targets when accessing
  storage devices.  However, since SCSI/iSCSI operates at the device
  level, neither authentication nor authorization is provided for
  individual data objects.  Note that such capabilities can be
  implemented outside of iSCSI.
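
  The CHAP computation defined in [33] can be sketched in Python as
  follows: the response is an MD5 hash over the one-octet identifier,
  the shared secret, and the challenge.  The function names are
  hypothetical, and the way iSCSI encodes these values during login is
  not shown.

```python
import hashlib
import hmac


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """MD5 over identifier octet || secret || challenge."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def verify(identifier: int, secret: bytes, challenge: bytes,
           response: bytes) -> bool:
    """Authenticator side: recompute the hash and compare."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)
```

  Note that a successful exchange authenticates the peer for access to
  the device as a whole; it says nothing about individual objects
  stored on that device.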

5.2.5.  Resource Control Interface

  Not provided.

5.2.6.  Discovery Mechanism

  Manual configuration may be used.  An alternative is the Internet
  Storage Name Service (iSNS) [34], which provides the ability to
  discover available storage resources.

5.2.7.  Storage Mode

  As a protocol, iSCSI does not explicitly have a storage mode.
  However, SCSI/iSCSI provides an Initiator with block-level
  (block-based) access to the storage device.

5.3.  NFS

  The Network File System (NFS) is designed to allow users to access
  files over a network in a manner similar to how local storage is
  accessed.  NFS is typically used in local area networks or in
  enterprise settings, though changes made in later versions of NFS
  (e.g., [35]) make it easier to operate over the Internet.

5.3.1.  Data Access Interface

  Traditional file-system operations such as read, write, and update
  (overwrite) are provided.  Locking is provided to support concurrent
  access by multiple clients.






5.3.2.  Data Management Operations

  Traditional file-system operations such as move and delete are
  provided.

5.3.3.  Data Search Capability

  The user has the ability to list contents of directories to find
  filenames matching desired criteria.

5.3.4.  Access Control Authorization

  All methods of access control for clients are supported: public-
  unrestricted, public-restricted, and private.  For example, files and
  directories can be protected using read, write, and execute
  permissions for the files' owner and group, and for the public
  (others).  Also, NFSv4.1 has a rich ACL model allowing a list of
  Access Control Entries (ACEs) to be configured for each file or
  directory.  The ACEs can specify per-user read/write access to file
  data, file/directory attributes, creation/deletion of files in a
  directory, etc.
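
  The owner/group/others permission check can be sketched in Python as
  shown below.  This models only the classic mode bits, not NFSv4.1
  ACLs, and the function name and numeric user and group identifiers
  are assumptions made for the example.

```python
import stat

BITS = {"r": 4, "w": 2, "x": 1}


def may_access(mode, owner_uid, owner_gid, uid, gids, want):
    """Check one permission ('r', 'w', or 'x') against the mode bits."""
    if uid == owner_uid:
        shift = 6      # owner class
    elif owner_gid in gids:
        shift = 3      # group class
    else:
        shift = 0      # public (others) class
    return bool((mode >> shift) & BITS[want])


# A regular file readable and writable by its owner, readable by its
# group, and inaccessible to others.
EXAMPLE_MODE = stat.S_IFREG | 0o640
```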

5.3.5.  Resource Control Interface

  Disk space quotas can be configured, typically as an administrative
  policy that limits the total amount of storage allocated to a
  particular user.  User control of bandwidth and connections used by
  remote peers is not provided.

5.3.6.  Discovery Mechanism

  Manual configuration is typically used.  Clients address NFS servers
  by providing a hostname and a directory that should be mounted.  DNS
  may be used to look up an address for the provided hostname.

5.3.7.  Storage Mode

  As a protocol, there is no defined internal storage mode.  However,
  implementations typically use the underlying file-system storage.
  Note that extensions have been defined for alternate storage modes
  (e.g., block-based [36] and object-based [37]).










5.3.8.  Comments

  The efficiency and scalability of the NFS access control method are
  concerns in the context of DECADE.  In particular, Section 6.2.1 of
  [35] states that:

     Only ACEs that have a "who" that matches the requester
     are considered.

  Thus, in the context of DECADE, to specify per-peer access control
  policies for an object, a client would need to explicitly configure
  the ACL for the object for each individual peer.  A concern with this
  approach is scalability when a client's peers may change frequently,
  and ACLs for many small objects need to be updated constantly during
  participation in a swarm.
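
  The scalability concern can be made concrete with a small Python
  sketch.  It is a deliberately simplified model of "who" matching (no
  groups, no special identifiers, and the first matching entry that
  covers the permission decides); the class and function names are
  hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ace:
    who: str          # principal the entry applies to
    allow: bool       # True for ALLOW, False for DENY
    mask: frozenset   # permissions covered, e.g. {"read"}


def is_allowed(acl, requester, permission):
    """Evaluate a request against an ordered list of ACEs."""
    for ace in acl:
        if ace.who != requester:
            continue  # only ACEs whose "who" matches are considered
        if permission in ace.mask:
            return ace.allow
    return False  # no matching entry: deny


def acl_updates(num_objects, peer_changes):
    """ACE writes needed if each peer change touches every object's ACL."""
    return num_objects * peer_changes
```

  Under this model, a client sharing 1000 small objects whose peer set
  changes 50 times would issue 50000 ACL updates.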

  Note that NFSv4.1's usage of RPCSEC_GSS provides support for multiple
  security mechanisms.  Kerberos V5 is required, but others, such as
  X.509 certificates, are also supported by way of the Generic Security
  Service Application Program Interface (GSS-API).  Note, however, that
  NFSv4.1's usage of such security mechanisms is limited to linking a
  requesting user to a particular account maintained by the NFS server.

5.4.  OAuth

  Open Authorization (OAuth) [38] is a protocol that enriches the
  traditional client-server authentication model for Web resources.  In
  particular, OAuth distinguishes the "client" from the "resource
  owner", thus enabling a resource owner to authorize a particular
  client for access (e.g., for a particular lifetime) to private
  resources.

  We include OAuth in this survey so that its authentication model can
  be evaluated in the context of DECADE.  OAuth itself, however, is not
  a network storage protocol.

5.4.1.  Data Access Interface

  Not provided.

5.4.2.  Data Management Operations

  Not provided.

5.4.3.  Data Search Capability

  Not provided.




5.4.4.  Access Control Authorization

  Not provided.  While similar in spirit to the WebDAV ticketing
  extensions [39], OAuth instead uses the following process: (1) a
  client constructs a delegation request, (2) the client forwards the
  request to the resource owner for authorization, (3) the resource
  owner authorizes the request, and finally (4) a callback is made to
  the client indicating that its request has been authorized.

  Once the process is complete, the client has a set of token
  credentials that grant it access to the protected resource.  The
  token credentials may have an expiration time, and they can also be
  revoked by the resource owner at any time.
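
  The delegation steps and the lifetime of the resulting token
  credentials can be modeled with the Python sketch below.  It is an
  in-memory illustration only: the class and method names are
  assumptions, time is passed in explicitly, and the temporary
  credentials and request signatures defined in [38] are omitted.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegationRequest:
    """Step (1): what the client asks the resource owner to approve."""
    client: str
    resource: str
    lifetime: int  # seconds of access requested


class AuthorizationServer:
    def __init__(self):
        self._granted = {}  # token -> (client, resource, expires_at)

    def owner_approves(self, request, now, callback):
        """Steps (3) and (4): record the grant, then call the client back."""
        token = secrets.token_urlsafe(16)
        self._granted[token] = (request.client, request.resource,
                                now + request.lifetime)
        callback(token)

    def revoke(self, token):
        """The resource owner may withdraw access at any time."""
        self._granted.pop(token, None)

    def permits(self, token, client, resource, now):
        entry = self._granted.get(token)
        return (entry is not None
                and entry[:2] == (client, resource)
                and now < entry[2])
```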

5.4.5.  Resource Control Interface

  Not provided.

5.4.6.  Discovery Mechanism

  Not provided.

5.4.7.  Storage Mode

  Not provided.

5.4.8.  Comments

  OAuth's token mechanism requires server involvement, and the
  discussion relating to WebDAV's proposed ticketing mechanism (see
  Section 5.5.8) applies here as well.

5.5.  WebDAV

  WebDAV [40] is a protocol designed for Web content authoring.  It was
  developed as an extension to HTTP (described in Section 5.1), meaning
  that it can be simpler to integrate into existing software.  WebDAV
  supports traditional operations for reading/writing from storage, as
  well as other constructs, such as locking and collections, that are
  important when multiple users collaborate to author or edit a set of
  documents.

5.5.1.  Data Access Interface

  Traditional read and write operations are supported (using HTTP GET
  and PUT methods, respectively).  Locking is provided to support
  concurrent access by multiple clients.




5.5.2.  Data Management Operations

  WebDAV supports traditional file-system operations, such as move,
  delete, and copy.  Objects are organized into collections, and these
  operations can also be performed on collections.  WebDAV also allows
  objects to have user-defined properties.

5.5.3.  Data Search Capability

  The user has the ability to list contents of collections to find
  objects matching desired criteria.  A SEARCH extension [41] has also
  been specified allowing listing of objects matching client-defined
  criteria.

5.5.4.  Access Control Authorization

  All methods of access control for clients are supported: public-
  unrestricted, public-restricted, and private.

  For example, an ACL extension [42] is provided for WebDAV.  ACLs
  allow both user-based and group-based access control policies
  (relating to reading, writing, properties, locking, etc.) to be
  defined for objects and collections.

  A ticketing extension [39] has also been proposed, but has not
  progressed since 2001.  This extension allows a client to request the
  WebDAV server to create a "ticket" (e.g., for reading an object) that
  can be distributed to other clients.  Tickets may be given expiration
  times, or may only allow for a fixed number of uses.  The proposed
  extension requires the server to generate tickets and maintain state
  for outstanding tickets.
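
  The per-ticket state that such a server would hold can be sketched
  in Python as follows.  This is a hypothetical in-memory model, not
  the message format proposed in [39]; the class and method names are
  assumptions, and time is passed in explicitly.

```python
import secrets


class TicketServer:
    def __init__(self):
        # ticket -> [resource, expires_at or None, uses_left or None]
        self._tickets = {}

    def issue(self, resource, expires_at=None, max_uses=None):
        """Generate a ticket that can be handed to other clients."""
        if max_uses is not None and max_uses < 1:
            raise ValueError("max_uses must be at least 1")
        ticket = secrets.token_urlsafe(16)
        self._tickets[ticket] = [resource, expires_at, max_uses]
        return ticket

    def redeem(self, ticket, resource, now):
        """Return True if the ticket currently grants access to resource."""
        state = self._tickets.get(ticket)
        if state is None or state[0] != resource:
            return False
        if state[1] is not None and now >= state[1]:
            del self._tickets[ticket]  # expired
            return False
        if state[2] is not None:
            state[2] -= 1
            if state[2] == 0:
                del self._tickets[ticket]  # last permitted use
        return True

    def outstanding(self):
        """Number of tickets for which the server still holds state."""
        return len(self._tickets)
```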

5.5.5.  Resource Control Interface

  An extension [43] allows disk space quotas to be configured for
  collections.  The extension also allows WebDAV clients to query
  current disk space usage.  User control of bandwidth and connections
  used by remote peers is not provided.
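
  A client-side query for the two quota properties defined in [43]
  (DAV:quota-available-bytes and DAV:quota-used-bytes) might look like
  the Python sketch below.  The helper names are assumptions; the
  sketch only builds a PROPFIND body and parses a multistatus reply,
  and it does not send anything over the network.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"


def quota_propfind_body() -> bytes:
    """Build a PROPFIND request body asking for the quota properties."""
    ET.register_namespace("D", DAV)
    propfind = ET.Element(f"{{{DAV}}}propfind")
    prop = ET.SubElement(propfind, f"{{{DAV}}}prop")
    ET.SubElement(prop, f"{{{DAV}}}quota-available-bytes")
    ET.SubElement(prop, f"{{{DAV}}}quota-used-bytes")
    return ET.tostring(propfind, encoding="utf-8", xml_declaration=True)


def parse_quota(multistatus_xml: bytes):
    """Return (available_bytes, used_bytes); None where a value is absent."""
    root = ET.fromstring(multistatus_xml)

    def value(name):
        elem = root.find(f".//{{{DAV}}}{name}")
        return int(elem.text) if elem is not None and elem.text else None

    return value("quota-available-bytes"), value("quota-used-bytes")
```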

5.5.6.  Discovery Mechanism

  Manual configuration is typically used.  Clients address WebDAV
  servers by providing a hostname, which can be resolved to an address
  using DNS.







5.5.7.  Storage Mode

  Though no storage mode is explicitly defined, WebDAV can be thought
  of as providing file system (file-based) storage to a client.  A
  non-collection resource can typically be thought of as a "file".
  Files may be organized into collections, which typically map onto the
  HTTP path hierarchy.

5.5.8.  Comments

  The efficiency and scalability of the WebDAV access control method
  are concerns in the context of DECADE, for reasons similar to those
  stated in Section 5.3.8 for NFS.  The proposed WebDAV ticketing
  extension partially alleviates these concerns, but the particular
  technique may need further evaluation before being applied to DECADE.
  In particular, since DECADE clients may continuously upload/download
  a large number of small-size objects, and a single DECADE server may
  need to scale to many concurrent DECADE clients, requiring the server
  to maintain ticket state and generate tickets may not be the best
  design choice.  Server-generated tickets can also increase latency
  for data transport operations, depending on the message flow used by
  DECADE.

5.6.  Observations Regarding Storage and Related Protocols

  The following observations about the surveyed storage and related
  protocols are made in the context of DECADE as defined by [1].

  All of the surveyed protocols were primarily designed for client-
  server architectures and not for P2P.  However, it is conceivable
  that some of the protocols could be adapted to work in a P2P
  architecture.

  Several popular in-network storage systems today use HTTP as their
  key protocol, even though it is not classically considered a
  storage protocol.  HTTP is a stateless protocol that is used to
  design RESTful applications.  HTTP is a well-supported and widely
  implemented protocol that can provide important insights for DECADE.

  The majority of the surveyed protocols do not support low-latency
  access for applications such as live streaming.  Low-latency access
  is one of the key general requirements for DECADE.

  The majority of the surveyed protocols do not support any form of
  resource control interface.  Resource control is required for users
  to manage the resources on in-network storage systems, e.g., the
  bandwidth or connections, that can be used by other peers.  Resource
  control is a key capability required for DECADE.



  Nearly all surveyed protocols do, however, support the following
  capabilities required for DECADE: ability of the user to read/write
  content, some form of access control, some form of error indication,
  and the ability to traverse firewalls and NATs.

6.  Conclusions

  Though there have been many successful in-network storage systems,
  they have been designed for use cases different from those defined in
  DECADE.  For example, many of the surveyed in-network storage systems
  and protocols were designed for client-server architectures and not
  P2P.  No surveyed system or protocol has the functionality and
  features to fully meet the set of requirements defined for DECADE.
  DECADE aims to provide a standard protocol for P2P applications and
  content providers to access and control in-network storage, resulting
  in increased network efficiency while retaining control over content
  shared with peers.  Additionally, defining a standard protocol can
  reduce the complexity of in-network storage, since multiple P2P
  application protocols no longer need to be implemented by in-network
  storage systems.

7.  Security Considerations

  This document is a survey of existing in-network storage systems, and
  does not introduce any security considerations beyond those of the
  surveyed systems.

  For more information on security considerations of DECADE, see [1].

8.  Contributors

  The editors would like to thank the following people for contributing
  to the development of this document:

  - ZhiHui Lv

  - Borje Ohlman

  - Pang Tao

  - Lucy Yong

  - Juan Carlos Zuniga








9.  Acknowledgments

  The editors would like to thank the following people for providing
  valuable comments on various draft versions of this document: David
  Bryan, Tao Mao, Haibin Song, Ove Strandberg, Yu-Shun Wang, Richard
  Woundy, Yunfei Zhang, and Ning Zong.

10.  Informative References

  [1]   Song, H., Zong, N., Yang, Y., and R. Alimi, "DECoupled
        Application Data Enroute (DECADE) Problem Statement", Work
        in Progress, October 2011.

  [2]   Storage Search, "Flash Memory vs. Hard Disk Drives -- Which
        Will Win?", <http://www.storagesearch.com/semico-art1.html>.

  [3]   Brisken, W., "Hard Drive Price Trends", US VLBI Technical
        Meeting, May 2008.

  [4]   Woundy, R., "TSV P2P Efforts -- From an ISP's Perspective",
        IETF 81, Quebec, Canada, July 2011,
        <http://www.ietf.org/proceedings/81/slides/tsvarea-3.pdf>.

  [5]   Gu, Y., Bryan, D., Yang, Y., and R. Alimi, "DECADE
        Requirements", Work in Progress, September 2011.

  [6]   Amazon Web Services, "Amazon Simple Storage Service
        (Amazon S3)", <http://aws.amazon.com/s3/>.

  [7]   Calder, B., Wang, T., Mainali, S., and J. Wu, "Windows Azure
        Blob -- Programming Blob Storage", May 2009,
        <http://www.microsoft.com/windowsazure/whitepapers/>.

  [8]   Google, "Google Storage for Developers",
        <http://code.google.com/apis/storage>.

  [9]   Dropbox, "Dropbox Features", <http://www.dropbox.com/features>.

  [10]  Microsoft Corporation, "BranchCache",
        <http://technet.microsoft.com/en-us/network/dd425028.aspx>.

  [11]  Microsoft Corporation, "Web Services Dynamic Discovery
        (WS-Discovery)", April 2005, <http://specs.xmlsoap.org/
        ws/2005/04/discovery/ws-discovery.pdf>.







  [12]  Paul, S., Yates, R., Raychaudhuri, D., and J. Kurose, "The
        Cache-and-Forward Network Architecture for Efficient Mobile
        Content Delivery Services in the Future Internet", Innovations
        in NGN: Future Network and Services, 2008.

  [13]  SNIA, "Cloud Data Management Interface (CDMI)",
        <http://www.snia.org/cdmi>.

  [14]  Pathan, A.K. and Buyya, R., "A Taxonomy and Survey of Content
        Delivery Networks", Grid Computing and Distributed Systems
        Laboratory, University of Melbourne, Technical Report,
        February 2007.

  [15]  Cerf, V., Burleigh, S., Hooke, A., Torgerson, L., Durst, R.,
        Scott, K., Fall, K., and H. Weiss, "Delay-Tolerant Networking
        Architecture", RFC 4838, April 2007.

  [16]  Scott, K. and S. Burleigh, "Bundle Protocol Specification",
        RFC 5050, November 2007.

  [17]  Named Data Networking, "Named Data Networking Home Page",
        <http://www.named-data.net/>.

  [18]  Named Data Networking, "Named Data Networking (NDN) Project",
        <http://www.named-data.net/ndn-proj.pdf>.

  [19]  Network of Information, "NetInf Overview",
        <http://www.netinf.org/home/overview/>.

  [20]  Anand, A., Sekar, V., and A. Akella, "SmartRE: An Architecture
        for Coordinated Network-wide Redundancy Elimination",
        SIGCOMM 2009.

  [21]  Rhea, S., Eaton, P., Geels, D., Weatherspoon, H., Zhao, B., and
        J. Kubiatowicz, "Pond: the OceanStore Prototype", FAST 2003.

  [22]  Kodak, "Kodak Gallery Home Page",
        <http://www.kodakgallery.com/gallery/welcome.jsp>.

  [23]  Wikipedia, "Kodak Gallery",
        <http://en.wikipedia.org/wiki/Kodak_Gallery>.

  [24]  Flickr, "Flickr Home Page", <http://www.flickr.com>.

  [25]  ImageShack, "ImageShack Home Page", <http://imageshack.us>.

  [26]  Tumblr, "Tumblr Home Page", <http://www.tumblr.com>.




  [27]  Wikipedia, "Usenet", <http://en.wikipedia.org/wiki/Usenet>.

  [28]  Google, "Google Groups", <http://groups.google.com>.

  [29]  Huston, G., Telstra, "Web Caching", The Internet Protocol
        Journal Volume 2, No. 3.

  [30]  Shen, G., Wang, Y., Xiong, Y., Zhao, B., and Z-L. Zhang, "HPTP:
        Relieving the Tension between ISPs and P2P", 6th International
        Workshop on Peer-To-Peer Systems (IPTPS2007).

  [31]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
        Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol --
        HTTP/1.1", RFC 2616, June 1999.

  [32]  Satran, J., Meth, K., Sapuntzakis, C., Chadalapaka, M., and E.
        Zeidner, "Internet Small Computer Systems Interface (iSCSI)",
        RFC 3720, April 2004.

  [33]  Simpson, W., "PPP Challenge Handshake Authentication Protocol
        (CHAP)", RFC 1994, August 1996.

  [34]  Tseng, J., Gibbons, K., Travostino, F., Du Laney, C., and J.
        Souza, "Internet Storage Name Service (iSNS)", RFC 4171,
        September 2005.

  [35]  Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed., "Network
        File System (NFS) Version 4 Minor Version 1 Protocol",
        RFC 5661, January 2010.

  [36]  Black, D., Fridella, S., and J. Glasgow, "Parallel NFS (pNFS)
        Block/Volume Layout", RFC 5663, January 2010.

  [37]  Halevy, B., Welch, B., and J. Zelenka, "Object-Based Parallel
        NFS (pNFS) Operations", RFC 5664, January 2010.

  [38]  Hammer-Lahav, E., Ed., "The OAuth 1.0 Protocol", RFC 5849,
        April 2010.

  [39]  Ito, K., "Ticket-Based Access Control Extension to WebDAV",
        Work in Progress, October 2001.

  [40]  Dusseault, L., Ed., "HTTP Extensions for Web Distributed
        Authoring and Versioning (WebDAV)", RFC 4918, June 2007.

  [41]  Reschke, J., Ed., Reddy, S., Davis, J., and A. Babich, "Web
        Distributed Authoring and Versioning (WebDAV) SEARCH",
        RFC 5323, November 2008.



  [42]  Clemm, G., Reschke, J., Sedlar, E., and J. Whitehead, "Web
        Distributed Authoring and Versioning (WebDAV)
        Access Control Protocol", RFC 3744, May 2004.

  [43]  Korver, B. and L. Dusseault, "Quota and Size Properties
        for Distributed Authoring and Versioning (DAV) Collections",
        RFC 4331, February 2006.

Authors' Addresses

  Richard Alimi (editor)
  Google

  EMail: [email protected]


  Akbar Rahman (editor)
  InterDigital Communications, LLC

  EMail: [email protected]


  Yang Richard Yang (editor)
  Yale University

  EMail: [email protected]
























