This page last changed on Sep 03, 2010 by bbranan.

This diagram depicts the interactions likely to occur when DuraCloud is used as a backup system for data stored in an institutional repository. The repository serves only as a concrete example; it could be replaced with any other software system that writes data to the file system.

DuraCloud Interactions

There are three types of interactions illustrated here:

  1. Moving files between the file system and DuraCloud
    • This file transfer can be accomplished in several ways:
      • Using the DuraCloud-provided Sync Tool, which can be pointed at a set of directories; it uploads all files beneath them, then continues to monitor those directories and uploads any changes
      • Using the DuraCloud-provided Java client to write a program that handles the file transfer
      • Using the DuraCloud REST APIs to transfer files with command line tools, scripts, or programs in other languages
  2. Applications (in this case a repository system) using content stored in DuraCloud as well as DuraCloud services
    • As noted above, the DuraCloud REST APIs and Java clients are available to be used by any application. This provides opportunities for those applications to directly take advantage of the capabilities and services provided by DuraCloud.
  3. Administering DuraCloud
    • Administrators perform several activities to tailor their DuraCloud instance to the needs of their institution:
      • Run services: Not all of the services provided by DuraCloud will be useful to all organizations, but there are likely to be at least a few which provide a great benefit. Administrators will select, configure, and run those services
      • Add metadata and tags: DuraCloud provides for storing metadata and tags along with content and spaces. They can be added by hand through the UI or in an automated fashion using the REST API or Java client
      • Administer users: DuraCloud allows administrators to add, remove, and update the users who are allowed to access non-public content
      • Run backup and restore jobs: Administrators will be able to add and retrieve files as needed to support local operations
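As a rough illustration of the REST-based transfer option listed above, the following sketch builds (but does not send) an HTTP PUT request of the kind a script might use to upload a single file. The host name, the `/durastore/{space}/{content}` path layout, the use of HTTP Basic authentication, and the helper name `build_upload_request` are all assumptions made for illustration; the published DuraCloud REST API documentation is the authority on the actual resource paths and headers.

```python
import base64
import urllib.parse
import urllib.request


def build_upload_request(base_url, space_id, content_id, data, username, password):
    """Build (but do not send) a PUT request that uploads `data` as one content item."""
    # Percent-encode the identifiers so characters such as spaces are URL-safe.
    # "/" is left intact in the content ID on the assumption that content IDs
    # may mirror file system paths.
    path = "/".join([
        "durastore",  # assumed path prefix of the storage REST API
        urllib.parse.quote(space_id, safe=""),
        urllib.parse.quote(content_id, safe="/"),
    ])
    request = urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{path}",
        data=data,
        method="PUT",
    )
    # HTTP Basic authentication is an assumption of this sketch.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", "application/octet-stream")
    return request


request = build_upload_request(
    "https://duracloud.example.org",  # hypothetical host
    "repository-backup",              # hypothetical space
    "objects/item 1.xml",             # hypothetical content ID
    b"<item/>",
    "backup-user",
    "secret",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual transfer; the sketch stops short of that so it can be run without a DuraCloud instance.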

This diagram depicts an example Primary User Compute Instance (Primary UCI: gray box), and four underlying storage providers.

Primary User Compute Instance (Primary UCI)

Once a user creates a DuraCloud account, a new Primary UCI is started on their behalf and initialized based on the account configuration.
Most of the interaction with DuraCloud will be through the Primary UCI.

Fundamentally, the Primary UCI consists of a set of web applications that expose three general capabilities:

  1. storage management
  2. service management
  3. a graphical web interface providing convenient access to the features in 1 and 2

The Primary UCI is assigned a public IP address, which can be mapped to a user-owned domain name or a domain name provided by DuraCloud.
Through this base URL and the published storage and service REST APIs, the user can bypass the Administration UI (DurAdmin) and interact directly with the Storage Management and Service Management components.

The Services component and supporting storage providers are addressed below.


This diagram takes a closer look at the storage management component (DuraStore) and DurAdmin's interaction with it.

DuraStore

DuraStore is a web application deployed on the Primary UCI which is responsible for mediating the interaction with content stored in any one or more of the underlying storage providers.
It exposes these content management functions via a public-facing REST API.
Such functions include, but are not limited to:

  • content upload / download / deletion
  • metadata and tag creation / deletion
  • space* creation / deletion
  • space* visibility (public or private)

*A Space within the context of DuraCloud is a container within which content is stored.

Below the REST API is a Storage Mediation layer that translates requests into specific calls for the appropriate Storage Provider adapter implementation.
One of the value propositions of DuraCloud is its ability to abstract away the details and idiosyncrasies of the varied underlying storage provider APIs.
For each of the supported storage providers (currently: Amazon S3, Rackspace CloudFiles, and EMC Atmos) we have created adapters that implement a common Storage Provider Interface.
In this way, new storage providers can be dropped into the architecture by implementing the same interface.
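The adapter arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the DuraCloud implementation: the class and method names (`StorageProvider`, `InMemoryProvider`, `StorageMediator`) are hypothetical, and an in-memory dictionary stands in for a real cloud storage API.

```python
from abc import ABC, abstractmethod


class StorageProvider(ABC):
    """Hypothetical stand-in for the common Storage Provider Interface."""

    @abstractmethod
    def create_space(self, space_id): ...

    @abstractmethod
    def add_content(self, space_id, content_id, data): ...

    @abstractmethod
    def get_content(self, space_id, content_id): ...

    @abstractmethod
    def delete_content(self, space_id, content_id): ...


class InMemoryProvider(StorageProvider):
    """Toy adapter that keeps content in a dict instead of calling a cloud API."""

    def __init__(self):
        self._spaces = {}

    def create_space(self, space_id):
        self._spaces.setdefault(space_id, {})

    def add_content(self, space_id, content_id, data):
        self._spaces[space_id][content_id] = data

    def get_content(self, space_id, content_id):
        return self._spaces[space_id][content_id]

    def delete_content(self, space_id, content_id):
        del self._spaces[space_id][content_id]


class StorageMediator:
    """Routes each request to the adapter registered under a store ID."""

    def __init__(self, providers, default_store_id):
        self._providers = providers
        self._default = default_store_id

    def provider(self, store_id=None):
        # Fall back to the default provider when no store ID is given.
        return self._providers[store_id or self._default]


mediator = StorageMediator(
    {"primary": InMemoryProvider(), "secondary": InMemoryProvider()},
    default_store_id="primary",
)
mediator.provider().create_space("backup")
mediator.provider().add_content("backup", "a.txt", b"hello")
print(mediator.provider().get_content("backup", "a.txt"))
```

Because callers only ever see the shared interface, supporting another provider in this sketch means adding one more subclass and registering it with the mediator.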

DurAdmin

DurAdmin is shown here to highlight the fact that its interaction with DuraStore happens exclusively through the StoreClient (provided with the application distribution), a Java wrapper over the storage REST API.


This diagram is the services counterpart to the storage management diagram above.

DuraService

DuraService is a web application deployed on the Primary UCI which is responsible for mediating the interactions with deployed or available user services.
It exposes these service management functions via a public-facing REST API.
Such functions include, but are not limited to:

  • service deployment / undeployment
  • runtime service configuration
  • service property listings

The notion of a service in the DuraCloud context can be defined as any compute activity that can be started, stopped, configured, and monitored as an implementation of the Compute Service Interface.
DuraService passes service management requests to the Service Manager, which has access to:

  • available services and their configuration options from the DuraStore Service Repository
  • deployed services along with their current configuration and properties from Services Admin

Service Registry

DuraCloud maintains a Service Registry that contains a set of versioned Service Packages available for runtime deployment.
These Service Packages consist of OSGi bundles and support files that can be community-provided as well as developed by DuraSpace.
The Service Config is an XML document that acts as a table of contents for available services and their associated configuration options.
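To make the "table of contents" role of the Service Config more concrete, the sketch below parses a small XML document and lists each service with its configuration options. The XML layout shown (element and attribute names, service IDs, and option values) is entirely hypothetical, since the real Service Config schema is not reproduced in this document.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the real Service Config schema is not shown here.
SERVICE_CONFIG = """
<services>
  <service id="replication" version="1.0">
    <option name="targetStore" default="secondary"/>
  </service>
  <service id="image-conversion" version="1.0">
    <option name="format" default="jp2"/>
    <option name="colorSpace" default="source"/>
  </service>
</services>
"""


def list_services(xml_text):
    """Return {service id: {option name: default value}} from the config XML."""
    root = ET.fromstring(xml_text)
    return {
        service.get("id"): {
            option.get("name"): option.get("default")
            for option in service.findall("option")
        }
        for service in root.findall("service")
    }


catalog = list_services(SERVICE_CONFIG)
for service_id, options in catalog.items():
    print(service_id, sorted(options))
```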

OSGi Container: Services Admin

As an implementation note, the need to dynamically start, stop, and reconfigure services at runtime without performing application restarts drove the decision to employ an OSGi Container as DuraCloud's service hosting environment.
Although the OSGi Container is depicted as a single component deployed in the Primary UCI, if it were advantageous for a particular service to be run on a Managed UCI, a separate OSGi Container could just as well be deployed on another compute instance.

Services Admin and each of the depicted Service elements are independent OSGi bundles.
All requests from DuraService for starting, configuring, etc., services pass through Services Admin.
Such requests are then mediated to the appropriate service within the OSGi Container via the common Compute Service Interface.
Even though the management of any given service requires a representative OSGi bundle that exports an implementation of the Compute Service Interface, a variety of actual service flavors is supported.
Such service implementations include:

  • web applications (e.g. Adore-Djatoka)
  • command line utilities (e.g. ImageMagick)
  • pure Java (e.g. Replication Service)
  • external compute instances
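The relationship between Services Admin and the Compute Service Interface might be sketched as follows. This is an illustration only: it uses plain Python classes rather than OSGi bundles, and the names `ComputeService`, `EchoService`, and `ServicesAdmin.request` are hypothetical. The four operations mirror the start, stop, configure, and monitor activities named in the definition of a service above.

```python
from abc import ABC, abstractmethod


class ComputeService(ABC):
    """Hypothetical rendering of the Compute Service Interface."""

    @abstractmethod
    def start(self): ...

    @abstractmethod
    def stop(self): ...

    @abstractmethod
    def configure(self, config): ...

    @abstractmethod
    def describe(self): ...


class EchoService(ComputeService):
    """Toy pure-code service used only to exercise the interface."""

    def __init__(self):
        self._running = False
        self._config = {}

    def start(self):
        self._running = True

    def stop(self):
        self._running = False

    def configure(self, config):
        self._config.update(config)

    def describe(self):
        # Report run state together with the current configuration.
        return {"status": "STARTED" if self._running else "STOPPED", **self._config}


class ServicesAdmin:
    """Single entry point that forwards management requests to a named service."""

    def __init__(self):
        self._services = {}

    def install(self, service_id, service):
        self._services[service_id] = service

    def request(self, service_id, action, *args):
        # Only the four management operations of the interface are exposed.
        if action not in ("start", "stop", "configure", "describe"):
            raise ValueError(f"unsupported action: {action}")
        return getattr(self._services[service_id], action)(*args)


admin = ServicesAdmin()
admin.install("echo", EchoService())
admin.request("echo", "configure", {"prefix": ">> "})
admin.request("echo", "start")
print(admin.request("echo", "describe"))
```

A wrapper around a web application, a command line utility, or an external compute instance would implement the same four methods, which is what lets a single administrative entry point manage all of them uniformly.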

DurAdmin

As in the previous diagram, DurAdmin is shown here to highlight the fact that its interaction with DuraService happens exclusively through the ServiceClient (which is provided with the application distribution), a Java wrapper over the services REST API.


This diagram pulls the perspective back up to a higher level from the previous two, depicting the compute instance deployment topology. Note that this is a planned architecture which has not yet been deployed.

DuraCloud consists of the following compute instance types:

  1. DuraCloud Instance
    • The public's initial point of contact with DuraCloud
    • Single compute instance managed by DuraSpace
    • Supports user account creation / management
    • Starts and manages Primary UCI upon user account creation via Compute Manager component
  2. Primary User Compute Instance (UCI)
    • User-facing instance
    • Exposes DuraService and DuraStore functionality via
      • graphical web UI (DurAdmin)
      • respective REST APIs
    • Hosts local OSGi Container and locally deployed services
    • Starts and manages
      • Managed UCIs required for resource intensive or distributed services
      • Pre-Configured UCIs
  3. Managed UCI
    • Compute instance that is solely used in support of services needing dedicated compute resources
    • Hosts the same DuraCloud services that can be deployed on a Primary UCI (described in DuraCloud Service Architecture above)
    • May or may not have public URL endpoints
  4. Pre-Configured UCI
    • A compute instance that is based on a pre-configured server image
    • Hosted applications are limited only by the operating system of the base image
    • Allows for maximum flexibility in service application
    • DuraCloud has reduced ability for granular monitoring of application status

This diagram illustrates the multiple levels of security provided by DuraCloud.

Instance Firewall
  • The first line of defense for DuraCloud is the firewall surrounding the Primary UCI. This firewall is constructed to allow connections only through the standard HTTP (80) and HTTPS (443) ports. The requests coming in via HTTP are redirected to HTTPS to ensure secure transport.
Transport Security
  • As noted above, all communication with DuraCloud is via HTTPS, meaning that all information that goes in or comes out of DuraCloud is encrypted for transport. This ensures that transmissions into or out of DuraCloud can be read only by the intended recipient.
DuraCloud Application Security
  • The DuraCloud application itself controls access to content to ensure that only registered users (who must be added by an Administrator), with the proper role, can perform tasks (such as downloading content).
Storage Provider Access Control
  • DuraCloud uses the access control mechanisms of the underlying storage providers to lock down access, ensuring that all actions involving content must occur through the DuraCloud applications. This allows DuraCloud to control access to content as described above without concerns about what might be occurring in the underlying providers.
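The first three layers described above can be approximated in a few lines of code. The sketch below is illustrative only: the role names, the permission table, and the function names are hypothetical, and the fourth layer (storage provider access control) is not modeled because it lives in the providers themselves.

```python
from urllib.parse import urlsplit, urlunsplit

ALLOWED_PORTS = {80, 443}  # ports the instance firewall lets through

# Hypothetical role-to-permission table; real role names may differ.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "manage-users"},
    "user": {"read", "write"},
}


def firewall_allows(port):
    """Layer 1: accept connections only on the HTTP and HTTPS ports."""
    return port in ALLOWED_PORTS


def redirect_target(url):
    """Layer 2: return the HTTPS URL an HTTP request is redirected to, else None."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None
    # Using the bare host name drops any explicit ":80" from the original URL.
    return urlunsplit(("https", parts.hostname, parts.path, parts.query, parts.fragment))


def is_authorized(users, username, action, space_is_public=False):
    """Layer 3: registered users need a role that grants the requested action."""
    if action == "read" and space_is_public:
        return True  # public content can be read without an account
    role = users.get(username)
    return action in ROLE_PERMISSIONS.get(role, set())


users = {"alice": "admin", "bob": "user"}  # accounts added by an administrator
print(firewall_allows(22))
print(redirect_target("http://duracloud.example.org:80/durastore/backup?x=1"))
print(is_authorized(users, "bob", "manage-users"))
print(is_authorized(users, "mallory", "read"))
```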

This diagram illustrates the planned service registry architecture. A service registry is a storage location which holds a set of DuraCloud service packages. The DuraCloud software can obtain information from the registry in order to display the list of services available from that registry, and allow for the deployment of the services contained therein. This diagram indicates the plan to have three distinct service registries for slightly different purposes:

Private Service Registry
  • A private service registry will be where DuraCloud account holders place their personal services. These services will likely have been created by or for the user to fulfill specific content processing needs. No other DuraCloud users will have access to services located in a private service registry. As illustrated in the diagram, these services will be stored in a user's personal storage area.
Public Service Registry
  • The Public Service Registry will be where services from private service registries can be published to make them available for others to use. The intention is to allow these services to be offered either for free or for a fee, with any fee set by the creator of the service. While these services will be available to all, their use will be on an "at your own risk" basis. The services in the Public Service Registry will be stored in a centralized DuraCloud storage location.
Verified Service Registry
  • Services in the Verified Service Registry are those which have been submitted to and verified by the DuraCloud committer team. These services are tested to ensure that they function properly within the DuraCloud system and perform the actions they are intended to perform. They, like the services in the Public Service Registry, will likely be available either for free or for a fee, as determined by the service creator. These services will also be stored in a centralized DuraCloud storage location.

Document generated by Confluence on Apr 27, 2011 14:55