TOOLKITS, WEB-SERVICES and Commercial GRID

OGSA-DAI Architecture:

Grid Data Services (GDS): an instance of a database service; it provides access to the stored data through a standard port type.

Grid Data Service Factories (GDSF): a factory that generates GDS instances, i.e. it creates a data service for a storage system according to the associated database model.

Grid Data Transport Vehicle (GDTV): an abstraction layer for bulk data transfer.

Grid Data Service Registries (GDSR): these allow GDSs and GDSFs to register themselves, so that client code can query a GDSR to find suitable data sources (GDSs or GDSFs).

Grid Data Metadata: metadata about the data model and the query languages supported by a GDS (e.g. XQuery).

Figure: OGSA-DAI architecture
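
Taken together, these components imply a standard interaction pattern: a client queries a registry for a factory, asks the factory to create a data service, and then sends queries to that service. Below is a minimal Python sketch of this pattern; all class and method names are illustrative stand-ins (real OGSA-DAI is a Java framework driven by XML "perform" documents), not the actual API.

```python
# Toy sketch of the OGSA-DAI interaction pattern described above.
# All names are illustrative; the real OGSA-DAI API differs.

class GridDataService:
    """A GDS: wraps one database and answers queries (e.g. SQL/XQuery)."""
    def __init__(self, database):
        self.database = database

    def perform(self, query):
        # In real OGSA-DAI the request is an XML "perform" document.
        return self.database.execute(query)

class GridDataServiceFactory:
    """A GDSF: creates GDS instances for its associated data resource."""
    def __init__(self, database):
        self.database = database

    def create_service(self):
        return GridDataService(self.database)

class GridDataServiceRegistry:
    """A GDSR: lets factories register so clients can discover them."""
    def __init__(self):
        self.factories = {}

    def register(self, name, factory):
        self.factories[name] = factory

    def find(self, name):
        return self.factories[name]

# Client side: discover a factory, create a GDS, run a query.
# registry = GridDataServiceRegistry()
# registry.register("proteins", GridDataServiceFactory(protein_db))
# gds = registry.find("proteins").create_service()
# rows = gds.perform("SELECT * FROM interactions")
```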

Another essential part is resource sharing within a virtual organization and between real organizations, together with recording of usage, for example (a sketch of such a usage record follows the list):

  • number of bytes crossing network gateways or boundaries
  • byte-seconds of storage
  • CPU-hours
  • records in databases
  • uses of licensed software.
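
Usage recording of this kind boils down to accumulating per-consumer counters for each metered quantity. A minimal sketch, with field names invented for illustration (not any standard usage-record schema):

```python
from dataclasses import dataclass

@dataclass
class UsageRecord:
    """Accumulates the metered quantities listed above for one consumer."""
    consumer: str
    gateway_bytes: int = 0       # bytes crossing network boundaries
    byte_seconds: float = 0.0    # storage occupancy integrated over time
    cpu_hours: float = 0.0
    db_records: int = 0
    license_uses: int = 0

    def charge_storage(self, num_bytes, seconds):
        self.byte_seconds += num_bytes * seconds

# record = UsageRecord("vo-alice")
# record.charge_storage(10**9, 3600)  # 1 GB held for one hour
# record.cpu_hours += 2.5
```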

Background technologies on which we build to define the Open Grid Services Architecture.

The Globus Toolkit: Community based, open architecture, open-source set of services and software libraries that supports Grids and Grid applications. The toolkit addresses issues of security, information discovery, resource management, data management, communication, fault detection and portability.

The toolkit components that are most relevant to OGSA are the Grid Resource Allocation and Management (GRAM) protocol and its ‘gatekeeper’ service, which provides for secure, reliable service creation and management [18]; the Meta Directory Service (MDS-2), which provides for information discovery through soft-state registration, data modeling, and a local registry (‘GRAM reporter’); and the Grid Security Infrastructure (GSI), which supports single sign-on, delegation, and credential mapping. These components provide the essential elements of a service-oriented architecture, but with less generality than is achieved in OGSA.

The GRAM protocol provides for the reliable, secure, remote creation and management of arbitrary computations: what this article terms transient service instances. GSI mechanisms are used for authentication, authorization, and credential delegation to remote computations. A two-phase commit protocol is used for reliable invocation, based on techniques used in the Condor system. Service creation is handled by a small, trusted ‘gatekeeper’ process (termed a factory in this article), while a GRAM reporter monitors and publishes information about the identity and state of local computations (the registry role).
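
Reduced to its roles, the GRAM pattern pairs a small trusted factory with a registry of job state. The following rough Python sketch uses hypothetical names throughout; the real GRAM protocol involves GSI credential delegation, job managers, and a two-phase commit rather than these stand-ins:

```python
class GramReporter:
    """Registry role: publishes identity and state of local computations."""
    def __init__(self):
        self.jobs = {}

    def report(self, job_id, state):
        self.jobs[job_id] = state

class Gatekeeper:
    """Factory role: a small trusted process that creates computations."""
    def __init__(self, reporter):
        self.reporter = reporter
        self.next_id = 0

    def create_job(self, user, command):
        # In real GRAM, GSI authenticates the user and delegates
        # credentials before any computation starts on their behalf.
        self.next_id += 1
        job_id = f"job-{self.next_id}"
        self.reporter.report(job_id, {"user": user, "command": command,
                                      "state": "pending"})
        return job_id

# reporter = GramReporter()
# gatekeeper = Gatekeeper(reporter)
# jid = gatekeeper.create_job("alice", "simulate --steps 1000")
# reporter.jobs[jid]["state"]  # 'pending'
```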

MDS-2 [19] provides a uniform framework for discovering and accessing system configuration and status information such as compute server configuration, network status, or the locations of replicated datasets (what we term a discovery interface in this chapter). MDS-2 uses a soft-state protocol, the Grid Notification Protocol, for lifetime management of published information.
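
The essence of soft-state registration is that published information carries a lifetime and is discarded unless the publisher refreshes it. A toy Python illustration of that idea (not the actual Grid Notification Protocol):

```python
import time

class SoftStateRegistry:
    """Entries expire unless the publisher refreshes them in time."""
    def __init__(self):
        self.entries = {}  # name -> (value, expiry_time)

    def publish(self, name, value, ttl_seconds):
        # Publishers re-send this periodically to keep the entry alive.
        self.entries[name] = (value, time.time() + ttl_seconds)

    def lookup(self, name):
        value, expiry = self.entries.get(name, (None, 0))
        if time.time() > expiry:
            # Stale entry: the publisher stopped refreshing (e.g. it
            # crashed), so the registry forgets it automatically.
            self.entries.pop(name, None)
            return None
        return value

# registry = SoftStateRegistry()
# registry.publish("compute-server-7", {"cpus": 16, "load": 0.3},
#                  ttl_seconds=60)
```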

_______________________________________________________________

Grid Web Services and application factories:

Factory Service: a factory service is a secure, stateless, persistent service that knows how to create an instance of a transient, possibly stateful, service.

  • The user contacts factory service through a secure Web portal or a direct secure connection from a factory service client.
  • The factory service authenticates the user and checks the access control list to authorize the user to run the simulation service.
  • The factory service starts instances of a data provider and a simulation component; these components may need to contact the factory service to consult resource selectors and workload managers.
  • In this model, the factory service would now need to obtain a proxy certificate from the user to start the computations on the user’s behalf. However, this delegation is unnecessary if the resource providers trust the factory service and allow the computations to be executed under the service owner’s identity.
  • Access to this distributed application is then passed from the factory service back to the client. The easiest way to do this is to view the entire distributed application instance as a transient, stateful Web service that belongs to the client.
  • The factory service is now ready to interact with another client (a minimal sketch of this interaction follows).
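
A minimal Python sketch of the interaction just described, with all names hypothetical (a real factory service such as XCAT's uses Grid security and Web service protocols rather than these stand-ins):

```python
class DataProvider:
    pass

class SimulationComponent:
    def __init__(self, provider):
        self.provider = provider

class ApplicationInstance:
    """The whole distributed application, viewed as one transient,
    stateful service that belongs to the client."""
    def __init__(self, owner, components):
        self.owner = owner
        self.components = components

class FactoryService:
    """Stateless, persistent service that creates transient instances."""
    def __init__(self, access_control_list):
        self.acl = access_control_list

    def create_application(self, user):
        # Step 2: authenticate/authorize against the access control list.
        if user not in self.acl:
            raise PermissionError(f"{user} may not run the simulation")
        # Step 3: start the components of the distributed application.
        provider = DataProvider()
        simulation = SimulationComponent(provider)
        # Step 5: hand the instance back to the client; the factory
        # itself keeps no state and is free for the next client.
        return ApplicationInstance(owner=user,
                                   components=[provider, simulation])

# factory = FactoryService(access_control_list={"alice"})
# app = factory.create_application("alice")
```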

Dynamically created, connected sets of component instances represent the distributed application that has been invoked on behalf of the end users. Here, the database query component provides a Web service interface to users; when invoked by a user, it consults the database and contacts the analysis component. The analysis component, on receiving the data, analyzes it and returns the result to the user.

In the above case the components are connected in a chain. The primary advantage of following the CCA model over using a WSFL engine is that the WSFL engine must mediate each step of the application sequence, relaying results from one service to the next; this is probably not a good approach when a heavy amount of data must be transferred between components.

Each application is described by three documents:

The Static Application Information is an XML document that describes the list of components used in the computation, how they are to be connected, and the ports of the ensemble that are to be exported as application ports.

The Dynamic Application Information is another XML document that describes the bindings of component instances to specific hosts and other initialization data.

The Component Static Information is an XML document that contains basic information about the component and all the details of its execution environment for each host on which it has been deployed. This is the information that is necessary for the application coordinator component to create a running instance of the component. The usual way to obtain this document is through a call to a simple directory service, called the Component Browser, which allows a user to browse components by type name or to search for components by other attributes such as the port types they support.
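
To make this concrete, here is a hypothetical fragment of Static Application Information together with code that reads it. The element names are invented for illustration and are not XCAT's actual schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical Static Application Information: components, connections,
# and exported application ports (element names are illustrative only).
STATIC_APP_INFO = """
<application name="query-and-analyze">
  <component id="db-query" type="DatabaseQuery"/>
  <component id="analysis" type="AnalysisEngine"/>
  <connection from="db-query.resultsPort" to="analysis.inputPort"/>
  <exportPort component="db-query" port="userPort"/>
</application>
"""

root = ET.fromstring(STATIC_APP_INFO)
components = [(c.get("id"), c.get("type")) for c in root.iter("component")]
connections = [(c.get("from"), c.get("to")) for c in root.iter("connection")]
print(components)   # [('db-query', 'DatabaseQuery'), ('analysis', 'AnalysisEngine')]
print(connections)  # [('db-query.resultsPort', 'analysis.inputPort')]
```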

The application factory service described here provides a Web service model for launching distributed Grid applications. Security in XCAT is based on Secure Sockets Layer (SSL) and Public Key Infrastructure (PKI) certificates.

____________________________________________________________

From Legion to Avaki:

Grid Architecture Requirements

A Grid system, also called a Grid, gathers resources – desktop and handheld hosts, devices with embedded processing resources such as digital cameras and phones or tera-scale supercomputers – and makes them accessible to users and applications in order to reduce overhead and to accelerate projects. A Grid application can be defined as an application that operates in a Grid environment or is ‘on’ a Grid system. Grid system software (or middleware) is software that facilitates writing Grid applications and manages the underlying Grid infrastructure.

  • Security
  • Global Namespace
  • Fault Tolerance
  • Accommodating heterogeneity
  • Binary management
  • Multilanguage Support
  • Scalability
  • Persistence
  • Extensibility
  • Site autonomy
  • Complexity Management

Legion Principles :

  • Provide a single-system view: the system provides the abstraction, or illusion, that our LAN is a single computing resource. To create this illusion we need to draw a veil over the heterogeneous architectures of the underlying systems.
  • Provide flexible semantics: a rigid system design can never be a solution for a Grid architecture. Legion provides default implementations of the system's core objects, and also provides extensible, replaceable components.
  • By default, users should not have to think: there are four classes of users: end users of applications, application developers, system administrators, and resource owners. The belief is that users should be able to focus on their jobs without knowing the underlying architecture of the Grid.
  • Reduce activation energy: if a service or component is not needed it should not be used, and users should not have to pay for that service.
  • Do not change host operating systems or network interfaces, and do not require privileged mode of execution.

Legion Deployment

(LSF: Load Sharing Facility; SGE: Sun Grid Engine)

Figure: Legion Deployment

  • Single Administrative domain
  • Federation of multiple administrative domains

Legion Data Grid: two basic concepts are needed to understand the Legion Data Grid:

  • How the data is accessed
  • How the data is included into grid

Data Access: data can be accessed in three ways:

1) Through a Legion-aware NFS server called a Data Access Point

2) Through a set of command-line utilities

3) Through I/O libraries

Data Inclusion: data can be included into the Grid in three ways:

1) Copy mechanism, i.e. the file is copied into the Grid

2) Container mechanism, i.e. the copy is placed inside a container

3) Share mechanism, i.e. the data remains on the original machine (a toy illustration of the three mechanisms follows).
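
The difference between the three inclusion mechanisms is essentially where the bytes live once the file is visible under a Grid path. A toy Python model (hypothetical API, not Legion's actual commands):

```python
class GridNamespace:
    """Toy model: maps Grid paths to where the data actually lives."""
    def __init__(self):
        self.entries = {}

    def include_by_copy(self, grid_path, local_bytes):
        # Copy mechanism: the Grid holds its own copy of the data.
        self.entries[grid_path] = ("copy", bytes(local_bytes))

    def include_in_container(self, grid_path, container, local_bytes):
        # Container mechanism: the copy is stored inside a container object.
        container[grid_path] = bytes(local_bytes)
        self.entries[grid_path] = ("container", container)

    def include_by_share(self, grid_path, host, local_path):
        # Share mechanism: only a reference; data stays on the original host.
        self.entries[grid_path] = ("share", (host, local_path))

# ns = GridNamespace()
# ns.include_by_share("/grid/data/genome.dat", "lab-host-3",
#                     "/home/a/genome.dat")
```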

Distributed Processing:

The features needed in distributed processing are:

Automated resource matching and file staging: a Legion Grid user executes an application by referencing file and application names; matching the job to resources and staging the files happen automatically. Secure access is enforced through administrative controls and predefined policies.

Support for legacy applications

Batch processing: queues and scheduling

Automatic failure detection and recovery:

Legion provides fast, transparent recovery from outages: in the event of an outage, processing and data requests are rerouted to other locations, ensuring continuous operation.

Systems can be reconfigured dynamically: if a computing resource must be taken offline for routine maintenance, processing continues using other resources.

Legion migrates jobs and files as needed: if a job’s execution host is unavailable or cannot be restarted, the job is automatically migrated to another host and restarted.
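
The migration behaviour can be pictured as reassigning the jobs of a failed host to the survivors and restarting them. A deliberately simplified sketch, not Legion's actual scheduler:

```python
def migrate_jobs(jobs, hosts, failed_host):
    """Reassign jobs from a failed host to surviving hosts and restart them."""
    survivors = [h for h in hosts if h != failed_host]
    if not survivors:
        raise RuntimeError("no hosts available for recovery")
    for i, job in enumerate(j for j in jobs if j["host"] == failed_host):
        job["host"] = survivors[i % len(survivors)]  # simple round-robin
        job["state"] = "restarted"
    return jobs

# jobs = [{"id": 1, "host": "nodeA", "state": "running"},
#         {"id": 2, "host": "nodeB", "state": "running"}]
# migrate_jobs(jobs, ["nodeA", "nodeB", "nodeC"], failed_host="nodeA")
```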

Emerging Standards

At the Global Grid Forum, an Open Grid Services Architecture (OGSA) was proposed by IBM and the Globus PIs. The proposed architecture has many similarities to the Avaki architecture. One example of this congruence is that all objects (called Grid services in OGSA) have a name, an interface, a way to discover the interface, metadata, and state, and are created by factories (analogous to Avaki class objects). The primary differences in the core architecture lie in the RPC model, the naming scheme, and the security model.

The RPC model differences are of implementation, not of substance. This is a difference that Avaki intends to address by becoming Web Services compliant, that is, by supporting XML/SOAP (Simple Object Access Protocol) and WSDL (Web Services Description Language).

Differences between OGSA and Legion

Both define two low-level naming schemes: an immutable, location-independent name (the Grid Service Handle (GSH) in OGSA, the LOID in Avaki) and a lower-level ‘address’ (a WSDL Grid Service Reference (GSR) in OGSA, an OA in Avaki).

OGSA names carry no security information at all, requiring the use of an alternative mechanism to bind name and identity; and binding resolvers in OGSA are currently location- and protocol-specific, severely reducing the flexibility of the name-resolving process (a minimal resolver illustration follows).
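
The two-level scheme separates a permanent name from a current address, with a resolver in between. A minimal illustration (hypothetical; real GSH resolution goes through OGSA handle-resolver services, and Avaki binds LOIDs to OAs):

```python
class HandleResolver:
    """Maps immutable handles (GSH/LOID) to current addresses (GSR/OA)."""
    def __init__(self):
        self.table = {}

    def bind(self, handle, address):
        # The address may change over a service's lifetime (migration,
        # restart); the handle never does.
        self.table[handle] = address

    def resolve(self, handle):
        return self.table[handle]

# resolver = HandleResolver()
# resolver.bind("gsh://grid.example.org/svc/42", "https://host-a:8443/svc")
# resolver.bind("gsh://grid.example.org/svc/42", "https://host-b:8443/svc")
# resolver.resolve("gsh://grid.example.org/svc/42")  # current address
```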

Avaki has proposed the Secure Grid Naming Protocol (SGNP) to the Global Grid Forum (GGF) as an open standard for naming in Grids. SGNP fits quite well with OGSA, and we are actively working with IBM and others within the GGF working group process to find the best solution for naming.

Avaki is a complete, implemented system with a wealth of services. OGSA is a core architectural proposal – not a complete system.

__________________________________________________________

CONDOR and the GRID:

CONDOR is a multifaceted project engaged in five primary activities.

Research in distributed computing:

  1. Harnessing the power of opportunistic and dedicated resources.
  2. Job management services.
  3. Fabric management services.
  4. Resource discovery, monitoring, and management.
  5. Problem-solving environments.
  6. Distributed I/O technology.

Participation in the scientific community: Condor participates in national and international Grid research, development, and deployment efforts. Participants include the Grid Physics Network (GriPhyN), the International Virtual Data Grid Laboratory (iVDGL), and the NASA Information Power Grid (IPG).

Engineering of Complex Software

Maintenance of production environments: the Condor project is also responsible for the Condor installation in the Computer Science Department at the University of Wisconsin-Madison, which consists of over 1000 CPUs. This installation is also a major compute resource for the Alliance Partners for Advanced Computational Servers (PACS). As such, it delivers compute cycles to scientists across the nation who have been granted computational resources by the National Science Foundation.

Education of students: another aim of the Condor project is to train students and prepare them to become computer scientists.

Condor in Grid middleware

The key to lasting system design is to outline structures first in terms of responsibility rather than expected functionality. The apparent complexity that results preserves the independence of each component: one component may be updated with more complex policies and mechanisms without harming another.

The Condor project will also continue to grow. The project is home to a variety of systems research ventures in addition to the flagship Condor software. These include the Bypass toolkit, the ClassAd resource management language, the Hawkeye cluster management system, the NeST storage appliance, and the Public Key Infrastructure Lab. In these and other ventures, the project seeks to gain the hard but valuable experience of nurturing research concepts into production software. To this end, the project is a key player in collaborations such as the National Middleware Initiative (NMI) that aim to harden and disseminate research systems as stable tools for end users. The project will continue to train students, solve hard problems, and accept and integrate good solutions from others.
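
Of the ventures listed, the ClassAd language is worth a closer look: jobs and machines each advertise attributes plus a requirements expression, and a matchmaker pairs ads whose requirements are mutually satisfied. A toy Python rendering of that idea (Condor's real ClassAd syntax and semantics are considerably richer):

```python
def matches(job_ad, machine_ad):
    """True if each ad's requirement is satisfied by the other's attributes."""
    return job_ad["requirements"](machine_ad) and \
           machine_ad["requirements"](job_ad)

job = {
    "memory_mb": 512,
    "requirements": lambda m: m["memory_mb"] >= 1024 and m["arch"] == "x86_64",
}
machine = {
    "memory_mb": 4096,
    "arch": "x86_64",
    # This machine only accepts jobs needing at most 1 GB of memory.
    "requirements": lambda j: j["memory_mb"] <= 1024,
}
print(matches(job, machine))  # True: both requirement expressions hold
```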

________________________________________________________

Architecture of commercial enterprise GRID: Entropia system:

Over the past decade, distributed computing, the assembly of large numbers of PCs over the Internet, has become the largest computing system in the world. These systems have demonstrated the solution of a surprisingly wide range of large-scale computational problems in areas such as molecular interaction, financial modeling, and data mining.

With distributed computing one can achieve a cost per unit of computing superior to even the cheapest hardware alternatives, by as much as a factor of 5 to 10. Distributed computing is also referred to as high-throughput computing or desktop grids. A great deal of research has been undertaken into sharing resources in new ways. The Entropia system is a desktop grid that can provide massive quantities of resources and can be naturally integrated with server resources into an enterprise grid.

While the tremendous computing resources available through distributed computing present new opportunities, harnessing them in the enterprise is quite challenging. Because distributed computing exploits existing resources, to acquire the most resources, capable systems must thrive in environments of extreme heterogeneity in machine hardware and software configuration, network structure, and individual/network management practice. To achieve a high degree of utility, distributed computing must capture a large number of valuable applications; the systems must also support large numbers of resources, thousands to millions of computers, to achieve their promise of tremendous power, and do so without requiring large numbers of IT administrators.

The key advantages of the Entropia system are the ease of application integration, and a new model for providing security and unobtrusiveness for the application and client machine. Applications are integrated using binary modification technology, without requiring any changes to the source code. This binary integration automatically ensures that the application is unobtrusive, and provides security and protection for both the client machine and the application’s data. This makes it easy to port applications to the Entropia system. Other systems require developers to change their source code to use custom Application Programming Interfaces (APIs) or simply provide weaker security and protection.

The growth of the WWW and the exploding popularity of the Internet created a huge opportunity for distributed computing; the scale of resources, the types of systems, and the typical ownership and management practices gave rise to a new set of technical challenges for distributed computing.

GIMPS (the Great Internet Mersenne Prime Search) was the first project taken up by Entropia Inc., a startup founded to commercialize grid computing. Similar early Internet distributed computing systems showed that the aggregation of very large-scale resources was possible, and that the resulting system dwarfed the resources of any single supercomputer, at least for certain classes of applications.

The current generation of distributed computing systems, a number of which are commercial ventures, provide the capability to run multiple applications on a collection of desktop and server computing resources, along with tools for application integration and robust operation.

The requirements of distributed computing consist of:

  • Efficiency
  • Robustness
  • Security
  • Scalability
  • Manageability
  • Unobtrusiveness
  • Openness/ease of application integration

The Entropia system addresses the above requirements by aggregating raw desktop resources into a single logical resource. This logical resource provides high performance for applications through parallelism, while always respecting the desktop user and his or her use of the desktop machine; a simple mechanism makes the addition or removal of desktop machines easy. A binary sandboxing technique is employed to support a large number of applications, and to support them securely.


Programming Desktop Grid Applications:

Each layer of Entropia provides a higher-level abstraction, hiding the complexity of the underlying architecture; applications can thus leverage existing job coordination and management.

Typically, users are focused on integrating the desktop grid with Linux clusters, database servers, and large-scale computing machines or supercomputers.

Single submission: users prefer that resources be selected automatically to give the best turnaround time. Since the resources are dynamic and their configuration changes frequently, the scheduler plays an important role (a toy selection sketch follows).
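
A toy version of such automatic selection, picking the machine with the best estimated turnaround time; the fields and the model are invented for illustration, not Entropia's actual scheduler:

```python
def pick_resource(task_work_units, machines):
    """Choose the machine minimizing estimated turnaround time.

    Each machine dict carries its speed and current queue backlog;
    both fields are illustrative, not a real scheduler's model.
    """
    def turnaround(machine):
        wait = machine["queued_work"] / machine["units_per_hour"]
        run = task_work_units / machine["units_per_hour"]
        return wait + run

    return min(machines, key=turnaround)

machines = [
    {"name": "desktop-17", "units_per_hour": 50, "queued_work": 0},
    {"name": "cluster-2",  "units_per_hour": 400, "queued_work": 2000},
]
print(pick_resource(100, machines)["name"])
# desktop-17: 2.0 h beats the cluster's 5.25 h once its backlog is counted
```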

Large data applications: canonical copies of the data are maintained in relational databases. The Entropia system provides mechanisms to manage copies of the data in the desktop grid, thereby providing maximum computational speedup.
