What is a Grid?

CETIC’s Tentative Definition

Grid computing is emerging as an important new field, distinguished from conventional distributed computing by its focus on very large-scale resource sharing.
Various definitions try to describe the Grid, but even the most popular ones are disputed. We first review the controversy around these definitions, then propose our own tentative definition of Grid computing.

Date: 2 May 2006

Expertise:

Engineering of complex IT systems 

About project: CoreGRID 

Current Definitions and their Limitations

The CoreGRID European Network of Excellence defines Grid as follows: "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide location independent, pervasive, reliable, secure and efficient access to a coordinated set of services encapsulating and virtualizing resources (computing power, storage, instruments, data, etc.) in order to generate knowledge."

Although this definition gives an interesting and rigorous view of the Grid concept, some of its assertions can be contested. Taken literally, it implies that an inefficient or insecure system is never a Grid. We think these properties are Grid requirements (common features), but they do not by themselves define what a Grid is. We also point out that a distributed file storage system offering some security and availability can be deployed as a Grid system without ever generating any knowledge (it merely preserves it).

According to Ian Foster’s well-known paper "What is the Grid? A Three Point Checklist", a Grid is "a system that coordinates resources that are not subject to centralized control, using standard, open, general-purpose protocols and interfaces, to deliver nontrivial qualities of service". Most authors refer to this definition, but a considerable number find it erroneous, insufficient or incomplete.

We now try to describe Grid technologies from a different point of view.

Back to the Origins

To better understand the Grid concept, it is useful to go back to its genesis and evolution. One of the Grid’s supposed ancestors is metacomputing, defined in the eighties by Larry Smarr: "to inter-connect large systems to mutualize resources". The term "Grid" was coined in the mid-1990s to denote a proposed distributed computing infrastructure for advanced science and engineering. The first truly operational infrastructure was I-WAY. The initial goal of Grid computing was to make access to computing resources as easy as access to electricity, which implies that a Grid system must provide abstraction. A Grid should give the illusion of a single, very powerful (virtual) computer, allowing groups of users or institutions to share their dispersed resources in order to work towards common goals.

Current definitions of the Grid are numerous and varied: "a set of delocalized resources...", "a system that shares resources", etc. We feel that researchers fall into the trap of overly specific definitions. We should therefore explain what Grids are made of before defining the Grid.

Whatever one says about the Grid, it is not only a system, and probably not only a piece of software either, because the mechanisms and concepts covered by the term describe a technology. We argue that a Grid is a concept, a technology (a set of methods and techniques), which can be structured as follows:

  • a material infrastructure: a set of interconnected resources
  • a (virtual) organization: users and institutions (which pool their own resources)
  • a computing system: a platform
  • a software framework: an application

What Lies Behind the "Grid"

The word "Grid" is used a great deal. Let’s explore each part of the Grid’s structure separately, as this will help us better understand the Grid and provide a more appropriate definition.

First, we have to define what a Grid infrastructure is. A Grid site (or domain) is a material and software infrastructure of heterogeneous resources under the same administrative control. Heterogeneous means that resources differ by type (storage, calculation, sensor, software...), hardware (PC, cluster, smartphone...), platform (OS, architecture...), user, owner, and so on. A Grid infrastructure is a virtual association of sites.
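
The vocabulary of this paragraph can be sketched as a small data model. This is only an illustration under stated assumptions: the names `Resource`, `Site` and `GridInfrastructure`, their fields, and the sample values are hypothetical and do not come from any Grid middleware. The sketch shows that a site groups heterogeneous resources under one administrative authority, and that an infrastructure is merely an association of such sites.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """A single shared resource (all field names are illustrative)."""
    name: str
    kind: str      # e.g. "storage", "calculation", "sensor", "software"
    hardware: str  # e.g. "PC", "cluster", "smartphone"
    platform: str  # e.g. an OS or an architecture
    owner: str


@dataclass
class Site:
    """Heterogeneous resources under the same administrative control."""
    admin: str
    resources: list[Resource] = field(default_factory=list)

    def is_heterogeneous(self) -> bool:
        # Two resources count as different when they differ by at least
        # one of: type, hardware, platform, owner.
        profiles = {(r.kind, r.hardware, r.platform, r.owner)
                    for r in self.resources}
        return len(profiles) > 1


@dataclass
class GridInfrastructure:
    """A virtual association of sites: nothing more than a grouping."""
    sites: list[Site] = field(default_factory=list)

    def resources(self) -> list[Resource]:
        return [r for site in self.sites for r in site.resources]

    def administrations(self) -> set[str]:
        return {site.admin for site in self.sites}


# Hypothetical example: two sites, each with its own administrator.
lab = Site("lab-admin", [
    Resource("disk-1", "storage", "cluster", "linux-x86_64", "lab"),
    Resource("probe-1", "sensor", "smartphone", "arm", "lab"),
])
campus = Site("campus-admin", [
    Resource("node-1", "calculation", "PC", "linux-x86_64", "campus"),
])
infrastructure = GridInfrastructure([lab, campus])
```

In this reading, the infrastructure owns nothing itself: administrative control stays with each site, which is why `administrations()` returns one entry per site rather than a single authority.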

Another central Grid concept is the virtual organization, which comes with the following assumptions. A Grid gathers individual consumers and/or suppliers of resources. These people collaborate ("work in concert together") and protect their interests (the various objectives that justify why they collaborate). They collaborate within the Grid individually or collectively (as a company, an institution, an Internet community...).

A virtual community is a community that does not materialize through physical meetings but collaborates only online. A virtual organization (Grid) is thus a virtual community of participating parts that consume and/or supply site resources and collaborate according to sharing policies. An association of suppliers and/or consumers of resources sharing the same interests within a Grid constitutes a participating part (a participant, a VO member).

Examples of such sharing policies:

  • Resource donation (volunteer computing)
  • Resource mutualization (mutual sharing and common use of the resources)
  • Reward (financial or computational) for each resource provided

Note that each site generally applies its own sharing policy; doing otherwise would complicate management and raise security problems.
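
To make the notion of a virtual organization more concrete, here is a minimal sketch of participants and per-site sharing policies. Everything in it is an assumption made for illustration: the class names, the site names, and above all the access rules in `may_use`, which are one possible reading of the three example policies and not a description of how any real Grid enforces them.

```python
from dataclasses import dataclass, field
from enum import Enum


class SharingPolicy(Enum):
    DONATION = "donation"            # volunteer computing
    MUTUALIZATION = "mutualization"  # pooled resources, common use
    REWARDING = "rewarding"          # financial or computational reward


@dataclass
class Participant:
    """A VO member: supplies resources, consumes them, or both."""
    name: str
    supplies: set[str] = field(default_factory=set)  # names of sites supplied
    consumes: bool = False


@dataclass
class VirtualOrganization:
    # One policy per site, following the note above.
    site_policies: dict[str, SharingPolicy] = field(default_factory=dict)
    members: list[Participant] = field(default_factory=list)

    def may_use(self, member: Participant, site: str) -> bool:
        """Toy rules (assumed, not taken from the article)."""
        policy = self.site_policies[site]
        if policy is SharingPolicy.MUTUALIZATION:
            # Assumed rule: only members who also supply something may consume.
            return member.consumes and bool(member.supplies)
        # DONATION and REWARDING: any consumer may use the site; the
        # reward owed to the supplier is not modelled here.
        return member.consumes


# Hypothetical example with two sites and two members.
vo = VirtualOrganization(site_policies={
    "lab": SharingPolicy.MUTUALIZATION,
    "home-pcs": SharingPolicy.DONATION,
})
alice = Participant("alice", supplies={"lab"}, consumes=True)
bob = Participant("bob", consumes=True)
vo.members += [alice, bob]

print(vo.may_use(alice, "lab"))     # alice both supplies and consumes
print(vo.may_use(bob, "lab"))       # bob supplies nothing
print(vo.may_use(bob, "home-pcs"))  # donated resources are open to consumers
```

Attaching the policy to the site rather than to the whole organization keeps each administrative domain in charge of its own rules.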

Taken as a whole, the Grid infrastructure constitutes a computing system. The services delivered by such a system are out of the ordinary. But we cannot claim that "it gives huge computational power", "it generates knowledge", or the like: the combined resources merely make it possible to reach very high performance in computing, storage, and so on. So we simply argue, as I. Foster does, that a Grid delivers nontrivial qualities of service.

A software (and possibly hardware) framework is needed to virtualize the previous concepts. In computing, middleware consists of software agents acting as intermediaries between different application components. Middleware is needed on the resources so that the system can manage them, and further software is needed to manage the system itself (resource management systems, schedulers, resource brokers, ...).

Our Final Definition

Given all this, here is our final attempt at a definition. The Grid is a technology that makes it possible to build a system which delivers nontrivial qualities of service and which virtualizes, by means of a software and hardware framework (middleware, scheduler, resource management system...):

  • an association of a set of sites (material and software infrastructure of heterogeneous resources under the same administrative control);
  • a community of participating parts that consume and/or supply site resources, collaborating according to sharing policies.