Dave Angulo, University of Chicago and Argonne National Laboratory
Gregor von Laszewski, Mathematics and Computer Science Division, Argonne National Laboratory
Ian Foster, Mathematics and Computer Science Division, Argonne National Laboratory
Problem
The GrADS project consists of several large modules that are interconnected
through the exchange of information.
Traditionally, such information is exchanged through function-call APIs
to libraries. We advocate here a more sophisticated and more widely
accessible method that facilitates debugging and helps ensure that the
modules work together as a whole, without increasing development effort or
discernibly increasing overhead. The technique is based on standard Internet
protocols combined with standardized data-formatting tools.
Requirements
The GrADS project has several large functional components, each of which
will be developed in isolation at a geographically dispersed site. A major
concern is therefore that the connections between these components might
fail when the components are brought together. The project will suffer a
major setback if the output prepared by the developers of one module does
not correspond to the input expected by the developers of the next one.
A second major concern is imposed on the project by the applications groups
(which include the developers of Cactus applications).
These developers have stressed that they must be able to intervene by hand
at the interface between each pair of components. They require that each
component be treated as a service and not as a library (according to the
head of the applications team, Dr. Gannon of Indiana University).
Example Scenario
It may be helpful at this point to describe one of the scenarios given for communication between a pair of these modules. The example involves the module that performs scheduling and resource selection and the module that performs performance prediction (called the Selector and the Predictor here for brevity). The Selector passes a selection of nodes to the Predictor and asks for a prediction of performance. This is repeated until the Predictor returns a prediction that meets the requirements of the Selector.
The communication between this pair of modules consists of the Selector sending a connected set of nodes to the Predictor for analysis and the Predictor replying with a performance prediction. The connected set of nodes is sent (abstractly) as a weighted directed acyclic graph in which the weights on the connections represent communication costs. The performance prediction might be a single numeric value.
The challenge is to find a
means of representing this weighted directed acyclic graph in a manner that is
efficient, verifiable by team members for both modules (in isolation from each
other), human readable (to fulfill the requirements of Dr. Gannon’s
applications team), and that doesn’t increase development efforts.
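The iterative exchange in this scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `select_resources`, `toy_predictor`, and `required` are hypothetical, the Predictor is modeled as a plain function rather than a remote service, and a larger prediction is assumed to be better.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical types: a candidate is a list of node ids plus weighted edges.
Edge = Tuple[str, str, float]            # (origin id, terminating id, communication cost)
Candidate = Tuple[List[str], List[Edge]]


def select_resources(
    candidates: List[Candidate],
    predict: Callable[[Candidate], float],
    required: float,
) -> Optional[Candidate]:
    """Ask the Predictor about each candidate until one meets the requirement."""
    for candidate in candidates:
        prediction = predict(candidate)  # one request/reply exchange
        if prediction >= required:       # assumption: larger means better
            return candidate
    return None                          # no candidate was good enough


def toy_predictor(candidate: Candidate) -> float:
    """Stand-in Predictor: node count minus total communication cost."""
    nodes, edges = candidate
    return len(nodes) - sum(cost for _, _, cost in edges)
```

In a real deployment the `predict` callable would wrap a request to the Predictor service; the loop itself would stay the same.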
Choices
Several technologies could be used to transmit data between the modules:
(1) traditional function-call APIs to libraries, (2) Internet services
encapsulated in CORBA objects, and (3) standard Internet protocols combined
with standardized data-formatting tools. We examine the benefits and
drawbacks of each approach in turn.
Traditional APIs fail to meet the requirements of this project in several
ways. First, they do not lend themselves to being called from a different
physical machine, which should be a high priority in a distributed
environment. When the subsystems must be presented as services available to
all, their entry and exit points should not be hidden in API calls that are
inaccessible to users on other machines.
Further, using traditional
function calls limits the interoperability with modules written in other
languages. Libraries written in C++ are
difficult to call from Fortran programs, and vice versa, thus further limiting
the utility of the modules as services.
Finally, traditional APIs do not lend themselves to rigid adherence to a
data format agreed upon by the two teams of developers. Directed acyclic
graphs (from our example) could be put into data structures, but these are
not human readable (contrary to the requirements of the Cactus programmers).
The benefits of traditional function calls are few. They are, however, well
understood by programmers, who may therefore feel more comfortable using
this technique.
CORBA comes closer to providing a general-purpose service. CORBA objects are
accessible as services over the Internet and so lend themselves to human
intervention at the interfaces. CORBA objects are still limiting, however,
and their limitations are orthogonal to the language limitations imposed by
APIs. They are platform based: CORBA objects are not readily accessible from
Microsoft platforms. Additionally, because the specifications are ambiguous,
vendors’ implementations are not always compatible, and many vendors have
added extensions to the standard. CORBA objects and IDL specifications are
also somewhat difficult and lie outside the comfort zone of many programmers.
Moreover, CORBA interfaces cannot easily be inspected, although custom
objects could be created to allow inspection at each module interface.
Standard protocols combined with standard data-formatting tools overcome
many of the difficulties of the other two alternatives. Because the modules
communicate over standard Internet protocols, they are inherently packaged
as services available to all, and they gain language and platform
independence. Because we limit our choices to standardized protocols, many
tools will be available for humans to inspect or intervene at the module
interface boundaries. These same tools will aid the development teams in
debugging and in ensuring that the interfaces on both sides match, even
though they are developed in isolation.
The use of standard data-formatting tools ensures that tools will be
available to enforce rigid adherence to a compatible format. This further
helps to ensure that the interfaces match when the modules are finally put
together. Standard data-formatting tools likewise give humans the ability to
inspect and intervene at the module interfaces, because these tools allow
the data to be displayed and entered in a human-readable format. This aids
program development and debugging and gives the Cactus programmers the
access they require.
There are two potential objections to this approach. The first is the
question of overhead, and the second is whether the learning curve is too
steep for programmers. It is true that this option adds a slight amount of
overhead, but most of that overhead lies in opening a socket. The technology
involved is not daunting, and with Internet technology becoming so
ubiquitous, it can be assumed that most programmers will have been exposed
to similar techniques.
Specifics
A specific standard must be selected for both the protocol and the
data-formatting tool. We propose HTTPS as the protocol of choice. This
protocol is well understood, and many tools are available to allow
inspection of, or intervention in, the communications.
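To make the request/reply pattern concrete, the following Python sketch runs a stand-in Predictor on the local machine and has a stand-in Selector post a message to it. It is an illustration only: the names `PredictorHandler`, `ask_predictor`, and `demo`, the `/predict` path, and the placeholder "prediction" (the byte length of the request) are all assumptions, and the sketch uses plain HTTP on the loopback interface for brevity rather than the HTTPS proposed here.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PredictorHandler(BaseHTTPRequestHandler):
    """Hypothetical Predictor endpoint: reads a request body, returns a number."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # Placeholder "prediction": just the size of the request in bytes.
        reply = str(len(body)).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def ask_predictor(url: str, payload: bytes) -> str:
    """Selector side: POST the payload and return the reply text."""
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/xml"}, method="POST"
    )
    # Ignore any proxy settings so the loopback address is reached directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return response.read().decode("ascii")


def demo() -> str:
    """Start the stand-in Predictor, send one request, and shut it down."""
    server = HTTPServer(("127.0.0.1", 0), PredictorHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return ask_predictor(f"http://127.0.0.1:{port}/predict", b"<dag/>")
    finally:
        server.shutdown()
        server.server_close()
```

Because the exchange is ordinary HTTP, the same request could be issued by hand from a browser or a command-line client, which is the kind of human intervention the applications team asked for.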
Next, we turn to selecting a standard data-formatting tool. This tool may
not be needed for all communications between modules. Specifically, if an
executable is being sent as the only form of communication, it can be sent
as part of the HTTP request as a MIME-typed object. But when the Resource
Selector sends a directed acyclic graph to the Performance Predictor, it is
essential to format that data in a standardized, human-readable form and to
ensure that the data sent adheres to strict formatting rules.
For these purposes, XML seems an ideal tool. It is a well-established
standard and has tools available for creation, verification, and display on
all platforms. Since many people are familiar with HTML, XML is quite easy
to learn. The only difficulty with XML lies in writing the Document Type
Definition (DTD), but since that needs to be done only once, aid from an
expert can be enlisted.
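As an illustration of the creation tools mentioned above, the following Python sketch serializes a graph with the standard library's `xml.etree.ElementTree` module. It is a sketch under assumptions: the function name `build_dag_xml` and the dictionary-based inputs are hypothetical, and the element names (`dag`, `nodes`, `node`, `connections`, `connect`) are taken from the hypothetical markup given later in this section.

```python
import xml.etree.ElementTree as ET


def build_dag_xml(nodes, connections) -> str:
    """Serialize node and connection dictionaries into a <dag> document."""
    dag = ET.Element("dag")
    nodes_el = ET.SubElement(dag, "nodes")
    for node in nodes:
        # ElementTree requires attribute values to be strings.
        ET.SubElement(nodes_el, "node", {k: str(v) for k, v in node.items()})
    conns_el = ET.SubElement(dag, "connections")
    for conn in connections:
        ET.SubElement(conns_el, "connect", {k: str(v) for k, v in conn.items()})
    return ET.tostring(dag, encoding="unicode")
```

A Selector could call this with, for example, `build_dag_xml([{"id": 113, "mflops": 166}], [])` and send the resulting text as the body of its request.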
Use of XML is preferred over the creation of a new domain-specific language.
Tools already exist to verify adherence to the language rules in XML,
whereas they would have to be created for a new language and would not be
available on all platforms. The Globus team has recognized this fact and
has an XML-based alternative to the Resource Specification Language.
An example may be the best way to explain why XML could be so useful. The
following gives a hypothetical markup of a very simple directed acyclic
graph:
<dag>
  <nodes>
    <node id="113"
          domain="fermat.cs.uchicago.edu"
          mflops="166"
          mbmemory="128"/>
    <node id="242"
          domain="euler.cs.uchicago.edu"
          mflops="233"/>
  </nodes>
  <connections>
    <connect id="747"
             idorigin="113"
             idterminate="242"
             mbpersec="50"/>
  </connections>
</dag>
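On the receiving side, a module could read such a document and check it before use. The Python sketch below parses the markup with the standard library, confirms that every connection refers to a declared node, and checks that the graph is acyclic. The function names `load_dag` and `is_acyclic` are hypothetical, and these hand-written checks are only a stand-in for validation against a DTD, which the standard-library parser does not perform.

```python
import xml.etree.ElementTree as ET

DAG_XML = """
<dag>
  <nodes>
    <node id="113" domain="fermat.cs.uchicago.edu" mflops="166" mbmemory="128"/>
    <node id="242" domain="euler.cs.uchicago.edu" mflops="233"/>
  </nodes>
  <connections>
    <connect id="747" idorigin="113" idterminate="242" mbpersec="50"/>
  </connections>
</dag>
"""


def load_dag(xml_text):
    """Return (node ids, adjacency dict) or raise ValueError on a bad document."""
    root = ET.fromstring(xml_text)        # raises ParseError if not well-formed
    if root.tag != "dag":
        raise ValueError("root element must be <dag>")
    node_ids = {n.get("id") for n in root.findall("nodes/node")}
    edges = {node_id: [] for node_id in node_ids}
    for c in root.findall("connections/connect"):
        src, dst = c.get("idorigin"), c.get("idterminate")
        if src not in node_ids or dst not in node_ids:
            raise ValueError(f"connection {c.get('id')} references an unknown node")
        edges[src].append((dst, float(c.get("mbpersec"))))
    return node_ids, edges


def is_acyclic(edges):
    """Depth-first search; reaching a node already on the current path means a cycle."""
    state = {}                            # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False
        state[node] = "visiting"
        ok = all(visit(dst) for dst, _ in edges[node])
        state[node] = "done"
        return ok

    return all(visit(node) for node in edges)
```

Calling `load_dag(DAG_XML)` yields the two node ids and a single weighted edge from node 113 to node 242, which `is_acyclic` accepts. Each team can run the same checks on its own sample documents without access to the other module.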
Conclusions
We believe that standardized
protocols and data formatting tools will ease the development task of the GrADS
modules and will help to ensure that these modules, developed in isolation,
will interact correctly with each other (avoiding the English Channel tunnel
mishap). We also believe that
utilization of this methodology will allow each module to be used as a service,
fulfilling the requirements of the Cactus developers.