In this section, we describe at a high level the primary components of a grid environment. Depending on the grid design and its expected use, some of these components may or may not be required, and in some cases they may be combined to form a hybrid component. However, understanding the roles of the components as we describe them here will help you understand the considerations when developing grid-enabled applications.
Just as a consumer sees the power grid as a receptacle in the wall, a grid user should not see all of the complexities of the computing grid. Although the user interface can come in many forms and be application-specific, for the purposes of our discussion, let's think of it as a portal. Most users today understand the concept of a Web portal, where their browser provides a single interface to access a wide variety of information sources. A grid portal provides the interface for a user to launch applications that will use the resources and services provided by the grid. From this perspective, the user sees the grid as a virtual computing resource just as the consumer of power sees the receptacle as an interface to a virtual generator.
The current Globus Toolkit does not provide any services or tools to generate a portal, but this can be accomplished with tools such as WebSphere® Portal and WebSphere Application Server.
A major requirement for grid computing is security. At the base of any grid environment, there must be mechanisms to provide security, including authentication, authorization, data encryption, and so on. The Grid Security Infrastructure (GSI) component of the Globus Toolkit provides robust security mechanisms. The GSI includes an OpenSSL implementation. It also provides a single sign-on mechanism, so that once a user is authenticated, a proxy certificate is created and used when performing actions within the grid. When designing your grid environment, you may use the GSI sign-in to grant access to the portal, or you may have your own security for the portal. The portal will then be responsible for signing in to the grid, either using the user's credentials or using a generic set of credentials for all authorized users of the portal.
Once authenticated, the user launches an application. Based on the application, and possibly on other parameters provided by the user, the next step is to identify the available and appropriate resources to use within the grid. This task could be carried out by a broker function. Although Globus provides no broker implementation, it does provide an LDAP-based information service. This service is called the Grid Information Service (GIS), or more commonly the Monitoring and Discovery Service (MDS). It provides information about the available resources within the grid and their status. A broker service could be developed that utilizes MDS.
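To make the broker idea concrete, here is a minimal sketch of the matching step such a service might perform. It does not call MDS or any Globus API. The Resource record, its field names, and the sample hosts are assumptions made up for illustration; a real broker would populate similar records from whatever the information service returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One resource record as a broker might hold it (hypothetical fields)."""
    host: str
    free_cpus: int
    free_memory_mb: int
    online: bool


def select_resources(resources, min_cpus, min_memory_mb):
    """Return the online resources that meet the job's requirements.

    Candidates are ordered by free CPUs, largest first, so the caller
    can simply take the first entries it needs.
    """
    candidates = [
        r for r in resources
        if r.online
        and r.free_cpus >= min_cpus
        and r.free_memory_mb >= min_memory_mb
    ]
    return sorted(candidates, key=lambda r: r.free_cpus, reverse=True)


# Hypothetical snapshot of the grid's status, standing in for the
# result of an information-service query.
SAMPLE_RESOURCES = [
    Resource("nodeA", free_cpus=8, free_memory_mb=4096, online=True),
    Resource("nodeB", free_cpus=2, free_memory_mb=8192, online=True),
    Resource("nodeC", free_cpus=16, free_memory_mb=16384, online=False),
    Resource("nodeD", free_cpus=4, free_memory_mb=2048, online=True),
]
```

Calling select_resources(SAMPLE_RESOURCES, min_cpus=4, min_memory_mb=2048) keeps nodeA and nodeD: nodeB has too few free CPUs, and nodeC is offline. The ranking rule (most free CPUs first) is only one possible policy; a production broker would likely weigh cost, locality, and user priority as well.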
Once the resources have been identified, the next logical step is to schedule the individual jobs to run on them. If a set of stand-alone jobs is to be executed with no interdependencies, then a specialized scheduler may not be required. However, if you want to reserve a specific resource or ensure that different jobs within the application run concurrently (for instance, if they require inter-process communication), then a job scheduler should be used to coordinate the execution of the jobs. The Globus Toolkit does not include such a scheduler, but several available schedulers have been tested with, and can be used in, a Globus grid environment. Note also that a grid environment can contain different levels of schedulers. For instance, a cluster could be represented as a single resource. The cluster may have its own scheduler to help manage the nodes it contains. A higher-level scheduler (sometimes called a meta scheduler) might be used to schedule work to be done on a cluster, while the cluster's scheduler would handle the actual scheduling of work on the cluster's individual nodes.
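The two levels of scheduling can be sketched as follows. This is an illustrative toy and not the behavior of any particular scheduler: the class names, the round-robin placement inside a cluster, and the least-loaded rule used by the meta scheduler are all assumptions chosen to keep the example short.

```python
class ClusterScheduler:
    """Local scheduler: places jobs on one cluster's nodes, round-robin."""

    def __init__(self, name, nodes):
        self.name = name
        self.nodes = list(nodes)
        self.assignments = {}  # job name -> node name
        self._next = 0         # index of the next node to use

    def submit(self, job):
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        self.assignments[job] = node
        return node

    @property
    def load(self):
        """Jobs per node, used by the meta scheduler to compare clusters."""
        return len(self.assignments) / len(self.nodes)


class MetaScheduler:
    """Higher-level scheduler: sees each cluster as a single resource."""

    def __init__(self, clusters):
        self.clusters = list(clusters)

    def submit(self, job):
        # Pick the least-loaded cluster, then let that cluster's own
        # scheduler decide which node actually runs the job.
        target = min(self.clusters, key=lambda c: c.load)
        node = target.submit(job)
        return target.name, node
```

For example, with a two-node cluster and a one-node cluster, successive calls to MetaScheduler.submit alternate between the clusters according to their jobs-per-node load, while the node chosen within each cluster is decided entirely by that cluster's scheduler. The meta scheduler never needs to know the node names.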
If any data, including application modules, must be moved or made accessible to the nodes where an application's jobs will execute, then there needs to be a secure and reliable method for moving files and data to various nodes within the grid. The Globus Toolkit contains a data management component that provides such services. This component, known as Global Access to Secondary Storage (GASS), includes facilities such as GridFTP. GridFTP is built on top of the standard FTP protocol, but adds functions and utilizes the GSI for user authentication and authorization. Therefore, once a user has an authenticated proxy certificate, that user can use the GridFTP facility to move files without having to log in to every node involved. This facility also provides third-party file transfer, so that one node can initiate a file transfer between two other nodes.
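As a rough illustration of a third-party transfer, the sketch below builds (but does not run) a command line for the globus-url-copy client, giving it a GridFTP URL for both the source and the destination so that the data would flow directly between those two servers. The host names and paths are hypothetical, and the sketch assumes the client is installed and that a valid proxy certificate already exists on the initiating node.

```python
def gridftp_url(host, path):
    """Build a gsiftp:// URL for a file on a GridFTP server."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    return f"gsiftp://{host}{path}"


def third_party_copy_command(src_host, src_path, dst_host, dst_path):
    """Return the argument list for a server-to-server copy.

    Because both URLs use the gsiftp scheme, the node that runs this
    command only initiates the transfer; it is neither the source nor
    the destination of the data.
    """
    return [
        "globus-url-copy",
        gridftp_url(src_host, src_path),
        gridftp_url(dst_host, dst_path),
    ]


# Hypothetical hosts: node A would run this to move a file from B to C.
command = third_party_copy_command(
    "nodeb.example.org", "/data/input.dat",
    "nodec.example.org", "/scratch/input.dat",
)
```

The resulting list could be handed to subprocess.run on a machine where the Globus client tools are available. Returning a list rather than a single string avoids shell quoting problems if a path contains spaces.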