The consolidation of Grid computing infrastructures as a platform that offers computational resources and storage capabilities as Internet services with standardized interfaces has given rise to a new need: matching requests with the most appropriate services, with the aim of using the resources in the Grid more efficiently while improving the performance of the jobs. In practice, this need can only be met by developing new scheduling and management mechanisms that differentiate between services in the Grid on the basis of their readiness to deliver a specific level of service. Moreover, clients should be able to manage the Quality of Service (QoS) requirements of their jobs, within the terms specified in the Service Level Agreements (SLAs).
For these reasons, providing support for QoS in Grid computing environments is an active area of research, and one that is very important for the development of general-purpose Grid systems that support complex business processes.
However, despite the advances made in meta-scheduling and resource management in the Grid, support for QoS in this environment is still incomplete, and at present there is no single solution to this problem.
In this work, we aim to develop a new model for the allocation of resources in the Grid on the basis of QoS requirements. As part of this model, the services in the Grid are periodically evaluated through representative use cases that are executed on the resources, with the objective of determining their capability to deliver a specific performance and availability. At the same time, the resources must be continuously monitored in order to know their status. Using these two measures, we obtain a rank that differentiates the services by their suitability for executing a job with the required QoS. This classification can be used in conjunction with a meta-scheduler to schedule the execution of new jobs in the Grid. The second part of this work focuses on demonstrating the applicability of our model to the solution of a complex problem: the allocation of resources in the Grid for the execution of jobs on the basis of an overall strategy for cost optimization.
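As an illustration only, the ranking step described above can be sketched as follows. The record fields, the weighted-sum formula, and the weights are assumptions introduced for this sketch. They are not the model developed in the thesis, which may combine the periodic evaluation and the monitoring data differently.

```python
from dataclasses import dataclass


@dataclass
class ServiceSample:
    """Hypothetical record for one Grid service (all fields are assumptions)."""
    name: str
    benchmark_score: float  # outcome of the periodic use-case runs, normalized to [0, 1]
    availability: float     # fraction of successful evaluations, in [0, 1]
    load: float             # current workload reported by monitoring, in [0, 1]


def suitability(s: ServiceSample, w_perf: float = 0.5,
                w_avail: float = 0.3, w_load: float = 0.2) -> float:
    """Weighted sum of measured capability and current status (illustrative weights)."""
    return (w_perf * s.benchmark_score
            + w_avail * s.availability
            + w_load * (1.0 - s.load))


def rank_services(samples, min_availability: float = 0.0):
    """Return the names of services that meet the availability floor, best first."""
    eligible = [s for s in samples if s.availability >= min_availability]
    return [s.name for s in sorted(eligible, key=suitability, reverse=True)]
```

A meta-scheduler could call `rank_services` with the availability floor taken from the QoS requirements of a job and submit the job to the first service in the returned list.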
The work embodied in the present thesis covers all the phases necessary for the allocation of resources for the execution of jobs in the Grid. We have studied the use of QoS indicators to improve the information that describes the resources in the Grid and to define the requirements of the jobs sent to them, focusing on the conceptual basis provided by the Open Grid Services Architecture (OGSA). In this context, QoS measures the level of service attained, expressed through a group of indicators that characterize the service. These indicators include, but are not limited to: security, bandwidth, average response time, availability of the service, computational power, and memory and storage capability. On this basis, we have proposed an algorithm for the allocation of resources in the Grid that allows this process to be optimized at a global level. Moreover, we present a distributed monitoring system that provides the necessary up-to-date information, including the workload of the resources.
Finally, we present results that support our work. Some of them come from computational simulations, but most were obtained in real Grid environments, using current Grid middleware. Likewise, we have studied the applicability of our results to use cases and real applications.