Research
What are my research interests?
The short answer: I am interested in virtual machine-based resource provisioning models (where "resource" includes hardware, software, and time) using a leasing abstraction. Although I'm not fond of buzzwords, most of my recent work could be described as relevant to Infrastructure-as-a-Service (IaaS) "cloud computing", since IaaS clouds use virtual machines to provision computational resources and my work deals with how to (1) map heterogeneous user requests (best effort, advance reservations, immediate availability, etc.) to virtual machines and (2) provision those virtual machines efficiently. Since a lot of my work involves writing resource scheduling code, my secondary interests include parallel job scheduling and scheduling performance metrics.
The long answer:
The need for computational resources has, over the years, become a fundamental requirement in both science and industry. In many cases, this need is transient: a user may only require computational resources for the duration of a well-defined task. For example, a scientist could require a large number of computers to run a simulation for just a few hours, but might not need those computers at any specific time (as long as they are made available in a reasonable amount of time). A college instructor may want to make a cluster of computers available to students during the course's lab sessions, at very specific times during the week, and with a specific software configuration. A telecommunications company could possess an existing infrastructure that hosts a number of websites, but may need to supplement that infrastructure with additional resources during periods of unforeseen increased web traffic, meaning those resources have to be made available right away with very little advance notice.
These transient resource usage scenarios pose the problem of how to provision shared computational resources efficiently. This problem has been studied for decades, resulting in approaches that tend to be highly specialised to specific usage scenarios. For example, the problem of how to run multiple jobs on a shared cluster has been extensively studied, resulting in job management systems like Torque/Maui, Sun Grid Engine, LoadLeveler, and many others, that can queue and prioritise job requests efficiently (in these systems, efficiency is defined in terms of a variety of metrics, including waiting times and resource utilization). Such a system would meet the requirements of the scientist wanting to run simulations for a few hours but, on the other hand, the college instructor and the telecommunications company mentioned above would be ill-served by a job management system and the efficiency metrics typically used in job management. Conversely, other resource provisioning approaches are not particularly well suited for job-oriented computations.
Thus, there is no general solution that can provision resources meeting the requirements of different usage scenarios simultaneously, such as those mentioned above, reconciling the different measures of efficiency in each scenario. For example, take the combination of best-effort resource requirements, where a user needs computational resources but is willing to wait for them (possibly setting a deadline), and advance reservation resource requirements, where the resources must be available at a specific time. In the former, efficiency is typically measured in terms of waiting times (or similar metrics such as turnaround times or slowdowns) or throughput, while the latter is usually concerned with providing the requested resources at exactly the agreed-upon times without interruption, and both are concerned with maximizing the use of hardware resources and possibly monetary profit. Although both best-effort and advance reservation provisioning have been studied separately, the combination of both is known to produce utilization problems and is discouraged in practice.
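As a rough illustration of the best-effort metrics mentioned above, the following Python sketch computes waiting time, turnaround time, and slowdown for a single completed request. The `CompletedRequest` class and its field names are hypothetical, introduced only for this example, and slowdown is taken here in its common form (turnaround time divided by running time), which is an assumption about the exact definition intended.

```python
from dataclasses import dataclass


@dataclass
class CompletedRequest:
    """Hypothetical record of one finished best-effort request (times in seconds)."""
    submit: float  # when the user submitted the request
    start: float   # when the resources were actually provided
    end: float     # when the request finished running


def waiting_time(r: CompletedRequest) -> float:
    """Time spent waiting between submission and start."""
    return r.start - r.submit


def turnaround_time(r: CompletedRequest) -> float:
    """Total time from submission to completion."""
    return r.end - r.submit


def slowdown(r: CompletedRequest) -> float:
    """Turnaround time relative to running time (1.0 means no waiting)."""
    running = r.end - r.start
    if running <= 0:
        raise ValueError("running time must be positive")
    return turnaround_time(r) / running


if __name__ == "__main__":
    # A request that waits 5 minutes and then runs for 10 minutes.
    req = CompletedRequest(submit=0.0, start=300.0, end=900.0)
    print(waiting_time(req), turnaround_time(req), slowdown(req))
```

None of these metrics says anything about whether an advance reservation started at its agreed-upon time, which is part of why the two kinds of requests are hard to evaluate with a single measure.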
In my research I seek to develop a resource provisioning model and architecture that can support multiple resource provisioning scenarios efficiently and simultaneously, with an initial focus on the best-effort and advance reservation cases mentioned above. I argue in favour of a lease-based model, where leases are implemented as virtual machines (VMs). This model must meet the following goals:
- Provide an abstraction focused solely on resource provisioning. Although the lease abstraction has been used in multiple fields of computer science, most notably networking, there is no universally accepted definition of "lease". However, leases generally provide an abstraction for, first and foremost, provisioning a resource (bandwidth in networks, raw hardware resources in datacenters, etc.) operated by a lessor (or resource provider) and provided to a lessee (or resource consumer), with relatively few restrictions on how the provisioned resources can be used. So, when proposing a lease-based model, the implied goal is that resource consumers will be able to use a general-purpose resource provisioning abstraction (i.e., not one that is coupled to a particular use case).
- Provision hardware, software, and availability. Resource provisioning can encompass three dimensions: hardware resources, the software available on those resources, and the time during which those resources must be guaranteed to be available. A complete resource provisioning model must allow resource consumers to specify requirements across these three dimensions, and the resource provider to efficiently satisfy those requirements. As stated earlier, my work focuses on best-effort and advance reservation availability requirements.
- Reconcile requirements of different types of leases. Best-effort and advance reservation provisioning have different measures of efficiency and, in some cases, these measures will be in conflict. For example, accepting advance reservation leases unconditionally may delay or even preempt best-effort leases but, on the other hand, a policy of not allowing best-effort leases to be delayed or preempted may reduce the number of advance reservations that can be accepted. Reconciling these measures of efficiency requires developing scheduling algorithms capable of combining both types of leases, and potentially others, and policies that can guide the scheduling decisions based on the goals and requirements of the resource provider. Taking into account the different overheads of virtual machines adds an additional layer of complexity to the problem of scheduling VM-based leases.
- Model virtual resources accurately and schedule them efficiently. The choice of virtual machines to implement leases requires modelling virtualized resources and operations on those resources. In particular, using virtual machines involves different types of overhead (most notably the overhead of transferring virtual machine images, and the overhead of suspending and resuming virtual machines) that must be accurately modelled so they can be taken into account when scheduling virtual machines.
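To make the goals above a little more concrete, here is a minimal Python sketch of a lease that captures the three dimensions (hardware, software, availability) together with a deliberately naive overhead model. Everything in it is an assumption made for illustration: the `Lease` class, its field names, and the helper functions are hypothetical and are not Haizea's actual data model or API. The overhead estimates assume that one copy of the disk image is sent to each node, one after another, over a single shared link, and that suspending a VM means writing its memory to disk at a fixed rate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """Hypothetical lease request; not Haizea's real data model."""
    num_nodes: int                         # hardware: number of VMs requested
    memory_mb: int                         # hardware: memory per VM
    disk_image: str                        # software: VM image to deploy
    image_size_mb: int                     # size of that image
    duration_s: float                      # availability: how long the lease runs
    start_time_s: Optional[float] = None   # None means best-effort

    @property
    def is_advance_reservation(self) -> bool:
        return self.start_time_s is not None


def image_transfer_time(lease: Lease, bandwidth_mb_per_s: float) -> float:
    """Seconds needed to copy the image to every node, one after another."""
    return lease.num_nodes * lease.image_size_mb / bandwidth_mb_per_s


def suspend_time(lease: Lease, disk_rate_mb_per_s: float) -> float:
    """Seconds needed to write one VM's memory to disk when suspending it."""
    return lease.memory_mb / disk_rate_mb_per_s


def latest_transfer_start(lease: Lease, bandwidth_mb_per_s: float) -> float:
    """Latest time the image transfer can begin so the lease starts on time."""
    if lease.start_time_s is None:
        raise ValueError("best-effort leases have no fixed start time")
    return lease.start_time_s - image_transfer_time(lease, bandwidth_mb_per_s)


if __name__ == "__main__":
    ar = Lease(num_nodes=2, memory_mb=1024, disk_image="debian.img",
               image_size_mb=2000, duration_s=3600, start_time_s=7200)
    print(image_transfer_time(ar, 100.0))    # image staging overhead
    print(latest_transfer_start(ar, 100.0))  # deadline for starting the staging
```

Even in this simplified form, the sketch shows why overheads matter to the scheduler: an advance reservation can only start on time if image staging is scheduled early enough, and preempting a best-effort lease to make room for it costs at least the suspend time.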
Summing up, the main contribution of my dissertation will be a resource provisioning model that uses leases as a fundamental abstraction and virtual machines as an implementation vehicle. This contribution can be further divided into a formal specification of lease terms, a model of virtualized resources, a lease management architecture (including scheduling algorithms and policies), and metrics of efficiency for heterogeneous workloads combining multiple types of leases. As a technological contribution, my dissertation will also provide an open-source reference implementation of the lease management architecture, capable of operating in simulation or on real hardware.
For more details, please see my publications or my dissertation.
Publications
Please see the publications page.
Current projects
Haizea
Haizea is an open-source VM-based lease management architecture (if that sounds like a mouthful, take a look at the What is Haizea? page). In a nutshell, Haizea is a piece of software that, in combination with the OpenNebula virtual infrastructure manager, can be used to manage a Xen or KVM cluster, allowing you to deploy different types of leases that are instantiated as virtual machines (VMs). Haizea can also be run in simulation, providing a platform for experimenting with scheduling algorithms that depend on VM deployment or on the leasing abstraction. I am the lead developer for the Haizea project.
Reservoir (EU FP7 project)
In 2008 and 2009 I did summer internships for Reservoir, a project that "will enable massive scale deployment and management of complex IT services across different administrative domains, IT platforms and geographies". As part of my internships, I worked with the Distributed Systems Architecture group at the Universidad Complutense de Madrid. My work revolved mainly around the Haizea project mentioned above. Although I am not currently employed by Reservoir, my work is still done in close collaboration with members of the Reservoir project.
Past projects
Workspace Service
I was part of the group that develops the Workspace Service component of the Globus Toolkit 4. In particular, I was involved in VM-based "virtual workspaces". To quote the Workspace Service website: "A virtual workspace is an abstraction of an execution environment that can be made dynamically available to authorized clients by using well-defined protocols. The abstraction captures resource quota assigned to such execution environment on deployment (such as CPU or memory share) as well as software configuration aspects of the environment (such as operating system installation or provided services). The Workspace Service allows a Grid client to dynamically deploy and manage workspaces."
CrossGrid
I was a Research Associate for a short while (with affiliation to the Instituto de Física de Cantabria in Santander, Spain). My main job was writing and proofreading tutorial material.
BOOLE-DEUSTO
BOOLE-DEUSTO is a software aid for Digital Electronics courses. It helps and guides the student through typical exercises: minimization of boolean functions, Veitch-Karnaugh maps, design and simulation of finite-state machines, circuit diagrams (combinational and sequential), etc. I was involved in this project from 2000 to 2004 as Lead Programmer. The project was developed at the University of Deusto and led by Professor Javier García Zubía.