Borja's University of Chicago page


Welcome!

You have arrived at Borja Sotomayor's webpage at the University of Chicago. I am a fifth-year PhD candidate in the Department of Computer Science. In particular, I am a member of the Distributed Systems Laboratory, and my advisor is Ian Foster. My research deals with virtual machine-based resource provisioning models using a leasing abstraction. As part of my research work, I maintain the open-source Haizea project, a VM-based lease manager that, in combination with the open-source OpenNebula virtual infrastructure manager, can be used to lease hardware resources using VMs (Xen and KVM are currently supported, with VMware support to follow shortly).

I'm originally from Bilbao, a wonderful city in the Basque Country region of Spain.

This page includes all my university-related stuff (CV, publications, projects I'm working on, etc.). If that kind of stuff bores you, then you might be more interested in visiting my weblog: BorjaNet (beware, it's in Spanish!) or my gallery.

Teaching

Research interests

The short answer: I am interested in virtual machine-based resource provisioning models (where "resource" includes hardware, software, and time). Since a lot of my work involves writing resource scheduling code, my secondary interests include parallel job scheduling and scheduling performance metrics.

The long answer:

The problem of provisioning computational resources to users has resulted in several approaches that target different usage scenarios. For example, when a scientist requires computational resources for an application, he can submit it as a batch job to a local cluster, or to remote resources through grid interfaces. When a freelance web developer needs a web server for months, or perhaps years, he can lease a dedicated server in a datacenter. When a college instructor wants to teach a course on parallel programming, and needs a small dedicated cluster for only a few hours each week, he can obtain one from Amazon EC2. However, these solutions are specialized to a specific usage scenario, only partially supporting other usage patterns (if at all), and there is currently no resource provisioning model, or system, that can support all these usage scenarios at the same time.

My research focuses on developing a resource provisioning model for remote resources, such as those found on a grid, that uses leases as a fundamental abstraction (as opposed to the widespread job abstraction used on grids, where resource provisioning happens as a side effect of submitting a job). A lease is a negotiated and renegotiable agreement between a resource provider and a resource consumer, where the former agrees to make a set of resources available to the latter, based on a set of lease terms presented by the resource consumer. The lease terms encompass the hardware resources required by the resource consumer, such as processors, memory, network bandwidth, etc., a software environment required on the leased resources, and an availability period, during which the hardware and software resources requested by the user are guaranteed to be available. Part of my work involves developing scheduling algorithms that can efficiently support workloads that interleave different types of availability periods, including best-effort workloads, exact availability periods (with a specific start and end time), urgent availability, etc.
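As a rough illustration of the lease terms described above, here is a minimal Python sketch. It is not Haizea's actual data model: the class names, field names, and units (LeaseTerms, memory_mb, software_image, and so on) are all hypothetical, chosen only to show how the hardware, software, and availability terms might fit together in one request.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Availability(Enum):
    """Kinds of availability period mentioned in the text."""
    BEST_EFFORT = "best-effort"  # run whenever resources become free
    EXACT = "exact"              # specific start and end time
    URGENT = "urgent"            # needed as soon as possible


@dataclass(frozen=True)
class LeaseTerms:
    """Hypothetical container for the terms a resource consumer presents."""
    # Hardware terms
    processors: int
    memory_mb: int
    bandwidth_mbps: int
    # Software environment (assumed here to be named by a VM image)
    software_image: str
    # Availability period
    availability: Availability
    duration: timedelta
    start: Optional[datetime] = None  # only required for EXACT leases

    def __post_init__(self):
        # An exact availability period has no meaning without a start time.
        if self.availability is Availability.EXACT and self.start is None:
            raise ValueError("an exact lease needs a start time")

    @property
    def end(self) -> Optional[datetime]:
        """End of the availability period, or None if no start is fixed."""
        return None if self.start is None else self.start + self.duration
```

In this sketch a best-effort lease simply leaves start unset, and it would be up to a scheduler to decide when the lease actually runs.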

My work uses virtualization technologies to implement leases. Virtual machines (VMs) are promising since they can (1) allow physical hardware to be carved up between multiple leases, which will be isolated from each other, (2) have enforceable resource allocations, (3) support custom software environments, and (4) be suspended, potentially migrated, and resumed. However, additional challenges arise when using VMs, mostly stemming from the different types of overhead involved in using them, such as the overhead of managing and deploying a large number of potentially large VM images. Part of my work involves analyzing and quantifying the tradeoffs of using VMs, and taking them into account in scheduling algorithms so they will have a minimal impact on performance.
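To make the image deployment overhead concrete, here is a back-of-the-envelope Python sketch. It assumes a deliberately simplified model in which the only overhead is copying one VM image over a link with constant bandwidth; the function name and parameters are hypothetical, and real schedulers would have to account for much more (contention, caching, suspend/resume costs, etc.).

```python
from datetime import datetime, timedelta


def latest_transfer_start(lease_start: datetime,
                          image_size_mb: float,
                          bandwidth_mb_per_s: float) -> datetime:
    """Latest time an image transfer can begin and still finish by lease_start.

    Simplifying assumption: transfer time = image size / bandwidth.
    """
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    transfer_time = timedelta(seconds=image_size_mb / bandwidth_mb_per_s)
    return lease_start - transfer_time
```

For example, under this model a 4000 MB image on a 100 MB/s link takes 40 seconds to transfer, so a lease that must start at 13:00:00 needs its image transfer to begin by 12:59:20 at the latest.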

For more details, please see my publications.

Current projects

Haizea
Haizea is an open-source VM-based lease management architecture (if that sounds like a mouthful, take a look at the What is Haizea? page). In a nutshell, Haizea is a piece of software that, in combination with the OpenNebula virtual infrastructure manager, can be used to manage a Xen or KVM cluster, allowing you to deploy different types of leases that are instantiated as virtual machines (VMs). Haizea can also be run in simulation, providing a platform for experimenting with scheduling algorithms that depend on VM deployment or on the leasing abstraction. I am the lead developer for the Haizea project.

Reservoir (EU FP7 project)
In 2008 I did a summer internship for Reservoir, a project that "will enable massive scale deployment and management of complex IT services across different administrative domains, IT platforms and geographies". As part of my internship, I worked with the Distributed Systems Architecture group at the Universidad Complutense de Madrid. My work revolved mainly around the Haizea project mentioned above. Although I am not currently employed by Reservoir, my work is still done in close collaboration with members of the Reservoir project.

Publications and Talks

Please see the publications page.

Workshops and tutorials I've taught

Teaching in the American Classroom, a panel discussion (I was one of three panelists in this session) in the 2007 Workshop on Teaching in the College, University of Chicago. September 18 and 19, 2007.

Entornos Grid Basados en Globus Toolkit 4. July 4-6, 2007. Universidad Complutense de Madrid (Madrid, Spain). 15-hour course on GT4 service programming with the Introduce IDE. This course is a part of Curso Superior de Administración, Explotación y Programación de Sistemas Grid (3ª Edición), a 100-hour summer course on Grid Computing.

Computación Grid. June 18-29, 2007. Universidad de los Andes (Bogotá, Colombia).

The FileBuy Globus Based Resource Brokering System - A Practical Example. September 15, 2006. GlobusWORLD 2006, Washington D.C. (USA). [website]

Entornos Grid Basados en Globus Toolkit 4. July 3-7, 2006. Universidad Complutense de Madrid (Madrid, Spain). 20-hour course on GT4 programming. This course is a part of Curso Superior de Administración, Explotación y Programación de Sistemas Grid (2ª Edición), a 100-hour summer course on Grid Computing.

Entornos Grid Basados en Globus Toolkit 4. July 6-12, 2005. Universidad Complutense de Madrid (Madrid, Spain). 25-hour course on GT4 programming. This course is a part of Curso Superior de Administración, Explotación y Programación de Sistemas Grid, a 100-hour summer course on Grid Computing.

Evolución de Globus. June 23, 2004. Instituto de Física de Cantabria (Santander, Spain). 2-hour presentation on the evolution and future trends of the Globus Toolkit. This presentation was a part of Grids y e-Ciencia, a 30-hour postgraduate course on Grid Computing. [slides]

Sistemas Grid Basados en GT3. March 3-5, 2004. Centro de Supercomputación de Galicia (Santiago de Compostela, Spain). 15-hour course on GT3 programming. [slides 1 2 3 4]

Curriculum Vitae

[Download CV] Please note that I do not update the CV as frequently as this website (CV last updated: 03/12/08). Although an updated CV is available upon request, please take into account that I am currently not accepting employment solicitations.

Past projects

Workspace Service
I was part of the group that develops the Workspace Service component of the Globus Toolkit 4. In particular, I was involved in VM-based "virtual workspaces". To quote the Workspace Service website: "A virtual workspace is an abstraction of an execution environment that can be made dynamically available to authorized clients by using well-defined protocols. The abstraction captures resource quota assigned to such execution environment on deployment (such as CPU or memory share) as well as software configuration aspects of the environment (such as operating system installation or provided services). The Workspace Service allows a Grid client to dynamically deploy and manage workspaces."

CrossGrid
I was a Research Associate for a short while (with affiliation to the Instituto de Física de Cantabria in Santander, Spain). My main job was writing and proofreading tutorial material.

BOOLE-DEUSTO
BOOLE-DEUSTO is a software aid for Digital Electronics courses. It helps and guides the student through typical exercises: minimization of boolean functions, Veitch-Karnaugh maps, design and simulation of finite-state machines, circuit diagrams (combinational and sequential), etc. I was involved in this project from 2000 to 2004 as Lead Programmer. The project was developed at the University of Deusto and led by Professor Javier García Zubía.

Other interests

I am also very interested in the following, although they're not part of my research:

Miscellaneous

Here's some miscellaneous stuff which you might find amusing (or not) and which might reveal a little bit more about me (or not).

My geek code

-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/E/ED/IT d-(--) s:-(:) a- C++++$ UL++$>++++ P+ L+++$ !E- W+++
N(+) !o K? w--- !O !M !V PS++ PE- Y PGP t(+) !5 !X !R tv(+) b+(++)$ 
DI(+) D+ G+ e+++>++++ h r y?
------END GEEK CODE BLOCK------

Decode geek code.

My favorite blogs

WIL WHEATON dot NET. Great insight into the life of a struggling actor and (more recently) a successful author.

Kirai.NET - Un geek en Japón [in Spanish]. Extremely interesting blog which showcases the cultural differences a European finds while living in Japan.

Planet e-GHOST [in Spanish]. Blogs written by members of e-GHOST (see below).

Slashdot. News for nerds. Stuff that matters.

Barrapunto [in Spanish]. The Spanish Slashdot.

The Leaky Cauldron. A must-read for all muggles.

You can also visit my Bloglines public profile to see all the blogs I'm subscribed to.

Sites I visit practically every day

CNN.com (International edition). To keep informed of what's happening around the world.

El Correo Digital. To keep informed of what's happening back at home.

Daryl Cagle's Professional Cartoonist's Index. For my daily dose of satirical humor.

Bloglines. To keep track of all my favorite blogs.

Miscellaneous miscellaneous stuff

University of Deusto. Ah, yes, my alma mater :-) I studied Computer Engineering in Deusto (1998-2003) and was a junior faculty member in the Department of Software Engineering (2003-04) before coming to Chicago.

e-GHOST (ESIDE's GNU, High-tech, & Open Source Team). This is the University of Deusto's Free Software group (yes, I know, the acronym is confusing), whose mission is to keep the academic community informed of the benefits of Free Software. I was very active in the group for a couple of years (organizing events, summer courses, etc.) but since I am now a considerable distance from Deusto, I merely lurk around the mailing list.

Contact information

On the University of Chicago campus, I am located in Ryerson 257-C. Although I am usually in my office during regular business hours, the best way to contact me is through e-mail: borja AT cs DOT uchicago DOT edu. I compulsively check my e-mail several times a day and usually respond within 24-48 hours. If you want to drop by my office, please let me know in advance to make sure that I am available that date/time.