Falkon: a Fast and Light-weight tasK executiON framework
Falkon [2] is a software component that is a Globus Incubator project as of November 2007. Falkon is being actively developed in the Distributed Systems Laboratory of the Computer Science Department at the University of Chicago, under Ian Foster's guidance, with funding from DOE and NASA.
The initial idea and motivation behind Falkon emerged in 2005 with an astronomy application (an image stacking/co-adding service, aka AstroPortal) at its core. The application worked on a 10 TB image dataset, the Sloan Digital Sky Survey (SDSS) DR4/DR5, and involved retrieving thousands to tens of thousands of objects from this dataset, calibrating them, and finally stacking them to produce the final output. The application faced several challenges when run in traditional Grid Computing environments:
long queue times in batch-scheduled systems, typically longer than the job duration times
slow job dispatch rates: production local resource managers (LRMs) typically achieve dispatch rates on the order of 1 job/sec
poor scalability of shared file systems, as most large-scale systems are not balanced between storage resources and computational resources
After an initial prototype of the astronomy-specific application AstroPortal was complete, work began on generalizing the system so that other applications could benefit from a system addressing the three challenges listed above. In January 2007, the first prototype of Falkon v0 was complete, and testing began with other applications. To let a large pool of applications use Falkon transparently, Falkon coordinated its efforts with the Swift project (a parallel programming system) and created a Falkon Provider to be included with Swift. Applications from many domains (astronomy, medicine, chemistry, and economics) were tested and showed significant performance improvements. Furthermore, the project is extending well beyond traditional grids by being ported to the IBM BlueGene/P, which will be online in March 2008 at Argonne National Laboratory.
Many people have contributed (ideas, writing, code, etc.) to Falkon; my contributions have been leading the project in general, as well as the design and implementation of the core Falkon functionality, which enables the rapid and efficient execution of many independent jobs on large compute clusters. Falkon combines three techniques to achieve this goal: (1) multi-level scheduling, which separates resource provisioning from the dispatch of user tasks to those resources; (2) a streamlined task dispatcher able to achieve order-of-magnitude higher task dispatch rates than conventional schedulers; and (3) data caching combined with a data-aware scheduler, which leverages co-located computational and storage resources to minimize the use of shared storage infrastructure.
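To make technique (3) more concrete, here is a minimal Python sketch of data-aware dispatch. It is an illustration only and not Falkon's actual code or API: the class `DataAwareDispatcher`, its methods, and the worker and file names are all hypothetical. It also assumes each task reads a single input file and that a worker keeps every file it has fetched in its local cache.

```python
from collections import deque


class DataAwareDispatcher:
    """Toy dispatcher (hypothetical, not Falkon's API).

    It prefers an idle worker that already caches the task's input file,
    and falls back to shared storage only when no such worker is idle.
    """

    def __init__(self, workers):
        # worker name -> set of file names held in that worker's local cache
        self.cache = {w: set() for w in workers}
        # workers that are currently ready to accept a task
        self.idle = deque(workers)
        # number of times a task had to read from shared storage
        self.shared_reads = 0

    def dispatch(self, input_file):
        """Assign one task that reads input_file; return the chosen worker."""
        if not self.idle:
            raise RuntimeError("no idle workers")
        # First choice: an idle worker whose cache already holds the input.
        for w in self.idle:
            if input_file in self.cache[w]:
                self.idle.remove(w)
                return w
        # Fallback: the next idle worker reads from shared storage
        # and caches the file for later tasks.
        w = self.idle.popleft()
        self.shared_reads += 1
        self.cache[w].add(input_file)
        return w

    def release(self, worker):
        """Mark a worker as idle again once its task has finished."""
        self.idle.append(worker)
```

In this sketch, a second task on the same file is routed to the worker that already cached it, provided that worker is idle, so `shared_reads` grows only when a file is seen for the first time or its caching worker is busy. A real scheduler would also have to handle cache eviction, multiple inputs per task, and the trade-off between waiting for a busy worker and reading from shared storage.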
The latest stable release of Falkon is v0.9.r11. For more detailed information about Falkon, and to download the code for free, please visit the Falkon Globus Incubator page at http://dev.globus.org/wiki/Incubator/Falkon.
Collaborators:
Presentations:
Papers:
Yong Zhao, Ioan Raicu, Ian Foster, Mihael Hategan, Veronika Nefedova, Mike Wilde. “Realizing Fast, Scalable and Reliable Scientific Computations in Grid Environments”, to appear as a book chapter in Grid Computing Research Progress, ISBN: 978-1-60456-404-4, Nova Publisher 2008.
Ioan Raicu. “Harnessing Grid Resources with Data-Centric Task Farms”, University of Chicago, Computer Science Department, PhD Proposal, December 2007, Chicago, Illinois.
Ioan Raicu, Yong Zhao, Catalin Dumitrescu, Ian Foster and Mike Wilde. “Falkon: A Proposal for Project Globus Incubation”, Globus Incubation Management Project, 2007 – Proposal accepted 11/10/07.
Ioan Raicu, Ian Foster. “Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets: Year 1 Status and Year 2 Proposal”, NASA GSRP Year 1 Progress Report and Year 2 Proposal, Ames Research Center, NASA, February 2007 -- Award funded 10/1/07 - 9/30/08.
Ioan Raicu, Yong Zhao, Ian Foster, Alex Szalay. “A Data Diffusion Approach to Large Scale Scientific Exploration”, to appear in the Microsoft Research eScience Workshop 2007.
Ioan Raicu, Yong Zhao, Catalin Dumitrescu, Ian Foster, Mike Wilde. “Falkon: a Fast and Light-weight tasK executiON framework”, to appear at IEEE/ACM SuperComputing 2007.
Ioan Raicu, Catalin Dumitrescu, Ian Foster. “Dynamic Resource Provisioning in Grid Environments”, TeraGrid Conference 2007.
Yong Zhao, Mihael Hategan, Ben Clifford, Ian Foster, Gregor von Laszewski, Ioan Raicu, Tiberiu Stef-Praun, Mike Wilde. “Swift: Fast, Reliable, Loosely Coupled Parallel Computation”, IEEE Workshop on Scientific Workflows 2007.
Ioan Raicu, Ian Foster. “Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets”, NASA GSRP Proposal, Ames Research Center, NASA, February 2006 -- Award funded 10/1/06 - 9/30/07.
Ioan Raicu, Ian Foster, Alex Szalay. “Harnessing Grid Resources to Enable the Dynamic Analysis of Large Astronomy Datasets”, poster presentation, IEEE/ACM SuperComputing 2006.
Ioan Raicu, Ian Foster, Alex Szalay, Gabriela Turcu. “AstroPortal: A Science Gateway for Large-scale Astronomy Data Analysis”, TeraGrid Conference 2006, June 2006.
Alex Szalay, Julian Bunn, Jim Gray, Ian Foster, Ioan Raicu. “The Importance of Data Locality in Distributed Computing Applications”, NSF Workflow Workshop 2006.
Relevant Links to related projects or sub-projects of Falkon: