The Blockus project is creating new understanding and
technologies to enable large-scale graph computations. Efforts
include, understanding the behavior of graph algorithms and
graph processing systems, scaling out and vertically (to use
SSD's and other novel non-volatile memories), and exploring
graph computations on with realistic application data usage
Understanding the behavior of graph computations and algorithms - how diverse as a function of algorithm and graph? how can we systematically explore the performance space?
Vertical scalability using new storage-class memories -- can we achieve intelligent storage-hierarchy management to achieve dramatically better
cost-performance and programming flexibility for big data computations
Exploring realistic data sharing and privacy contraints in graph computations -- can we maintain good scalability and performance in the face of such constraints?
We are currently working with GraphX and Graphlab systems.
Our early work was with researchers at HP Labs, who have
created an extension of the R programming system, called
Presto, that supports scale-out parallelism in an extended R
language. In Presto, users express computation on (possibly
sparse) matrix partitions, and the system takes care of the
distribution of data and computation. The work at Chicago
focuses on building on the Presto programming model and
engine, enhancing it to scale vertically: that is creating a
cost-effective, flexible and easy to program system that
handles big data using secondary storage.
Fan Yang and Andrew A. Chien, "
Understanding Graph Computation Behavior to Enable Robust Benchmarking
", in IEEE Conference on High Performance Distributed Computing Conference (HPDC), June 2015.
Systematic Understanding Graph Computation Behavior to Enable Robust Benchmarking,
Master's Thesis, Department of Computer Science, University of Chicago, April 2015.
Fan Yang, T. Hu, C. Imes, R. Suminto, R. Tchoua, H. Zhang, Z. Zhou, and A. Chien. ”Horizontal and Vertical Scalability in Large-scale Graph Processing Systems”, Technical Report, June 2014
- Erik Bodzsar MS Thesis,
Limitations of Data Reuse in Streaming Iterative Algorithms,
S. Venkataraman, E. Bodzsar, I. Roy, A. AuYoung, and R. Schreiber,
Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices,
Eurosys '13, April 2013; and more HP Labs
People: Fan Yang,
Andrew A. Chien
(UChicago) Alumni: Erik Bodzsar, Indrajit Roy, Rob Schreiber, Partha Ranganathan (HP Labs)
We gratefully acknowledge support from Hewlett-Packard for the Blockus project.