Title: Scientific Computing in Python
Speaker: Andy R. Terrel
Date: 9 Mar 2009
Abstract: Python is slow, consistently quoted 10-20 times slower than C, so why do so many people use it for scientific computing? A trend that has been steadily increasing across the fields is using Python for everything from building prototypes, running large parallel jobs, to even replacing industry proven standards like Matlab. I will take you through a whirlwind tour of the possibilities in Python for scientific computing.
So-called "network science" has seen a tremendous surge in the past decade, mostly due to the increase in available datasets (from food webs and genetic networks to the web, the internet, and social networks) and the increase in computing power that makes it possible to analyze those datasets. Network motifs -- subgraphs that occur more frequently than would be expected based on a randomized model -- are supposedly the building blocks of these networks, in the way that, say, NAND gates or ALUs are the building blocks of microprocessors. I will talk about whether or not network motifs truly are the building blocks of real-world networks, and how we might be able to tell. Along the way, I'll discuss some of the similarities and differences between network science and classical graph theory that make network science so exciting, and I'll tell a story of how forcing a theorist (me) to actually implement an algorithm led to the discovery of a better algorithm and a new property of real-world networks.
Title: Modeling and Testing in PLT Redex
Speaker: Casey Klein
Date: 12 January 2009
Term-rewriting systems transform an input term by successively rewriting it according to a set of directed equations. Such systems have applications in automated theorem provers, computer algebra systems, and the semantics of programming languages. This talk introduces term-rewriting by building a model of the card game "war" in PLT Redex, a domain-specific language for specifying context-sensitive rewriting systems, highlighting Redex's support for visualization and randomized testing.
Title: An Introduction to Programming on GPUs
Speaker: Lars Bergstrom
Date: 1 December 2008
Graphics Processing Units (GPUs) are processors attached to video cards that are optimized for doing parallel floating-point computation. With the release of packages like NVidia's CUDA, it is possible to write C-like code that shifts some of your math-intensive code onto the video card to gain significant speedups. This talk will provide an overview of the programming model, a walkthrough of some sample code showing how to take advantage of the hardware, and anecdotal data on what types of problems will and will not benefit from migration to the GPU.
Cloud Computing has become another buzzword after Web 2.0. However, there are dozens of different definitions for Cloud Computing and there seems to be no consensus on what a Cloud is. On the other hand, Cloud Computing is not a completely new concept; it has intricate connection to the relatively new but thirteen-year established Grid Computing paradigm, and other relevant technologies such as utility computing, cluster computing, and distributed systems in general. This talk strives to compare and contrast Cloud Computing with Grid Computing from various angles and give insights into the essential characteristics of both.
Title: Object Recognition with Deformable Part Models
Speaker: Ross Girshick
Date: 27 October 2008
Object recognition is the task of locating and labeling object classes (e.g., bicycle or person) in images. It remains as one of the most challenging problems in computer vision. While the problem is far from solved, in recent years deformable part models have achieved excellent results on challenging data sets. This talk provides an introduction to deformable part models by first introducing object recognition with a rigid template. We will discuss a broad range of issues including how to represent image data, templates as linear classifiers, composing templates into part models, efficient object detection when the part model's graph is a tree, and finally learning deformable part models with the Latent SVM algorithm. We will conclude with experimental results from the 2008 PASCAL VOC Challenge.
Abstract: ML is an excellent language for writing concise, high-level programs. As a case study, we will consider the design and implementation of an abstract game-playing system such that the specification of the game is kept entirely separate from that of the decision strategy; thus games and strategies are able to be mixed and matched freely in particular programs. We will incidentally see how an ML program can match a mathematical specification of an algorithm very closely.
Time permitting, we will also consider some combinatorics problems whose solutions are nicely expressed in ML, in the domains of music theory and/or the card game Set.
This seminar will include live demonstrations of building software with ML. The intended audience is students and enthusiasts of computer science who have heard of, say, ML, algebraic datatypes, module systems, Hindley-Milner typechecking, functional programming, and so on, but have little or no direct experience reaping the multifarious benefits of these technologies.
Abstract: In theoretical computer science, issues of encoding are usually swept under the rug. Do you write your number in unary, binary, or ternary? Well, as long as we stay away from unary we don't care. Do you store your graphs as adjacency matrices, or as edge lists? Well, it won't affect whether or not a problem has a polynomial-time solution, so we don't really care. But if one computational problem is just a re-encoding of another problem, then they're really the same problem. So maybe we should care a little bit.
The ubiquity of NP-complete problems is well-known to anyone who's taken an introductory course in theoretical computer science. By now over 3000 problems have been proved NP-complete. But what they don't tell you -- sometimes even in upper level theory courses -- is an observation made in 1976, just a few years after NP-completeness was first defined, that all known NP-complete problems are just re-encodings of one another. They're not just reducible to one another, they are the same problem.
In this talk, I'll discuss a formal notion of re-encoding, give a hint of why the above statement is true, and discuss what this potentially deep observation might mean.
Title: Expander Graphs (A Gentle Introduction)
Speaker: Eric Purdy
Date: 28 Apr 2008
Abstract: Expander graphs are one of the most powerful general tools in theoretical computer science. In this talk, we will sketch several roughly equivalent definitions of expansion, and try to give some intuition for why they are so important. Expander graphs are "quasi-random"--in many ways, they "look random". They can be used to simulate randomness, allowing us to decrease the amount of randomness we use in a randomized algorithm. We will also see a simple example of an infinite family of expander graphs, and discuss their connection to random walks.
Title: On Interviews
Speaker: Lars Bergstrom
Date: 15 Apr 2008
Abstract: Having interviewed several hundred job candidates from ages 18 to 60, mid-college to Ph.D. and heavily published researchers, I've seen just about every terrible mistake that keeps an otherwise potentially qualified candidate from making a good impression. Come learn about what interviewers and companies are really doing and looking for during an interview, how to avoid "instant no-hire" actions, and what you should do to prepare. Shocking interviewee anecdotes guaranteed!
Title: A Gentle Introduction to Grid Computing
Speaker: Borja Sotomayor
Date: 18 Feb 2008
Abstract: Grid Computing enables computational resources (clusters, supercomputers, workstation pools, storage, etc.) to be shared across administrative domains, allowing researchers from a variety of fields (such as high energy physics, bioinformatics, climate research, astrophysics, etc.) to easily tap into the resources of multiple sites. Many research projects benefit from using grids, from large-scale projects like CERN's Large Hadron Collider to a variety of smaller projects served by general-purpose grids such as the Open Science Grid.
This talk provides a gentle introduction to Grid Computing, and does not require a background in computer science (although some parts will be slightly technical).
Abstract: We will show how to assign small (log n bits long) weights to the edges of a bipartite planar graph so that the minimum weight perfect matching is unique. The procedure to assign the weights works in deterministic Logspace whereas a randomized weighing scheme is known for long time for arbitrary bipartite graphs (Mulmuley, Vazirani and Vazirani, isolation lemma). As a consequence we improve the parallel complexity of bipartite planar matching. Deterministic parallel complexity of constructing a perfect matching in (not necessarily bipartite) planar graph is still open.
Joint work with Samir Datta and Sambuddha Roy.
Abstract: The final stage of solving a partial differential equation is almost always solving some system of linear equations. Some of the most popular approaches to solving linear systems are iterative; they converge to an answer, usually within some tolerance, after a number of steps. The class of iterative methods known as relaxation methods are very simple to implement and fairly robust, but have convergence rates highly dependent upon the eigenspectrum of the system being solved. Multigrid methods are a class of iterative solvers where coarser versions of the linear system are used to decimate lower frequency parts of the error which decay very slowly if the whole system is considered. The allure of these methods is that under the right conditions they take only O(n) work to solve a system of n unknowns, making it possible to solve very large problems very quickly. Several classes of multigrid methods and their applications will be briefly discussed.
Title: Compiling Manticore
Speaker: Adam Shaw
Date: 7 Jan 2007
Abstract: This talk concerns the implementation of Manticore's parallel constructs. Data parallel programs are compiled into a language that can be thought of as core ML plus the primitives future, touch, and cancel. We use ropes as an important internal representation and a good fit for parallel computations. We will also discuss the mixture of exceptions and exception handlers with data parallelism, a novel feature of the Manticore design.
Abstract: Simulation has become a cornerstone of scientific advancement in all fields, but often large scale projects use limited methods and techniques due to choices of data management at the beginning of the project. Furthermore, small projects do not allow for a robust testing of new methods because of the small problem sizes and scopes. By automating the writing of simulations we are able to take advantage of new methods that would otherwise take too long to develop in a large scale project and small projects would be able to generate large problems very easily. I present several mathematical interfaces that are being studied here in Chicago and applied to partial differential equation solvers using finite element methods.
Title: Tracking Hundreds of Nano-objects
Speaker: Leandro Cortes
Date: 5 Nov 2007
Abstract: In this talk I will present an algorithm to analyze digital image sequences made by total internal reflection fluorescence microscopy (TIRFM). The problem is to track tens or hundreds of vesicles moving near the surface membrane of a cell and identify the time and place of exocytosis.
I will describe the physics and the biology behind the problem. Then, I will show a statistical model for the video data that allows us to properly weight various hypotheses online. Finally, I will describe the algorithm: a sequence of coarse-to-fine tests, derived from the statistical model, that are used to detect and track the objects of interest.
Title: Multi-Unit Auctions with Unknown Supplies
Speaker: Sourav Chakraborty
Date: 29 Oct 2007
Abstract: Mahdian and Saberi studied the problem of multi-unit auctions for perishable goods, in a setting where supplies arrive in an online fashion. This problem is motivated by an application in advertisement auctions on the internet. They gave a $1/4$-competitive algorithm for the problem with single-price auctions and truthful bidding.
We have got a better algorithm. In the talk I will introduce the problem and state the results.
Title: Metaprogramming With Traits
Speaker: Aaron Turun (undergraduate)
Date: 21 May 2007
Abstract: One hope of software engineering is that by organizing a large program well, it can be written as if it were a collection of small programs. In object-oriented languages, which dominate industrial application programming, classes are the main unit of organization. But it is increasingly recognized that classes are not enough: it would often be helpful to write code that applies across, through, between, within, or outside of classes. The result is a thriving ecosystem of language constructs, including mixins, traits, aspects, and many others. I will briefly review some of these constructs, and the problems they solve. Then I will present my own research, a simple extension to traits that costs little but buys much. To demonstrate the approach, I will show how to, with a minimal amount of fairy dust, shrink a class from several pages to a few lines of code. Finally, for the PL nerds, I will (very briefly) discuss the mixture of nominal and structural subtyping needed to support the proposal in the context of a language like Java.
Title: Testing Graph Isomorphism
Speaker: Sourav Chakraborty
Date: 14 May 2007
Abstract: Graph Isomorphism is one of the most well-studied problems in computer science. In general we need lots of ``resources" for graph isomorphism. But what happens if we have the guarantee that either the two graphs are ``close" to isomorphic or ``far" from isomorphic. I will try to study this problem under various models of computation, mainly query complexity and communication complexity.
Abstract: To enable the rapid execution of many tasks on compute clusters, we have developed Falkon, a Fast and Light-weight tasK executiON framework. Falkon uses (1) multi-level scheduling to separate resource acquisition (via, e.g., requests to batch schedulers) from task dispatch, and (2) a streamlined dispatcher. Multi-level scheduling, introduced in operating systems research in the 1990s, has been applied to clusters by the Condor team and others, while streamlined dispatchers are found in volunteer computing systems. Falkon's integration of the techniques delivers performance not provided by any other system. We describe Falkon architecture and implementation, and present performance results for both microbenchmarks and applications. Microbenchmarks show that Falkon throughput and scalability are one to two orders of magnitude better than other schedulers. Applications (executed by the Swift parallel programming system) reduce end-to-end run time of up to 90% for large-scale astronomy and medical applications, relative to versions that execute tasks via separate scheduler submissions.
Title: Online Decision Making
Speaker: Varsha Dani
Date: 2 Apr 2007
Abstract: Everyday life is full of decisions: Should you dress for sunny weather or rain? Drive to work, or take public transport (or walk)? If you drive should you take the freeway or local roads? Or the freeway part of the way, and local roads the rest, and if so which part? Will the stock market go up (so that you should invest) or down? Chinese, Indian, or Italian food for dinner? Or pizza? Should you attend this talk, or skip it because it will be boring? Many such decisions are recurrent: the same choices are available every day, but which decision is "correct" can vary from day to day. How can you learn from past mistakes to make the right decisions in the future?
In this talk, I will present a model for such online decision-making problems and various algorithms to tackle them. I will also discuss what it means for such algorithms to perform well.
Title: Finite Elements Revisited
Speaker: Andy R. Terrel
Date: 12 Mar 2007
Abstract: Finite Element Analysis is often presented in a continuous mathematical framework with derivations using functional analysis. This sort of analysis does not address major implementation issues and is often left aside by many treatments. We will explain finite elements for the non expert and then motivate several challenges to their implementations. Finally we borrow some ideas from topology to discuss the management of the simulation for the entire method. Hopefully this talk will be a gentle overview and motivate several open areas accessible to a computer scientist.
Title: Reversing Sink Source Pairs in Multiple Unicast Network Coding
Abstract: Network coding has been proposed as a technique to increase the
throughput of a network. With network coding nodes are allowed to send
functions of their inputs instead of forewarding. Interest in network
coding started when it was shown that network coding matches the
min-cut lower bound in the single source setting[1]. Multiple unicast
means there are sink source pairs, and each sink requests the
information of exactly one source. It is conjectured that Network
Coding does not have an advantage for multiple unicast over undirected
graphs[2]. In this talk I will show the proof of a weaker statement
than the conjecture: for linear network coding, we can reverse a sink
source pair and maintain the same capacity.
This is joint work with Michael Langberg (The Open University of Israel).
References:
Title: Learning from Labeled and Unlabeled Examples
Abstract: Consider the task of getting a computer to automatically classify web
documents into several categories: news, sports, finance, politics,
research etc. Traditional machine learning algorithms construct such
classifiers from a set of manually labeled training examples. Labeling
is slow, costly and often impractical due to the sheer amounts of data
available. Semi-supervised learning is a new paradigm where
classifiers are constructed by utilizing large amounts of unlabeled
data together with a small set of labeled examples. In this talk, I
will present two classes of algorithms: Manifold Regularization and
Low-density Separation, which extend classical kernel methods (such as
Support Vector Machines) for semi-supervised learning.
Speaker: Paolo Codenotti
Date: 26 Feb 2007
[1]: R. Alshwede, N. Cai, S.-Y.R. Li and R. W. Yeung, "Network
information flow,"
IEEE Trans. Info. Thy, vol. 46, pp. 1204-1216, Feb., 2000.
[2]: Z. Li, B. Li, "Network Coding in Undirected Networks," in Proc. of the
38th Annual Conference on Information Sciences and Systems (CISS), 2004.
Speaker: Vikas Sindhwani
Date: 12 Feb 2007