Pramod Kaushik Mudrakarta

Machine Learning Ph.D. Student
The University of Chicago

Email me at

CV Publications


Hello there! I am a Ph.D. student in Machine Learning at the University of Chicago advised by the excellent Prof. Risi Kondor. I also work closely with Ankur Taly and Mukund Sundararajan from Google. My interests in research currently revolve around two distinct but related topics in modern Machine Learning
  • demystifying the black-box nature of Deep Learning models and
  • exploiting hierarchical structure in large data
During the course of my Ph.D. I have had the fortune of doing research internships at Amazon and Google Brain.

I earned my M.Sc. in Computer Science from Saarland University, Germany where my thesis was on designing an efficient algorithm for k-way balanced graph cuts (in other words: multi-class clustering). Before that, I worked as a Research Assistant and Scientist at the Max-Planck Institute in Tübingen, Germany in Dr. Gunnar Rätsch's bioinformatics lab.

I went to IIT Bombay, India for my undergraduate studies in Computer Science.


Analyzing deep neural networks

Hand-authored systems are being increasingly replaced with deep learning systems. While deep neural networks have excellent performance, there is still a large lack of understanding of how they work. Verification and debugging of deep learning models is necessary when deploying these networks in critical applications.
   The goal of this research is to develop techniques for analyzing deep neural networks. In particular, we are developing methods to (a) characterize the input-output behavior of deep learning models (b) craft adversarial examples by exploiting weaknesses in a neural network, (c) strategize improvements and enrich training data, (d) extract rules which may be useful augmentations to existing hand-authored systems, (e) develop debugging and analyzing tools which aid the developer in neural network design, and (f) compress neural networks
(Finer-level details omitted as a couple of papers are currently under review)

This work began during my summer internship at Google Brain.

Collaborators: Ankur Taly, Mukund Sundararajan, Kedar Dhamdhere and Kevin McCurley

Relevant publications:

Parallel multiresolution matrix factorization (pMMF) Project page

The massive size of the datasets occurring in modern machine learning problems necessitates the use of data compression and parallel computation. Data such as those arising from social networks often contain complex hierarchical structure. pMMF is a high performance parallel C++ library that computes a matrix factorization which captures structure at various scales. This can in turn be used for matrix compression and fast determinant and inverse computation, as a preprocessing step for machine learning algorithms and linear solvers. Independently, this method can also be used for constructing a wavelet basis for sparse approximation.

Collaborators: Nedelina Teneva and Risi Kondor, University of Chicago, USA

Relevant publications:

Balanced k-way graph clustering:

Spectral Clustering is a graph partitioning algorithm, based on a continuous relaxation of the NP-hard combinatorial ratio/normalized cut problem. Existing techniques for partitioning a graph into k clusters are either heuristics or recursive schemes based on spectral clustering. These methods fail when applied on large datasets in the real world. We propose an algorithm which directly minimizes the ratio/normalized cut and leads to a k-way partitioning of the data. We also proposed an algorithm for the difficult sum-of-ratios problem which is of independent interest in the optimization community. We performed an extensive experimental evaluation, including on large real world datasets, illustrating that our method indeed outperforms all existing k-clustering algorithms.

Collaborators: Syama Sundar Rangapuram and Matthias Hein, Saarland University, Germany

Relevant publications:


Google Scholar profile

Did the model understand the question?
ACL 2018
Pramod Kaushik Mudrakarta, Ankur Taly, Mukund Sundararajan, Kedar Dhamdhere
PDF Abstract

We analyze state-of-the-art deep learning models for three tasks: question answering on (1) images, (2) tables, and (3) passages of text. Using the notion of attribution (word importance), we find that these deep networks often ignore important question terms. Leveraging such behavior, we perturb questions to craft a variety of adversarial examples. Our strongest attacks drop the accuracy of a visual question answering model from 61.1% to 19%, and that of a tabular question answering model from 33.5% to 3.3%. Additionally, we show how attributions can strengthen attacks proposed by Jia and Liang (2017) on paragraph comprehension models. Our results demonstrate that attributions can augment standard measures of accuracy and empower investigation of model performance. When a model is accurate but for the wrong reasons, attributions can surface erroneous logic in the model that indicates inadequacies in the test data.

It was the training data pruning too!
arXiv preprint 2018
Pramod Kaushik Mudrakarta, Ankur Taly, Mukund Sundararajan, and Kedar Dhamdhere
PDF Abstract

We study the current best model (KDG) for question answering on tabular data evaluated over the WikiTableQuestions dataset. Previous ablation studies performed against this model attributed the model's performance to certain aspects of its architecture. In this paper, we find that the model's performance also crucially depends on a certain pruning of the data used to train the model. Disabling the pruning step drops the accuracy of the model from 43.3% to 36.3%. The large impact on the performance of the KDG model suggests that the pruning may be a useful pre-processing step in training other semantic parsers as well.

A generic multiresolution preconditioner for sparse symmetric systems
arXiv preprint 2017
Pramod Kaushik Mudrakarta and Risi Kondor
PDF Abstract

We introduce a new general purpose multiresolution preconditioner for symmetric linear systems. Most existing multiresolution preconditioners use some standard wavelet basis that relies on knowledge of the geometry of the underlying domain. In constrast, based on the recently proposed Multiresolution Matrix Factorization (MMF) algorithm, we construct a preconditioner that discovers a custom wavelet basis adapted to the given linear system without making any geometric assumptions. Some advantages of the new approach are fast preconditioner-vector products, invariance to the ordering of the rows/columns, and the ability to handle systems of any size. Numerical experiments on finite difference discretizations of model PDEs and off-the-shelf matrices illustrate the effectiveness of the MMF preconditioner.

Multiresolution matrix compression
AISTATS 2016 Notable student paper award (given to top 3 papers)
Nedelina Teneva, Pramod Kaushik Mudrakarta and Risi Kondor
PDF Abstract

Multiresolution Matrix Factorization (MMF) is a recently introduced method for finding multiscale structure and defining wavelets on graphs and matrices. MMF can also be used for matrix compression (sketching). However, the original MMF algorithm of (Kondor et al., 2014) scales with n3 or worse (where n is the number of rows/columns in the matrix to be factorized) making it infeasible for large scale problems. In this paper we describe pMMF, a fast parallel MMF algorithm, which can scale to n in the range of millions. Our experimental results show that when used for matrix compression, pMMF often achieves much lower error than competing algorithms (especially on network data), yet for sparse matrices its running time scales close to linearly with n.

Parallel MMF: a multiresolution approach to matrix computation
arXiv preprint 2015
Risi Kondor, Nedelina Teneva and Pramod Kaushik Mudrakarta
PDF Abstract

Multiresolution Matrix Factorization (MMF) was recently introduced as a method for finding multiscale structure and defining wavelets on graphs/matrices. In this paper we derive pMMF, a parallel algorithm for computing the MMF factorization. Empirically, the running time of pMMF scales linearly in the dimension for sparse matrices. We argue that this makes pMMF a valuable new computational primitive in its own right, and present experiments on using pMMF for two distinct purposes: compressing matrices and preconditioning large sparse linear systems.

Tight continuous relaxation of the balanced k-cut problem
NIPS 2014
Syama Sundar Rangapuram, Pramod Kaushik Mudrakarta and Matthias Hein
PDF Abstract

Spectral Clustering as a relaxation of the normalized/ratio cut has become one of the standard graph-based clustering methods. Existing methods for the computation of multiple clusters, corresponding to a balanced k-cut of the graph, are either based on greedy techniques or heuristics which have weak connection to the original motivation of minimizing the normalized cut. In this paper we propose a new tight continuous relaxation for any balanced k-cut problem and show that a related recently proposed relaxation is in most cases loose leading to poor performance in practice. For the optimization of our tight continuous relaxation we propose a new algorithm for the hard sum-of-ratios minimization problem which achieves monotonic descent. Extensive comparisons show that our method beats all existing approaches for ratio cut and other balanced k-cut criteria.

Oqtans: the RNA-seq workbench in the cloud for complete and reproducible quantitative transcriptome analysis
Bioinformatics 2014
Vipin T. Sreedharan, Sebastian J. Schultheiss, Géraldine Jean, André Kahles, Regina Bohnert, Philipp Drewe, Pramod Kaushik Mudrakarta, Nico Görnitz, Georg Zeller, Gunnar Rätsch
Link to journal page Abstract

We present Oqtans, an open-source workbench for quantitative transcriptome analysis, that is integrated in Galaxy. Its distinguishing features include customizable computational workflows and a modular pipeline architecture that facilitates comparative assessment of tool and data quality. Oqtans integrates an assortment of machine learning-powered tools into Galaxy, which show superior or equal performance to state-of-the-art tools. Implemented tools comprise a complete transcriptome analysis workflow: short-read alignment, transcript identification/quantification and differential expression analysis. Oqtans and Galaxy facilitate persistent storage, data exchange and documentation of intermediate results and analysis workflows. We illustrate how Oqtans aids the interpretation of data from different experiments in easy to understand use cases. Users can easily create their own workflows and extend Oqtans by integrating specific tools. Oqtans is available as (i) a cloud machine image with a demo instance at, (ii) a public Galaxy instance at, (iii) a git repository containing all installed software (; most of which is also available from (iv) the Galaxy Toolshed and (v) a share string to use along with Galaxy CloudMan.

mTim: rapid and accurate transcript reconstruction from RNA-seq data
arXiv preprint 2013
Georg Zeller, Nico Görnitz, André Kahles, Jonas Behr, Pramod Kaushik Mudrakarta, Sören Sonnenburg, Gunnar Rätsch
PDF Abstract

Recent advances in high-throughput cDNA sequencing (RNA-Seq) technology have revolutionized transcriptome studies. A major motivation for RNA-Seq is to map the structure of expressed transcripts at nucleotide resolution. With accurate computational tools for transcript reconstruction, this technology may also become useful for genome (re-)annotation, which has mostly relied on de novo gene finding where gene structures are primarily inferred from the genome sequence. We developed a machine-learning method, called mTim (margin-based transcript inference method) for transcript reconstruction from RNA-Seq read alignments that is based on discriminatively trained hidden Markov support vector machines. In addition to features derived from read alignments, it utilizes characteristic genomic sequences, e.g. around splice sites, to improve transcript predictions. mTim inferred transcripts that were highly accurate and relatively robust to alignment errors in comparison to those from Cufflinks, a widely used transcript assembly method.



GitHub LinkedIn Twitter Facebook