Dinoj Surendran

In December 2007, I graduated from the Department of Computer Science at the University of Chicago with a doctorate. My advisor was Gina-Anne Levow, and the other members of my committee were Partha Niyogi and Chilin Shih (UIUC).

I am now at Microsoft Research as part of the WorldWide Telescope team. (If that sounds unusual, bear in mind that my second life in grad school involved making pretty pictures for astronomers and other researchers.)

In my doctoral work (PDF of thesis), I used tools from machine learning and information theory to analyze and automatically recognize tones in Mandarin Chinese. I considered questions such as:

  • If you are tone-deaf, how much information would you lose in trying to understand Mandarin? (Answer: about the same as if all vowels sounded alike.)
  • Is voice quality helpful with tone recognition? (Not all measures are; the energy in various frequency bands does help with the recognition of neutral and low tones.)
  • Are strongly articulated syllables easier to recognize? (In news broadcast speech, yes, but not by much.)

I have become more interested in the applied machine learning aspect of my thesis work, particularly while working on NAFLA, a C++ library for fast, scalable multiclass classification with probability estimates.

Data used for PhD Thesis

The 9 Nov 07 issue of Science has an article by Adrian Cho on ultra-high-energy cosmic rays found by the Pierre Auger project. The article uses a picture made by COSMUS (myself, Mark SubbaRao, Randy Landsberg) and the Adler Planetarium Space Visualization Lab, using simulations from Argentinian physicist Sergio Sciutto. Cool, eh?

Start every day off with a smile and get it over with. - W.C. Fields

World Time Zones : Chicago Weather

I've worked on several projects, some of which are listed below.

Natural Language Processing

NAFLA : C++ library for fast multiclass classification with probability estimates. If you want easy-to-use code that can do a 5-class 27-dimensional problem with 120 000 training examples in three minutes, use this.

Mandarin tone recognition (Vodcast)

Prosody in dialog act classification

Functional Load : Determining how much use a language makes of its phonemic features. For example, if you cannot recognize tones in Mandarin Chinese, the effect is comparable to not being able to distinguish between vowels in English (or German or Dutch).
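As a rough illustration of the idea, here is a minimal Python sketch. It assumes one common formulation of functional load: the relative drop in the entropy of a word distribution when a contrast is merged. The toy lexicon, its frequencies, and the names `entropy`, `functional_load`, and `drop_tone` are all made up for this example; this is not the code or data used in the thesis.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy, in bits, of the distribution implied by a Counter of frequencies."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def functional_load(word_counts, merge):
    """Relative entropy loss when `merge` removes a contrast from every word.

    word_counts: Counter mapping each word to its frequency.
    merge: function mapping a word to its form with the contrast collapsed.
    """
    merged = Counter()
    for word, count in word_counts.items():
        merged[merge(word)] += count
    h_full = entropy(word_counts)
    h_merged = entropy(merged)
    return (h_full - h_merged) / h_full


# Hypothetical toy lexicon: (segments, tone) pairs with invented frequencies.
toy = Counter({("ma", 1): 4, ("ma", 3): 4, ("ba", 1): 4, ("ba", 4): 4})


def drop_tone(word):
    """Simulate a tone-deaf listener: every tone is mapped to the same value."""
    segments, _tone = word
    return (segments, 0)


fl_tone = functional_load(toy, drop_tone)
print(f"functional load of tone in the toy lexicon: {fl_tone:.2f}")
```

In this toy lexicon the four words are equally likely, so losing tone halves the entropy. Real estimates depend heavily on the corpus and on the unit (syllables, words, or longer sequences) over which entropy is computed.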


Often done in collaboration with members of the COSMUS group at the Adler Planetarium and Kavli Institute for Cosmological Physics.

Flypathmaker : Perl tools to create camera flypaths in Partiview to make regular, 3d Stereo, and full-dome (planetarium) movies.

Ndaona : Matlab package to produce Partiview models of classification and dimensionality reduction algorithms.

Air Shower Visualization : creating interactives and movies of cosmic ray showers simulated by astrophysicists, particularly of the Pierre Auger Observatory.

Seeing the Large Scale Structure of the Universe : visualizing hundreds of thousands of galaxies mapped by the Sloan Digital Sky Survey. (SkySkan Press Release, Google Video link)

Viz Demos created for other people

Clustering patterns of handwritten digits under various algorithms, for Machine Learning researchers to use at conferences and talks (e.g. Misha Belkin of Ohio State, Partha Niyogi of Chicago, Dimitris Achlioptas of Microsoft).

Star Observations : creating interactive visuals of star observations used by UCLA astronomers to find the Black Hole at the center of the Milky Way.

Planetary Russian Dolls : An interactive model of Planetary Scales to use in classrooms. Apparently quite popular with the middle school crowd.

Visualizing the effect of Laplacian Eigenmaps on a moving body; done for a postdoc friend at the Stanford Mechanical Engineering Department working on Markerless Motion Detection.

EarthGraph : Visualizing global computer networks. Done for Matei Ripeanu (now at U British Columbia) while he was a grad student working on PlanetLab data.


Setting up TikiWiki for the Wiki of the Chicago '05 Machine Learning Summer School

Using Kevin Murphy's CRFall toolkit for the case when all sequences are the same length.

Making a poster with LaTeX (a modification of Stephen Eglen's files to make it work for Chicago instead of Cambridge)

Various Partiview How-Tos, e.g. GeoWall usage, making interactive Globe demos, navigation, etc. I designed the Partiview web page too.

Things I believe

Documentation is important.

Cats are superior to dogs, and often to humans too.

Perl is rather cool.

Salads are immoral.

It is difficult to produce a visualization that is both useful and pretty.

Storyline will always be more important than the media in which the story is told.

I would rather make a poker player than play poker.