firstname at cs period uchicago period edu Sravana Reddy

I am a Ph.D. student in Computer Science at the University of Chicago. My research interests are primarily in the unsupervised learning of natural language structure, covering a variety of problems in NLP, speech, and linguistics. John Goldsmith is my advisor. I also work with Karen Livescu at TTI Chicago and Kevin Knight at ISI.

I will be graduating in the summer of 2012 and am looking for research postdocs or teaching positions. My application materials are available by e-mail.




Research Papers

Conference Publications

  • Sravana Reddy and Evandro Gouvêa. 2011.
    Learning from Mistakes: Expanding Pronunciation Lexicons Using Word Recognition Errors.
    In Proceedings of Interspeech.
    [paper] [abstract] [bib] [slides]

    Abstract

    We introduce the problem of learning pronunciations of out-of-vocabulary words from word recognition mistakes made by an automatic speech recognition (ASR) system. This question is especially relevant in cases where the ASR engine is a black box -- meaning that the only acoustic cues about the speech data come from the word recognition outputs. This paper presents an expectation maximization approach to inferring pronunciations from ASR word recognition hypotheses, which outperforms pronunciation estimates of a state of the art grapheme-to-phoneme system.

    .bib

    @inproceedings{Reddy:2011c,
      title = {Learning from Mistakes: Expanding Pronunciation Lexicons Using Word Recognition Errors},
      author = {Sravana Reddy and Evandro Gouv\^{e}a},
      booktitle = {Interspeech},
      year = {2011},
    }
    

  • Sravana Reddy and Kevin Knight. 2011.
    Unsupervised Discovery of Rhyme Schemes.
    In Proceedings of ACL.
    [paper] [abstract] [bib] [slides] [rhyming corpus] [code]

    Abstract

    This paper describes an unsupervised, language-independent model for finding rhyme schemes in poetry, using no prior knowledge about rhyme or pronunciation.

    .bib

    @inproceedings{Reddy:2011a,
      title = {Unsupervised Discovery of Rhyme Schemes},
      author = {Sravana Reddy and Kevin Knight},
      booktitle = {ACL},
      year = {2011},
    }
    

  • Sravana Reddy and Kevin Knight. 2011.
    What We Know About The Voynich Manuscript.
    In Proceedings of the ACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities.
    [paper] [abstract] [bib] [slides] [note]

    Abstract

    The Voynich Manuscript is an undeciphered document from medieval Europe. We present current knowledge about the manuscript's text through a series of questions about its linguistic properties.

    .bib

    @inproceedings{Reddy:2011b,
      title = {What We Know About The {V}oynich {M}anuscript},
      author = {Sravana Reddy and Kevin Knight},
      booktitle = {ACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities},
      year = {2011},
    }
    
    A note on the conclusions of our work

    We have presented a statistical overview of the text that covers a number of questions; however, this is obviously not comprehensive, and there are many more avenues to explore. One point that has been raised about this paper is that we claim the characters represent letters of an alphabet in some writing system. We make no such claim. We do focus our analysis around this possibility (since it's the most obvious one on a first look at the manuscript) and present some interpretations accordingly, but we are careful to highlight the statistics, which are plain truths, and remain agnostic about what they imply. In particular, we do not arrive at a conclusion about the nature of the writing system or the underlying text.

  • Sravana Reddy and John Goldsmith. 2010.
    An MDL-based Approach to Extracting Subword Units for Grapheme-to-Phoneme Conversion.
    In Proceedings of NAACL.
    [paper] [abstract] [bib]

    Abstract

    We address a key problem in grapheme-to-phoneme conversion: the ambiguity in mapping grapheme units to phonemes. Rather than using single letters and phonemes as units, we propose learning chunks, or subwords, to reduce ambiguity. This can be interpreted as learning a lexicon of subwords that has minimum description length. We implement an algorithm to build such a lexicon, as well as a simple decoder that uses these subwords.

    .bib

    @inproceedings{Reddy:2010,
      title = {An {MDL}-based Approach to Extracting Subword Units for Grapheme-to-Phoneme Conversion},
      author = {Sravana Reddy and John Goldsmith},
      booktitle = {NAACL},
      year = {2010},
    }
    

  • Sravana Reddy and Sonjia Waxmonsky. 2009.
    Substring-based Transliteration with Conditional Random Fields.
    In Proceedings of the ACL Named Entities Workshop (Shared Task).
    [paper] [abstract] [bib]

    Abstract

    Motivated by phrase-based translation research, we present a transliteration system where characters are grouped into substrings to be mapped atomically into the target language. We show how this substring representation can be incorporated into a Conditional Random Field model that uses local context and phonemic information. Our training and test data consists of three sets from the NEWS 2009 Machine Transliteration Shared Task (Li et al., 2009): English to Hindi, English to Kannada, and English to Tamil (Kumaran and Kellner, 2007).

    .bib

    @inproceedings{Reddy:2009b,
      title = {Substring-based Transliteration with Conditional Random Fields},
      author = {Sravana Reddy and Sonjia Waxmonsky},
      booktitle = {ACL Named Entities Workshop},
      year = {2009},
    }
    

  • Sravana Reddy. 2009.
    Understanding Eggcorns.
    In Proceedings of the NAACL Workshop on Computational Approaches to Linguistic Creativity.
    [paper] [abstract] [bib]

    Abstract

    An eggcorn is a type of linguistic error where a word is substituted with one that is semantically plausible – that is, the substitution is a semantic reanalysis of what may be a rare, archaic, or otherwise opaque term. We build a system that, given the original word and its eggcorn form, finds a semantic path between the two. Based on these paths, we derive a typology that reflects the different classes of semantic reinterpretation underlying eggcorns.

    .bib

    @inproceedings{Reddy:2009a,
      title = {Understanding Eggcorns},
      author = {Sravana Reddy},
      booktitle = {NAACL Workshop on Computational Approaches to Linguistic Creativity},
      year = {2009},
    }
    

Theses and Manuscripts

  • Learning Pronunciations from Unlabeled Data. [e-mail for copy]
    Dissertation Proposal, The University of Chicago, 2011.

  • Part of Speech Induction Using Non-negative Matrix Factorization. [e-mail for copy]
    Master's Thesis, The University of Chicago, 2009.

Refereed Presentations

  • Sravana Reddy and Gregory Crane. 2006.
    A Document Recognition System for Early Modern Latin. [e-mail for abstract]
    At Chicago Colloquium on Digital Humanities and Computer Science.

Software and Data

I try to make my research code available. At some point, I will clean everything up and release it. Meanwhile, here is what I have on GitHub. Contributions in the form of bug fixes or extensions are very welcome.

  • Rhyme scheme discovery
  • Poetry corpus manually annotated with rhyme schemes (in collaboration with Morgan Sonderegger)




Education and Experience

Education

Ph.D. in Computer Science. The University of Chicago, in progress.
M.S. (part of Ph.D.) in Computer Science. The University of Chicago, 2009.
- McCormick Fellowship
B.S. in Computer Science, Mathematics, Creative Writing. Brandeis University, 2006.
- Wien International Scholarship, Schiff Fellowship, Highest Honors

Internships

Information Sciences Institute, University of Southern California. Summer 2010 and Summer 2011.
Mitsubishi Electric Research Laboratories (MERL). Summer 2009.
Perseus Digital Library, Tufts University. Summer 2006.
The Robotics Institute, Carnegie Mellon University. Summer 2005.

Academic Service

Reviewer for ACL, EMNLP, ICASSP, NAACL, Speech Communication.
Committee on Student Issues and Concerns, Linguistic Society of America.
Member of ACL, LSA, ISCA.


Teaching Experience

Lab Instructor

Fundamentals of Programming (Fall 2011) [Lab Materials]
Intro to Computer Science (Fall 2009, Fall 2010)
Intro to Programming for the World Wide Web (Spring 2010) [Lab Materials]
Distributed Objects (Spring 2009)

Teaching Assistant

Artificial Intelligence (Winter 2012)
Computational Biology (Winter 2011)
Intro to Computer Science (Winter 2009, Winter 2010)
Computational Linguistics (Fall 2008)
Fundamentals of Programming (Winter 2008)
Intro to WWW Programming (Spring 2008)
Foundations of Software (Fall 2007)

Grader

Calculus (Fall 2004, Spring 2005)