Acoustic Confusion Matrices

My first graduate school project involved the use of confusion matrices from psycholinguistics experiments. Finding them proved much harder than I expected, and I would like to spare others the work I went through to get them.

Many people know only the Miller-Nicely studies, which is unfortunate: not because there's anything wrong with them, but because there is so much other material out there.

On this site you will also find some of the actual confusion matrices in the original papers. I copied them row by row and checked them column by column so they should be accurate. However, if you are going to use them in a publication, I strongly encourage you to check the original sources first - please don't hold me responsible for any errors! This is stuff you can play with while you're waiting for your local interlibrary loan service to come through with the original papers.

The presence of confusion matrices on this site undoubtedly breaks some copyright laws, but they will remain here until a journal complains. I think that the original authors would be quite happy to have them known more widely. However, if I haven't stated that I've contacted the original authors, I haven't.

There is also well-documented Matlab code to help play with them. I would have written Perl or C++ code, except that I wanted to do plots with these, and Matlab provides a nice interactive environment for that sort of thing.

There is also a list of references to papers relevant to confusion matrices, such as papers on how to analyze them.

Papers and Data

JASA refers to the Journal of the Acoustical Society of America. Once again, you are in all cases strongly encouraged to check the original papers if you ever want to use these matrices in a publication. Even if the matrices are copied correctly (all double checked, but who knows) you shouldn't use a matrix unless you know details of the conditions under which the experiment was done.

  • George Miller and Patricia Nicely, "An Analysis of Perceptual Confusions among some English Consonants", JASA 27(2), 1955.
  • Marilyn Demorest Wang and Robert Bilger, "Consonant Confusions in Noise", JASA 54(5):1248-1266, 1973.
  • Irwin Pollack and Louis Decker, "Consonant Confusions and the Constant Ratio Rule", Language and Speech 3:1-6, 1960.
  • Frank Clarke, "Constant-Ratio Rule for Confusion Matrices in Speech Communication", JASA 29(6):715-720, 1957.
  • Sadanand Singh and John Black, "Study of Twenty-six Intervocalic Consonants as Spoken and Recognized by Four Language Groups", JASA 39:372-387, 1966.
  • Sadanand Singh, "Crosslanguage Study of Perceptual Confusion of Plosive Phonemes in Two Conditions of Distortion", JASA 40(3):635-656, 1966.
  • Louella W. Graham and Arthur S. House, "Phonological Oppositions in Children: A Perceptual Study", JASA 49:559-566, 1971.
  • Louis C.W. Pols, "Three-mode principal component analysis of confusion matrices, based on the identification of Dutch consonants, under various conditions of noise and reverberation", Speech Communication 2:275-293, 1983.
  • D.R. van Bergem, "Acoustic vowel reduction as a function of sentence accent, word stress and word class", Speech Communication 12:1-23, 1993.
  • Rais Ahmed and S.S. Agrawal, "Significant Features in the Perception of (Hindi) Consonants", JASA 45(3):758-763, 1969.
  • J.P. Gupta, S.S. Agrawal and Rais Ahmed, "Perception of (Hindi) Consonants in Clipped Speech", JASA 45(3):770-773, 1969.

I have seen the following papers referred to as sources of confusion matrices.

  • Referring paper: L.C.W. Pols, L.J.Th. van der Kamp and R. Plomp, "Perceptual and Physical Space of Vowel Sounds", JASA 46(2, part 2):458-467, 1969.
    • English low-pass filtered vowels: G.A. Miller, "The Perception of Speech", in "For Roman Jakobson: Essays on the Occasion of his Sixtieth Birthday", M. Halle et al. (eds), Mouton and Company, 's-Gravenhage, The Netherlands, 1956, pp. 353-359.
    • English noise-masked vowels: J.M. Pickett, "Perception of Vowels Heard in Noises of Various Spectra", JASA 29:613-620, 1957.
    • B. Mohr and W.S-Y. Wang, "Perceptual Distances and the Specification of Phonological Features", Phonetica 18:31-45, 1968.
  • "Pilot Studies of Speech Communication in Elementary School Classrooms": abstract only, but it refers to confusion matrices from five-year-olds for consonants in monosyllabic nonsense words.

DISC

The notation used on this site is DISC, which is a format used in the CELEX database. (A quick overview of CELEX files is given by Dirk Janssen.) Again, this is simply a function of the project I was working on at the time. The advantage of DISC is that it uses one character per phoneme (unlike, say, ARPABET), which is convenient for programming.

DISC characters are often obvious (e.g. p means /p/). The following exceptions for English consonants are most relevant for the matrices here:

  • J means /ch/ (the first phoneme of the English word 'cheap')
  • _ means /dzh/ (the first phoneme of 'jeep')
  • Z means /zh/ (the middle consonant of 'measure')
  • S means /sh/ (as in 'sheep')
  • D means /dh/ (as in 'thy')
  • T means /th/ (as in 'thigh')
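Because DISC uses one character per phoneme, a transcription can be walked character by character. The sketch below is in Python rather than Matlab and is not part of the code offered on this site. It covers only the six exceptions listed above, and the names DISC_EXCEPTIONS and disc_to_labels are made up for illustration. Any other character is passed through unchanged, which is an assumption that holds only for the "obvious" symbols.

```python
# Hypothetical helper: expand the six DISC consonant exceptions listed above
# into readable labels. Only these six mappings come from the page; every
# other character is passed through unchanged (an assumption).
DISC_EXCEPTIONS = {
    "J": "ch",   # first phoneme of 'cheap'
    "_": "dzh",  # first phoneme of 'jeep' (the affricate d + zh)
    "Z": "zh",   # middle consonant of 'measure'
    "S": "sh",   # 'sheep'
    "D": "dh",   # 'thy'
    "T": "th",   # 'thigh'
}


def disc_to_labels(disc_string):
    """Return one label per DISC character (one character = one phoneme)."""
    return [DISC_EXCEPTIONS.get(ch, ch) for ch in disc_string]


if __name__ == "__main__":
    # An arbitrary sequence of consonant symbols, not a real word.
    print(disc_to_labels("JS_p"))  # ['ch', 'sh', 'dzh', 'p']
```

The same table can be used to relabel the rows and columns of a matrix whose phonemes are stored as a single DISC string.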

Hugo Quene gives details of the full DISC set, amongst other phoneme transcription formats used by CELEX.

Other Confusion Matrices resources

UCL FIX (scroll down to Feature Information Xfer). To quote part of their blurb: "a set of programs designed to facilitate analysis of confusion matrices by both ordinary and sequential information transfer analysis (SINFA - Wang & Bilger, 1973, JASA, 54[5] 1248-1266)..."
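For readers who have not met "ordinary" information transfer analysis, here is a minimal Python sketch of the usual calculation: the mutual information, in bits, between stimulus (rows) and response (columns) of a matrix of counts. It is not the FIX implementation and does not attempt SINFA. The function names are made up for illustration, and the matrices at the bottom are toy counts, not data from any of the papers above.

```python
import math


def transmitted_information(counts):
    """Mutual information in bits between stimulus (rows) and response (columns).

    `counts` is a list of rows of non-negative response counts.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    bits = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                # p_ij * log2(p_ij / (p_i * p_j)), written in terms of counts
                bits += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return bits


def relative_transmission(counts):
    """Transmitted information as a fraction of the stimulus entropy."""
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    stimulus_entropy = -sum(
        (r / total) * math.log2(r / total) for r in row_sums if r > 0
    )
    return transmitted_information(counts) / stimulus_entropy


if __name__ == "__main__":
    perfect = [[10, 0], [0, 10]]  # toy data: every stimulus identified correctly
    chance = [[5, 5], [5, 5]]     # toy data: responses unrelated to the stimulus
    print(transmitted_information(perfect))  # 1 bit for two equally likely stimuli
    print(transmitted_information(chance))   # 0 bits
```

To look at a single feature such as voicing, the usual approach is to first pool the rows and columns of the full matrix by feature value and then apply the same calculation to the smaller matrix.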

Matlab Code

Matlab utilities. My other Matlab code often depends on these, so if you use any of it, put the utilities in a directory that is on your Matlab path.

PHONMAT (updated 1 May 2006) a MATLAB class I've found invaluable for analyzing confusion matrices, etc. There is some documentation.

PHONVEC The 1-dimensional version of PHONMAT. Also invaluable for functional load analysis.

LABELS - a MATLAB class required by PHONMAT and PHONVEC. For some documentation, see the bottom of the PHONMAT documentation.