Here are some confusion matrices from the paper by Marilyn DeMorrest Wang and Robert Bilger, "Consonant Confusions in Noise", JASA 54(5):1248-66 (1973).
The files are all plain ASCII, and they can also be read by PHONMAT. Here's how you would read one of them in Matlab:
> blah = phonmat ('cv1main_wb.dat');
> blah.total
pm (object of type PHONMAT) =
title: Consonant Confusions in CV utterances, summed over V = /iau/, noise levels and S/N ratios (Wang & Bilger 1973, Table II)
Phones involved: 16, namely p t k b d g f T (th) s S (sh) v D (dh) z Z (zh) J (ch) _ (dz)
     p    t    k    b    d    g    f    T    s    S    v    D    z    Z    J    _    Total
p  933  210  191   16   16   47   98   39   30   17   11   10   26   16   14   24  p  1698
t  245  843  213   35   67   31   68   30   46   25   22   11   12   12   29   24  t  1713
k  324  247  565   30   35   53   89   27   39   40   24    8   15    7   97  102  k  1702
b  136   60   68  486   60   25  191  126   65    5  316  104   33    6    5    9  b  1695
d   33   56   33  178  819  133   53   34   46   11   94   78   62   17   14   42  d  1703
g   20   22   20   82  148  938   15   12   20   12  132   29   57   37    7  162  g  1713
f  198   88   69   91   40    9  765  168  113   19   48   42   28    6    4   11  f  1699
T  107   67   69  107   25   13  667  275  204   29   58   32   29    4   12   14  T  1712
s   44   38   34   51   38   25  150   70  905   57   70   29  157   14   13   15  s  1710
S   26   37   51   17   31   38   26   13   84  870   12    8   34   35  288  130  S  1700
v   34   20   37  204   43   52   84   65   43   13  705  193  155   23    4   27  v  1702
D   37   30   18  239   82   64   49   67   56   10  534  270  192   12    9   28  D  1697
z   20   22   23   72   58  115   31   32   41   12  231  114  787   68    3   84  z  1713
Z   19   20   24   32   66  286   19   22   16   16   77   38  137  420    6  502  Z  1700
J   67  149   92   15   20   14   46   24  144  221   13    8   16   15  829   37  J  1710
_   20   21   12   30   73  152    8    6   18   39   34    8  106  148   11 1016  _  1702
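If you don't have PHONMAT to hand, the printed table is easy to work with directly. What follows is only a rough sketch in Python (not Matlab), using the numbers exactly as they appear above. It assumes each row is laid out as: stimulus label, 16 response counts, the label again, and the row total. The names LABELS, TABLE and parse_table are made up for this example and have nothing to do with PHONMAT or with the layout of the .dat files, which may well differ.

```python
# Column (response) labels, in the order printed in the table header.
LABELS = "p t k b d g f T s S v D z Z J _".split()

# The table rows as printed: label, 16 counts, label again, row total.
TABLE = """
p 933 210 191 16 16 47 98 39 30 17 11 10 26 16 14 24 p 1698
t 245 843 213 35 67 31 68 30 46 25 22 11 12 12 29 24 t 1713
k 324 247 565 30 35 53 89 27 39 40 24 8 15 7 97 102 k 1702
b 136 60 68 486 60 25 191 126 65 5 316 104 33 6 5 9 b 1695
d 33 56 33 178 819 133 53 34 46 11 94 78 62 17 14 42 d 1703
g 20 22 20 82 148 938 15 12 20 12 132 29 57 37 7 162 g 1713
f 198 88 69 91 40 9 765 168 113 19 48 42 28 6 4 11 f 1699
T 107 67 69 107 25 13 667 275 204 29 58 32 29 4 12 14 T 1712
s 44 38 34 51 38 25 150 70 905 57 70 29 157 14 13 15 s 1710
S 26 37 51 17 31 38 26 13 84 870 12 8 34 35 288 130 S 1700
v 34 20 37 204 43 52 84 65 43 13 705 193 155 23 4 27 v 1702
D 37 30 18 239 82 64 49 67 56 10 534 270 192 12 9 28 D 1697
z 20 22 23 72 58 115 31 32 41 12 231 114 787 68 3 84 z 1713
Z 19 20 24 32 66 286 19 22 16 16 77 38 137 420 6 502 Z 1700
J 67 149 92 15 20 14 46 24 144 221 13 8 16 15 829 37 J 1710
_ 20 21 12 30 73 152 8 6 18 39 34 8 106 148 11 1016 _ 1702
"""


def parse_table(text, labels):
    """Return (counts, printed_totals).

    counts[stimulus][response] is the number of times `response` was
    given to `stimulus`; printed_totals[stimulus] is the last column.
    """
    counts, printed_totals = {}, {}
    for line in text.strip().splitlines():
        fields = line.split()
        stimulus = fields[0]
        values = [int(x) for x in fields[1:1 + len(labels)]]
        counts[stimulus] = dict(zip(labels, values))
        printed_totals[stimulus] = int(fields[-1])
    return counts, printed_totals


counts, printed_totals = parse_table(TABLE, LABELS)

# Stimuli whose 16 counts do not add up to the printed row total.
mismatches = [s for s in LABELS if sum(counts[s].values()) != printed_totals[s]]

# Fraction of responses that matched the stimulus, per consonant.
proportion_correct = {s: counts[s][s] / printed_totals[s] for s in LABELS}

# The single most frequent response to each stimulus.
top_response = {s: max(counts[s], key=counts[s].get) for s in LABELS}

print("rows with inconsistent totals:", mismatches)
for s in LABELS:
    print(s, round(proportion_correct[s], 3), "most frequent response:", top_response[s])
```

For the table above, the list of inconsistent rows comes out empty. The most frequent response is the stimulus itself for every consonant except T, D and Z, which were most often heard as f, v and _ respectively.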
Some of the phones in the list are followed by a label in brackets; this is an alternative (and usually more understandable) label for the same phoneme. For example, _ means 'dz', the sound at the start and end of the word 'judge' (the sound in the middle of 'measure' is Z, 'zh'). The one-character labels are, for reasons best called 'historical accident', in CELEX DISC format.
The Wang-Bilger experiments investigated consonant confusions in American English VC and CV syllables. There are 24 and 19 such consonants respectively. Listeners in the experiments recorded their responses by pushing buttons on preprinted boards. However, the only boards available were 4x4 ones (16 buttons), so the experimenters split each of the CV and VC syllable sets into two, and compiled a separate confusion matrix for each subset.
CV was split into
VC was split into
There were also two types of experiment, a 'control' and a 'main' one. 'Control' is something of a misnomer, since it was more a preliminary experiment than a 'placebo' one.
The control experiment for each type of syllable stimulus (e.g. /p/ in CV) involved 156 tokens of that syllable played to each of 6 listeners, for a total of 936 responses. It was done to get a feel for appropriate signal levels when no noise was present. The 156 tokens comprised 12 syllables presented at each of 13 noise levels. The 12 syllables were evenly split among 3 vowels, e.g. 4 were /pa/, 4 were /pi/ and 4 were /pu/.
The main experiment for each type of syllable stimulus involved 432 tokens of that syllable played to each of 4 listeners, for a total of 1728 responses. The 432 tokens comprised 18 syllables presented at each combination of 6 S/N ratios and 4 noise levels. The 18 syllables were also evenly split among the same 3 vowels as in the control experiment, i.e. 6 per vowel.
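The token counts in the two paragraphs above are just products of the design factors. As a quick sanity check, here is the arithmetic spelled out in Python; the variable names are made up for this note, and the numbers are the nominal ones quoted above.

```python
VOWELS = 3  # /i/, /a/, /u/

# Control experiment: 4 syllables per vowel, 13 levels, 6 listeners.
control_syllables = 4 * VOWELS
control_tokens = control_syllables * 13
control_responses = control_tokens * 6

# Main experiment: 6 syllables per vowel, 6 S/N ratios x 4 noise levels,
# 4 listeners.
main_syllables = 6 * VOWELS
main_tokens = main_syllables * 6 * 4
main_responses = main_tokens * 4

print(control_syllables, control_tokens, control_responses)  # 12 156 936
print(main_syllables, main_tokens, main_responses)           # 18 432 1728
```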
For reasons I don't understand, not all the rows sum to the same number. At first I thought this was because some responses were not recorded. However, the sum of all the row sums comes out to a nice round number, suggesting (as does a line or two in the paper) that the stimuli were not distributed exactly equally, only approximately so. The figures of 156 and 432 should therefore be taken as approximations.
These figures were typed in by hand from the paper and double-checked. Still, use them only for preliminary experiments, and check them against a copy of the original paper.