Exploring similarity across features
As stated above, the global similarity estimate is based on five acoustic features.
You can view the similarity for each feature separately by clicking one of the buttons
in the similarity display group. As noted earlier, asymmetric similarity is estimated in two
stages. First, global comparisons across 70 ms intervals are used to threshold the match and
detect similarity sections. Then, similarity is estimated locally, based on frame-by-frame
scores. Both local and global distances can be viewed, and these views help you
assess what went wrong when the similarity results do not seem reasonable.
For example, you might discover that pitch is not properly estimated, in which case the
display will show similarity across all features except pitch.
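The two stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's actual computation: the absolute-difference distance, the 3-frame window standing in for the 70 ms interval, the threshold of 5, and all function names are hypothetical.

```python
def local_distances(a, b):
    """Frame-by-frame distance: one entry per pair of frames (i in a, j in b)."""
    return [[abs(x - y) for y in b] for x in a]


def global_distances(a, b, window):
    """Mean local distance over `window` consecutive frame pairs starting at (i, j)."""
    local = local_distances(a, b)
    rows = len(a) - window + 1
    cols = len(b) - window + 1
    return [
        [sum(local[i + k][j + k] for k in range(window)) / window for j in range(cols)]
        for i in range(rows)
    ]


def matching_sections(a, b, window, threshold):
    """Stage 1: keep only the interval pairs whose global distance passes the threshold."""
    glob = global_distances(a, b, window)
    return [(i, j) for i, row in enumerate(glob) for j, d in enumerate(row) if d <= threshold]


# Toy feature track (hypothetical values): modulated onset and offset, flat middle.
track = [40, 0, 0, 0, 40]
sections = matching_sections(track, track, window=3, threshold=5)

# Stage 2: read the frame-by-frame scores inside the sections that passed stage 1.
local = local_distances(track, track)
local_scores = {(i, j): [local[i + k][j + k] for k in range(3)] for i, j in sections}
```

In this toy case the first and last frames match each other locally (distance 0), yet only intervals on the diagonal pass the global threshold, because a global match requires the whole interval to agree.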

The effect of global versus local estimates can be seen in the example below, which shows FM
and AM partial similarities on local and global scales. Locally, FM shows a
similar area in the middle of the matrix, where the two sounds are not modulated, and
four bulges emerging from the corners of this central rectangle. The bulges are the
similarities between the modulated parts of the syllable. Since the syllable is frequency
modulated at both its onset and its offset, its beginning and end parts are similar to each
other. Now look at the global similarity and note how the rectangle has turned into a diagonal
line, which captures the similarity in the high-low-high FM transitions. In addition,
we see short sidebands, indicating the shorter-scale similarity between the beginning of
one syllable and the end of the other.
Now examine the partial similarity of AM. Here the
local similarity does not show any similarity between the beginning of one sound and the
end of the other, but it does show strong similarity between the two beginnings and
between the two ends. This is because amplitude modulation is positive at the onset
and negative at the offset of each sound. Hence, the global AM matrix
has no sidebands.
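The difference between FM and AM can be reproduced with toy numbers. The sketch below is separate from the figure and makes several assumptions: the feature values are invented, the distance is a plain absolute difference, the FM track is treated as unsigned, and the same track is used for both sounds to keep the example short.

```python
def local_distances(a, b):
    """Frame-by-frame absolute difference between two feature tracks."""
    return [[abs(x - y) for y in b] for x in a]


# Hypothetical tracks for one syllable: modulated at onset and offset, flat in between.
fm = [40, 0, 0, 0, 40]            # assumed unsigned: onset and offset look alike
am = [1.0, 0.0, 0.0, 0.0, -1.0]   # signed: positive at onset, negative at offset

fm_local = local_distances(fm, fm)
am_local = local_distances(am, am)

onset, offset = 0, len(fm) - 1
fm_onset_vs_offset = fm_local[onset][offset]   # small distance: the two ends match
am_onset_vs_offset = am_local[onset][offset]   # large distance: opposite signs
am_onset_vs_onset = am_local[onset][onset]     # zero: the two beginnings match
```

With these values the FM distance between onset and offset is zero, while the AM distance between onset and offset is the largest in its matrix. This is the pattern described above: cross-similarity between beginning and end for FM, but not for AM.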
Overall, the message is that comparing similarity across different features captures
different aspects of the similarity. By taking all the features into account, we can often
obtain a reasonable overall assessment of how good the similarity is, and we might
also develop some understanding of the meaningful articulatory variables that are similar or
different across the two sounds. However, the similarity may be good
in some features and poor in others. In such cases, you may want to
omit some features from the global estimate (but do not do this just to
obtain a better match!). In the options (similarity tab), you can set different
scales and exclude features.
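The effect of excluding a feature can be illustrated with a short sketch. It assumes, purely for illustration, that the overall distance is a plain mean of per-feature distances. The actual combination rule is not described in this section, and the function name, feature labels, and numbers below are hypothetical.

```python
def combined_distance(per_feature, exclude=()):
    """Average the per-feature distances, skipping any feature listed in `exclude`."""
    included = {name: d for name, d in per_feature.items() if name not in exclude}
    if not included:
        raise ValueError("at least one feature must remain in the estimate")
    return sum(included.values()) / len(included)


# Hypothetical per-feature distances where pitch was poorly estimated.
# Only three of the five features are listed to keep the example short.
distances = {"pitch": 0.9, "FM": 0.1, "AM": 0.2}

with_pitch = combined_distance(distances)
without_pitch = combined_distance(distances, exclude=("pitch",))
```

Dropping the badly estimated feature lowers the combined distance here, which is why exclusion should be justified by a known estimation problem rather than by the wish for a better score.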