Lecturer's Précis - Cherry (1953)
"Some experiments on the recognition of speech, with one and with two ears"
Copyright Notice: This material was
written and published in Wales by Derek J. Smith (Chartered Engineer). It forms
part of a multifile e-learning resource, and subject only to acknowledging
Derek J. Smith's rights under international copyright law to be identified as
author may be freely downloaded and printed off in single complete copies
solely for the purposes of private study and/or review. Commercial exploitation
rights are reserved. The remote hyperlinks have been selected for the academic
appropriacy of their contents; they were free of offensive and litigious
content when selected, and will be periodically checked to have remained so. Copyright © 2002-2018, Derek J. Smith.
First published online 08:11 BST 24th June 2002,
Copyright Derek J. Smith (Chartered Engineer). This version [2.0 - copyright] 09:00 BST 4th July
2018.
Cherry's (1953) Dichotic Listening Research
This was a classic study into the cognitive system's ability to deal with competing auditory inputs. The following definitions and distinctions are important:
Monaural vs Binaural: To hear
with one or two ears respectively. In normal circumstances, all sound sources
are processed binaurally, and the auditory system is very good at using the
microscopic differences in time of arrival (it can detect differences down to
30 millionths of a second) and sound intensity to compute the direction a sound
is coming from.
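To give a feel for how small these arrival-time differences are, the following is a minimal sketch of a simplified straight-path model in which the delay between the ears is (d / c) x sin(azimuth). The ear separation d, the speed of sound c, and the function names are illustrative assumptions and are not taken from Cherry's paper; the model ignores intensity differences and the shape of the head.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value: air at roughly room temperature
EAR_SEPARATION_M = 0.18     # assumed value: illustrative distance between the ears


def itd_seconds(azimuth_deg, ear_separation_m=EAR_SEPARATION_M):
    """Arrival-time difference between the ears for a distant source.

    Simplified straight-path model: delay = (d / c) * sin(azimuth),
    where azimuth 0 is straight ahead and 90 is directly to one side.
    """
    return (ear_separation_m / SPEED_OF_SOUND_M_S) * math.sin(math.radians(azimuth_deg))


def azimuth_from_itd(itd_s, ear_separation_m=EAR_SEPARATION_M):
    """Invert the model: the azimuth in degrees that would give this delay."""
    ratio = itd_s * SPEED_OF_SOUND_M_S / ear_separation_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp so that asin stays defined
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    largest = itd_seconds(90.0)               # source directly to one side
    smallest_angle = azimuth_from_itd(30e-6)  # 30 millionths of a second
    print(f"largest delay: {largest * 1e6:.0f} microseconds")
    print(f"angle for a 30 microsecond delay: {smallest_angle:.1f} degrees")
```

Under these assumed values, the whole range of arrival-time differences is only a fraction of a millisecond, and a difference of 30 millionths of a second corresponds to a source only a few degrees away from straight ahead.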
Diotic vs Dichotic: Diotic is the same as binaural, and means hearing one or more sound sources with both ears and then working out where each sound is coming from. Dichotic, on the other hand, refers to the artificially generated state of hearing a different sound with each ear, as when messages are presented through earphones (Trimble, 1931). The dichotic listening paradigm described below is therefore well suited to pushing the perceptual system beyond its natural limitations in an attempt to see more clearly how it is put together.
Against this background, Cherry (1953) conducted six sets of experiments, as follows:
· The Basic "Mixed Message" Paradigm: In the
first two series of experiments, Cherry investigated how we recognise what one
person is saying when others are speaking at the same time, a situation he described
as "the 'cocktail party problem'" (p976). Subjects were presented
with two different spoken messages, recorded onto a single audiotape (i.e. "mixed", in a tape
editing sense) by the same speaker, and played back via headphones. Both
messages were therefore simultaneously and equally available to both ears,
approximating real-life competitive conversation. Subjects were then
instructed to repeat one of the messages word by word or phrase by phrase.
Cherry's observations were (a) that subjects reproduced at phrase level, rather
than word level, and (b) that there were extremely few transpositions of
material from the to-be-rejected message, except where the competing
sentence structures happened to offer the subject a transposition with a
high transition probability. Subjects generally reported great difficulty with
the task, but this eased appreciably if they were allowed to make written
notes.
· Predictability: In this series of experiments,
Cherry arranged for the mixed material to be full of clichés, that is to say,
"highly probable phrases" such as "the time has come to stop
beating around the bush". His observation was that output tended to
consist of whole clichés, and that recognition of just the first one or two
words of a stock phrase would typically prompt the entire phrase. Successful
message separation, however, was "impossible", and a cliché from one
message would as likely as not be followed by one from the other message. [Bear
in mind that the strings of clichés within each message did not create a
particularly sensible overall message, so there was no strong narrative theme
to hold the parts together.]
· The Basic "Unmixed Message" Paradigm: In the
remaining sets of experiments, subjects were presented with two different
spoken messages, recorded onto separate audiotapes (i.e.
"unmixed", in a tape editing sense) by the
same speaker, and played back by headphones, one message to each earpiece.
Unlike the mixed message paradigm, each ear now only heard one message. Again,
subjects were instructed to repeat one of the messages (always the right ear
message) as accurately as possible. Cherry's general observations were (a) that
subjects could switch between messages at will, (b) that they could repeat the
selected message easily and accurately, but slightly delayed, (c) that their
speaking voice became monotonous, with "little emotional content or
stressing of the words", (d) that they remained unaware of this, (e) that
they "may have very little idea" what the message was all about, and
(f) that they took in very little about the content of the rejected message.
Indeed, if the language of the unattended message was changed from English to
German a few seconds into the trial, once shadowing of the target message had
been successfully established, that change was not usually detected. This
observation prompted further investigation of what sort of information, if any,
was available from the rejected message.
· Penetration of the Rejected Message: In this
series of experiments, Cherry looked at what information, if any, remained
available to the listener from an otherwise unattended message. He arranged for
the unattended left ear message to change from its normal form (male spoken English)
once the trial was under way. His observations were (a) that a change from
forward speech to backward speech (same sound profile, but zero lexical or
semantic content) was noticed as "something queer about it" by some
subjects but not noticed at all by others, (b) that a change from male to
female voice was "nearly always" identified, (c) that a change to a
400 Hz tone was always noticed, and (d) that subjects could not say with
certainty what language was being used.
· Same Message, Time Delayed: In this
series of experiments, Cherry wished to investigate the mechanisms by which the
brain decides whether the messages arriving at the ears are from a single
source, a state of affairs he referred to as "correlated". The point
is that when two inputs are correlated, they need to be merged internally, despite
naturally occurring ear-to-ear differences in intensity and arrival time,
whilst when they are from different sources one of them needs to be rejected
internally. He therefore presented an identical message to each ear, but with
the left (to be rejected) delayed relative to the right (to be shadowed). This
was achieved by running a single length of pre-recorded audiotape through two
physically separated tape players. The second tape player was then gradually
moved closer to the first, thus reducing the playback delay. Cherry's
observations were that "nearly all" subjects eventually recognised
words or phrases from the rejected message as matching those in the attended
ear. Cherry remarks that this is actually quite surprising, given that when
different messages are used nothing is available from the rejected ear. The
delay at which such recognition took place varied considerably between
subjects, but was typically 2 to 6 seconds.
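The three presentation conditions described above differ only in how the recorded messages are routed to the two earpieces, and this can be sketched in a few lines. The following is a minimal illustration, not a reconstruction of Cherry's apparatus: messages are represented as plain lists of sample values, every function name is hypothetical, and the tape speed used in the delay calculation is an assumed value that is not given in this précis. The only physical fact relied on is that the playback delay equals the length of tape between the two playback points divided by the tape speed.

```python
def _pad(samples, length):
    """Return a copy of samples padded with silence (0.0) up to length."""
    return list(samples) + [0.0] * (length - len(samples))


def mixed(message_a, message_b):
    """Mixed condition: the two messages are summed onto one track,
    and the same summed track is sent to both ears."""
    n = max(len(message_a), len(message_b))
    summed = [a + b for a, b in zip(_pad(message_a, n), _pad(message_b, n))]
    return summed, list(summed)  # (left ear, right ear)


def dichotic(rejected, attended):
    """Unmixed condition: one message per ear, with the message
    to be shadowed going to the right ear."""
    n = max(len(rejected), len(attended))
    return _pad(rejected, n), _pad(attended, n)  # (left ear, right ear)


def delayed_copy(message, delay_samples):
    """Same-message condition: the left ear receives the right-ear
    message again, delay_samples later."""
    left = [0.0] * delay_samples + list(message)
    right = list(message) + [0.0] * delay_samples
    return left, right


def tape_delay_seconds(tape_path_m, tape_speed_m_s):
    """Delay produced by a given length of tape between two playback points."""
    return tape_path_m / tape_speed_m_s


if __name__ == "__main__":
    ASSUMED_TAPE_SPEED_M_S = 0.1905  # assumption: 7.5 inches per second
    for path_m in (1.2, 0.8, 0.4):   # second player moved progressively closer
        delay = tape_delay_seconds(path_m, ASSUMED_TAPE_SPEED_M_S)
        print(f"tape path {path_m:.1f} m gives a delay of {delay:.1f} s")
```

In the mixed condition both returned channels are identical, so the ears receive no cue that separates the messages; in the other two conditions the channels differ, which is what makes rejection of one ear possible.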
References
Cherry, E.C. (1953). Some experiments on the recognition of speech, with one and with two ears. Journal of the Acoustical Society of America, 25(5):975-979.
Trimble, O.C. (1931). Concerning the meaning of the terms diotic and dichotic. American Journal of Psychology, 43:144.