The Relevance of Shannonian Communication Theory to Biological Communication

Copyright Notice: This material was written and published in Wales by Derek J. Smith (Chartered Engineer). It forms part of a multifile e-learning resource, and subject only to acknowledging Derek J. Smith's rights under international copyright law to be identified as author may be freely downloaded and printed off in single complete copies solely for the purposes of private study and/or review. Commercial exploitation rights are reserved. The remote hyperlinks have been selected for the academic appropriacy of their contents; they were free of offensive and litigious content when selected, and will be periodically checked to have remained so. Copyright © 2003-2018, Derek J. Smith.

First published [v1.0] 08:22 GMT 3rd March 2003; this version [2.0 - copyright] 09:00 BST 7th July 2018.

Earlier versions of this material appeared in Smith (1997a), Smith (1997b; Chapter 1), and Smith (2000). It is collated here with minor amendments and supported with hyperlinks.

1 - The Concept of Information

It was Ralph Vinton Lyon Hartley (1888-1970), an American electrical engineer, who first attempted to quantify information (Hartley, 1928). Previously, if you had measured information at all you had done it by the sentence-full, or the chapter-full, or the encyclopaedia-full, and while you may have had a lot of it or a little of it, you could not have counted it in any meaningful way. Hartley addressed this problem by pointing out that information is only information if it tells you something you did not already know. He argued that messages which truly informed had to come as a surprise, so to speak, and that if you knew what a message was going to be telling you, then despite any superficial length and complexity that message was actually conveying no information whatsoever. So in order to measure information, you had to start counting the number of things you were being told, relative to the number of things you did not know, and that, in turn, meant counting how many signs you had in your message relative to how many signs there were available in the vocabulary your message was drawn from. Hartley proceeded to quantify this approach in a long and complex mathematical argument [details], culminating in the following equation .....

I = N log S

Where I is the amount of information each message contains, N is the number of signs in a particular message, and S is the number of different signs in your vocabulary.
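As a minimal sketch of the equation above, the following Python function evaluates I = N log S. The function name `hartley_information` and the use of a base-2 logarithm (so that the answer comes out in bits) are assumptions made for illustration; the equation as printed does not specify a base.

```python
import math

def hartley_information(n_signs, vocabulary_size):
    """Return I = N log S, using log base 2 (an assumption) so that I is in bits."""
    if n_signs < 0 or vocabulary_size < 1:
        raise ValueError("need n_signs >= 0 and vocabulary_size >= 1")
    return n_signs * math.log2(vocabulary_size)

# Illustrative values only: a five-sign message drawn from a 26-sign vocabulary.
print(hartley_information(5, 26))
```

Note that a vocabulary of only one sign gives I = 0 however long the message is, which is Hartley's point that a fully predictable message conveys no information.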

The task of formally defining information was then taken up by Claude Elwood Shannon (1916-2001) of the Bell Telephone Laboratories (Shannon, 1948/2003 online; Shannon and Weaver, 1949). Drawing on Hartley's mathematical analysis, Shannon, too, saw information as that which reduced uncertainty. Cherry (1957) explains it this way:

"When we ourselves communicate one with another, we transmit signals [...]. Now it is customary to speak of signals as 'conveying information', as though information were a kind of commodity. But signals do not convey information as railway trucks carry coal. Rather we should say: signals have an information content by virtue of their potential for making selections." (Cherry, 1957:169; emphasis original.)

In other words, information enables a correct selection to be made from a set of possible alternatives, and provided these possible alternatives are equally likely, the amount of information contained in any message is the logarithm (to base 2) of the number of choices possible. So the equation for I becomes .....

I = log₂ C

Where I is the amount of information each message contains, and C is the number of possible choices. [A look-up table of C and log₂ C is presented in Figure 1.]

The unit of I has a special name. It is called the "bit", which is short for "binary digit".

Key Concept - The Bit: The bit is the technical measurement of information. It is the amount of information necessary to make a choice between two equally likely alternatives.

So how does this sort of binary encoded information work? Well imagine that you have spun a coin, but cannot see whether it has fallen heads or tails. You ask: "Is it heads?" The reply message can consist only of the words "yes" or "no", and each (on this occasion) is equally probable. Substituting in <I = log₂ C>, we find we have <log₂ 2> bits of information in that message, and <log₂ 2> is 1. So you need one bit of information to service a two-way decision.

What then makes the <I = log₂ C> concept really useful is the fact that strings of individual bits can be used to quantify more complicated choice situations. Suppose, for example, that you needed to represent a four-way decision (which suit from a deck of cards to lead with in a game of Bridge, perhaps). Repeating our previous calculation, we find that we need <log₂ 4> bits of information, and <log₂ 4> is 2. So you need two bits of information to service a four-way decision. You might even like to think of these two bits as being "used up" one by one, with each one halving the number of alternatives remaining. Thus four halves to two, and two halves to one, and when you get down to a single alternative there is no longer a decision to be made. Two halvings - two bits. Easy.
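The "two halvings - two bits" argument can be checked with a short Python sketch. The function names `bits_needed` and `count_halvings` are illustrative assumptions, and the halving count is only meaningful when the number of alternatives is a whole power of two.

```python
import math

def bits_needed(choices):
    """I = log2(C) for C equally likely alternatives."""
    if choices < 1:
        raise ValueError("need at least one alternative")
    return math.log2(choices)

def count_halvings(choices):
    """Count how often a power-of-two set of alternatives can be halved until one remains."""
    halvings = 0
    while choices > 1:
        choices //= 2      # each bit received halves the alternatives still open
        halvings += 1
    return halvings

# The coin (2 alternatives) and the suit of cards (4 alternatives) from the text.
for c in (2, 4):
    print(c, bits_needed(c), count_halvings(c))
```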

And how are we to represent all these bits? Well given that in binary arithmetic you only ever see the digits 0 and 1, you simply "code" each two-way decision accordingly. In the examples above, the heads-tails decision could be represented as 0 for heads and 1 for tails (it could just as easily be the other way round, of course, and it really makes no difference provided everyone involved works to the same codes and conventions). The suit of cards decision can be dealt with similarly, only this time we need a two-bit string of noughts and ones. Thus 00 would represent spades, 01 would represent clubs, 10 hearts, and 11 diamonds. And we can then call each of these discrete permutations a "binary word".

Suppose now that you need to represent an eight-way decision (which suit to play, as before, but now also whether you wanted to lead high or low). To do this you would need to resort to three-bit words. Thus 000 would represent a high spade, 001 a low spade, 010 a high club, 011 a low club, 100 and 101 the high and low hearts, and 110 and 111 the high and low diamonds. You have eight possible decisions to make and eight possible codes to play with. Again easy.
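The three-bit code just described can be written out as a small Python sketch. The list names and the `encode` and `decode` helpers are assumptions made for illustration; the bit assignments themselves follow the text (two bits for the suit, then one bit for high or low).

```python
# Order follows the text: 00 spades, 01 clubs, 10 hearts, 11 diamonds.
SUITS = ["spades", "clubs", "hearts", "diamonds"]
# Third bit: 0 = high, 1 = low.
RANKS = ["high", "low"]

def encode(suit, rank):
    """Build the three-bit binary word for a (suit, rank) decision."""
    suit_bits = format(SUITS.index(suit), "02b")
    rank_bit = format(RANKS.index(rank), "01b")
    return suit_bits + rank_bit

def decode(word):
    """Recover the (suit, rank) decision from a three-bit binary word."""
    suit = SUITS[int(word[:2], 2)]
    rank = RANKS[int(word[2], 2)]
    return suit, rank

print(encode("clubs", "low"))   # the text gives 011 for a low club
print(decode("110"))            # the text gives 110 for a high diamond
```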

Now note the pattern which is emerging. The binary word lengths (and hence the information content in bits) go 1-2-3-4....., but the number of alternatives which can be told apart - the "decidability", if you like - goes 2-4-8-16..... There is a rapid increase in alternatives with only a slight increase in word length (which happens to be the principal characteristic of any logarithmic relationship, so we should be neither surprised nor unduly concerned). The pattern can be seen more graphically in Figure 1 below.

Figure 1 - Bit-Strings and Encoding: This table shows the binary word length (right) necessary to support a given C-alternative decision (left).

Number of Alternatives (C)	Binary Word Length (log₂ C)
1	0
2	1
4	2
8	3
16	4
32	5
64	6
128	7
256	8
512	9
1024	10
2048	11
4096	12

NB: The fact that there are only 32 different five-bit words explains why the commercial five-key telegraph system [see under Baudot, Morkrum, and Morkrum-Kleinschmidt in our paper on the history of computing] could only ever offer a 32-item alphabet.

So to summarise, whenever we find ourselves faced with having to code a given number of alternatives.....

1 We count them, rounding upwards to the next whole power of two. [The data in Figure 1 will assist in this.]

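2 We record the exponent of that power of two as our binary word length. [Thus 20 alternatives round up to 32, which is 2 to the power 5, so the word length is five bits.]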

3 We write down all the individual binary words this gives us.

Tip: There is an easy way to do this. For the eight-way decision previously considered, the eight possible three-bit words are 000, 001, 010, 011, 100, 101, 110, and 111. To get the 16 four-bit words, simply prefix the eight three-bit words once by 0 and then again by 1. To get the 32 five-bit words, prefix the four-bit words once by 0 and then again by 1. And so on, according to this pattern.

4 We allocate one of these binary words to each of our alternatives.
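The four steps above, including the prefixing tip, can be sketched in Python as follows. The function names and the days-of-the-week example are assumptions chosen for illustration; any list of alternatives would serve.

```python
def word_length(n_alternatives):
    """Steps 1 and 2: the smallest k such that 2**k >= n_alternatives."""
    if n_alternatives < 1:
        raise ValueError("need at least one alternative")
    # (n - 1).bit_length() is the exponent of the next whole power of two
    return (n_alternatives - 1).bit_length()

def binary_words(length):
    """Step 3, using the tip: prefix the shorter words once by 0 and then again by 1."""
    words = [""]
    for _ in range(length):
        words = ["0" + w for w in words] + ["1" + w for w in words]
    return words

def allocate(alternatives):
    """Step 4: pair each alternative with one binary word (spare words go unused)."""
    length = word_length(len(alternatives))
    return dict(zip(alternatives, binary_words(length)))

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
for day, word in allocate(days).items():
    print(day, word)
```

Seven alternatives round up to eight, so three-bit words are needed and one of the eight words is left unallocated.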

Exercise 1 is intended to reinforce the foregoing body of concepts.

Exercise 1 - Binary Encoding

1 List the 16 points of the compass and allocate a different four-digit binary word to each.

2 Allocate a different five-digit binary word to each of the 26 letters of the alphabet.

3 Imagine you are a computer keyboard designer. State the binary word length necessary to code 26 upper-case letters, 26 lower-case letters, 10 digits, and 20 miscellaneous punctuation marks.

4 You want to transmit 10 letters per second from your keyboard. What is the transmission rate in bits/sec?

5 Research the ASCII code, listing its binary codes for the characters "A", "a", "8", and "*".

6 State the binary word length necessary to support a 500-way decision.

7 Assuming there are 4000 common words in the English language, state the binary word length necessary to code each one of these words differently.

8 You want to transmit two of these words per second. What is the transmission rate in bits/sec? Explain the difference between answers (4) and (8).

9 Given the sentence "the cat sat on the mat," and using the binary word length from (7), allocate an imaginary binary code for each word. Then write down the concatenated (ie. "strung together") binary string necessary to communicate the entire sentence. How many bits does this contain?

10 If you translated sentence (9) into ASCII, how many bits would be required? Explain the difference between answers (9) and (10).

2 - Shannonian Communication Theory and the Concept of the Idealised Communication System

There is such a rich variety of signalling systems open to human ingenuity that it finally occurred to theorists that perhaps they all had something in common, and here we again meet Claude Shannon of the Bell Telephone Laboratories. Having worked for some time on the telecommunications aspects of information, Shannon (1948) finally derived the concept of the idealised, or "general", communication system. The principles of information and its transmission, he argued, did not vary from the semaphore to the cable telegraph to the telephone to the wireless - there was an underlying logical pattern to them all. Shannon and his patron Warren Weaver presented these arguments in more detail in a monograph entitled "The Mathematical Theory of Communication" (Shannon and Weaver, 1949), and identified a coherent set of basic elements as set out in Figure 2.

Figure 2 - The Idealised Communication System (One-Layered): This is the classical version of the Shannonian communication channel, described in Shannon (1948) as a "schematic diagram of a general communication system". It is an abstract analysis of the key components of all the two-party communication systems known to humankind, and involves passing a "message" from a "source" (the first of the minds involved) to a "destination" (the second of the minds involved). The diagram shows this as a left to right flow between the two yellow highlighted boxes. The transmission proper requires passing a "signal" (physical energy) along a "transmission channel". This channel can take any form, provided only that it is capable of carrying the signal from a "transmitter" (light brown highlight, left) to a "receiver" (light brown highlight, right), a device capable of detecting the signal, and of converting it back into a meaningful form. The signal is whatever physical activity happens to be induced by the transmitter in the transmission channel, but in order to be detected accurately it has to overcome the effects of "noise". Noise is any activity within the channel not intended by the message source. It is everything other than the signal itself, and it acts to degrade the communication. It might be noise in the everyday usage of the word, or it might be interference on your TV picture, or mist if you are using a semaphore, etc., etc. The "signal-to-noise ratio" is a measure of how easy it is for the receiver to detect the incoming signal against the background of noise. Where the signal power is large compared to the noise level, it is easy to detect the signal and misperceptions are rare. However, as the signal power decreases (or the noise level increases), detection errors start to appear. Progressively more advanced forms of the diagram are shown in Figures 3, 4, and 5.

If this diagram fails to load automatically, it may be accessed separately at

http://www.smithsrisca.co.uk/PICshannon-fig2.gif [some corruption]

Exercise 2 will help the student apply Shannon and Weaver's analysis to the amazing variety of real world communication.

Exercise 2 - The Idealised Communication System

1 Draw up a summary table with columns headed source, transmitter, channel, signal, receiver, and destination for the following specimen communication systems:

Apache smoke signals; jungle drums; the heliograph; the carrier pigeon; a book; the shipwrecked mariner's message in a bottle; the snarl of an angry dog; pheromone attraction in insects; a raised eyebrow

2 Now consider how far apart the sender and the recipient can be allowed to be. Which communication system has the longest range? Which the shortest?

In their original description of the idealised communication system, Shannon and Weaver (1949) were careful to point out that transmission channels could only ever transmit simple physical codes. The pragmatics and semantics of the exchange (that is to say, its "meaning") were taken out of the equation prior to transmission in the sender of the message, and put back in again after reception in the receiver of the message. These are the processes of "encoding" and "decoding" respectively, and it goes without saying that if the latter does not properly mirror the former then misunderstandings are certain to occur, even if the receiver had managed to pick the message up "loud and clear".
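The point about decoding having to mirror encoding can be illustrated with a toy Python sketch. It reuses the earlier heads-tails convention (0 for heads, 1 for tails); the dictionary names and the `transmit` function are assumptions for illustration, and the channel is assumed to be entirely free of noise.

```python
# The sender's code: meaning is stripped out and only the code is transmitted.
SENDER_CODE = {"heads": "0", "tails": "1"}

# A decoder which mirrors the sender's code, and one which does not.
MATCHED_DECODER = {bits: meaning for meaning, bits in SENDER_CODE.items()}
MISMATCHED_DECODER = {"0": "tails", "1": "heads"}

def transmit(meaning, decoder):
    """Encode at the sender, pass over a noise-free channel, decode at the receiver."""
    signal = SENDER_CODE[meaning]   # encoding
    return decoder[signal]          # decoding: the signal itself arrives "loud and clear"

print(transmit("heads", MATCHED_DECODER))
print(transmit("heads", MISMATCHED_DECODER))
```

In the second case the signal arrives uncorrupted, yet the destination ends up with the wrong message, because the two parties are not working to the same codes and conventions.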

Now encoding and decoding happen to be the telecommunications equivalent of the problems psycholinguists have long had with the stages and the substages of speech production and perception (respectively), and the problem with the basic Shannonian analysis is that it has considerably more to say about the telecommunications than it does about the psychology. Effectively, the model proposes a speaker consisting only of a mind (the source) and a mouth (the transmitter), and a listener consisting only of ears (the receiver) and a mind (the destination). It therefore totally fails to reflect the many intermediate cognitive processing stages.

ASIDE: Cognitive models started to acquire intermediate processing stages as long ago as the 1870s. The Kussmaul (1878) model of the language centres is indicative of how language processing stages were seen at that time, and is a good example of a genre which did not reach maturity until a full century later - see our paper on Transcoding Models in Psycholinguistics for details, if interested. However, what is far from established is how many stages there are, and what precisely is passed between them.

Figure 3 now expands on Figure 2 to show how at least some of this intermediate processing can be represented.

Figure 3 - The Idealised Communication System (Three-Layered): This is the classical Shannonian communication channel, but now showing a rudimentary processing hierarchy within the minds at either end of the transmission channel. Note the role played by sequential processes at both the sending and receiving nodes. The communicating minds are still highlighted in yellow, and the physical channel is still highlighted in light brown, but these modules are now separated by syntactic processes. This figure thus accords (a) with psycholinguistic staged processing models (see main argument), (b) with Requin, Riehle, and Seal's (1988) view that three processing levels is the norm for both structural and functional models of motor behaviour, (c) with the concept of "layers of control" which has recently become a major design characteristic within robotics (eg. Brooks, 1991), and (d) with what is known of the phylogenetic development of cognitive systems (eg. Donald, 1991; Mithen, 1996; Deacon, 1997; Smith and Stringer, 1997). Note the constant counterflow of acknowledgements from the receiver to the transmitter. To transmit reliably, in other words, is to be constantly receiving, and this - together with the "software" required to make it happen - is what makes communication such an overhead. An even more sophisticated seven-layered analysis of this processing is shown at Figure 4.

If this diagram fails to load automatically, it may be accessed separately at

http://www.smithsrisca.co.uk/PICshannon-fig3.gif

3 - The International Standards Organisation's Idealised Communication System

When trying to establish communication between two hierarchically layered processing systems, it is important that the sending and receiving hierarchies have been precisely co-engineered, because otherwise the receiver will be unable properly to decode what the sender has encoded. Sadly, the early decades of the telecommunications industry were plagued by precisely such problems, because different equipment manufacturers liked to protect their intellectual property by adopting marginally different standards. In fact, telecommunications was not an "open" marketplace until, as a result of consumer pressure, internationally accepted codes of practice known as "network protocols" were gradually developed. These are standard methods of (a) authenticating which stations should actually be in a network, (b) identifying which station a message is to be routed to, (c) governing when, and at what speeds, stations are allowed to transmit and receive, (d) arranging for data compression and decompression, and (e) detecting errors in transmission and arranging for their correction. In short, protocols are nothing less than vital for effective dialogue to take place.

There are many highly technical network protocols now in force, published by such agencies as the International Organization for Standardization (ISO, often rendered in English as the International Standards Organisation), but the one we are interested in here is the Open Systems Interconnection (OSI) Reference Model (Zimmerman, 1980). This is a reasonably non-technical overriding guideline, and it recommends a seven-layered analysis for any given communication, with the physical channel (the wire, etc.) consigned to the lowest level, and each layer communicating logically (but not physically) with the matching decoding layer(s) at other node(s). This type of arrangement is known as "peer-to-peer communication" (Gandoff, 1990).

ASIDE: Purser (1987, p207) summarises the peer-to-peer principle thus: "There are lower layers which provide some basic service which is used by a higher layer. The higher layer enhances the basic service, by providing extra functions; so that it, in turn, offers a more comprehensive service to some still higher layer." Halsall (1992, p14) adds: "Conceptually, therefore, each layer communicates with a similar peer layer in a remote system according to a defined protocol".

The OSI model thus allows information to pass smoothly and progressively down one hierarchy and just as effortlessly up the next, managing the entire process, and supporting appropriate error detection and recovery processes should difficulties arise. It helps create what Cherry (1957) termed the "cooperative link" (p16). The seven layers of the OSI model are now described:

Layer 7 - Application Layer: This is the "end-user" level of communication. It is the highest level of all, the level of pragmatic exchange between minds. It is the point of origin of the message intended to be communicated by person A, and, in due course, the point of final arrival of the message as interpreted by person B. A receiving station's Application Layer can explicitly confirm its understanding of a transmission at the pragmatic level. It does this by waiting until the channel has been turned around (see below), and by then constructing a coherent - albeit possibly negative - pragmatic reply. This is what happens in normal human conversation with responses such as "I fully agree" (positive) or "I think you're wrong on that" (negative).

Layer 6 - Presentation Layer: This is where the Layer 7 message begins to lose its pragmatic element. It is the stage where surface syntactic structure is created in outgoing messages and interpreted in incoming ones. It is accordingly situated immediately after pragmatic processing in the sender and immediately before it in the recipient. In computer networks, this is the level at which data encryption and compression take place.

Layer 5 - Session Layer: This is a coordinating layer. It sets up, manages, and terminates when necessary, the lower layers of the communication link. In so doing, it identifies and authenticates the recipients and controls the passing of Layer 6 information downwards in the sender and upwards in the recipient. It also synchronises the activities of transmitting and receiving so that stations do not end up all talking at once.

Layer 4 - Transport Layer: This is where the Layer 5-7 information is translated into a format compatible with the physical link. This includes much error checking and peer-to-peer transmission acknowledgement. It also begins the process of dividing up the message into smaller units known as "packets". At the receiving node, the Transport Layer can also dramatically improve throughput by asking the transmitting node to adjust its transmitting speed to match its own currently supportable receiving speed. In short, it manages the communications session once a path has been established. Note that the sending and receiving Transport Layers therefore have to carry out their own "technical" conversation. This goes on "underneath" the main conversation, and so risks interrupting it now and then.

ASIDE: It follows that unless the Transport Layer interaction can be given its own, separate, physical channel - a dedicated backchannel - these interruptions risk becoming more and more visible to the respective application layers. Human communication frequently uses facial expression and gesture to exchange its Transport Layer messages. A receiving station's Transport Layer has the task of concatenating incoming messages back from their transmission packets into semantically processable units such as words and phrases. In so doing, it can detect errors which were not apparent to the lower layers, whereupon it has to ask the transmitting station to backtrack and retransmit.

Layer 3 - Network Layer: This is where the optimal transmission path is decided. (This layer is only needed in large networks where there are optional routes between nodes.)

Layer 2 - Data Link Layer: This is where the information is formed up into transmittable signal strings prior to transmission, and reformed from the received signal strings upon reception. At receiving stations it is where the bulk of "loud and clear" detection checking is carried out, and at transmitting stations it is where "what was that" messages from the receiver are handled and retransmission arranged as and when circumstances demand. A receiving station's Data Link Layer can counterflow a "stop sending" signal whenever it becomes overloaded, or a "please repeat" request if it failed to pick up the signal cleanly.

Layer 1 - Physical Layer: This is the transmission link itself. It transmits the signals in the pre-ordained format, but has no knowledge of their structure or significance.

There are therefore six successive encoding processes (Layer 7 to Layer 6, 6 to 5, 5 to 4, etc.) and six successive decoding processes (Layer 1 to Layer 2, 2 to 3, 3 to 4, etc.). Moreover, as soon as the message has successfully arrived, the recipient gets the opportunity to phrase a conversational reply, whereupon the roles of transmitter and receiver are reversed. Ideally, Layers 6 to 1 should be totally "transparent" - they should operate automatically and unconsciously, so as to allow the two Application Layers to talk as though directly to each other. It is also common telecoms practice (a) to provide for signal boosting at intermediate "repeater" stations, and (b) to route messages flexibly around the network according to where there happens to be spare capacity at the moment in question. To keep costs down, only lower layer line management functionality is provided at these repeater stations and network nodes. This constant interplay of processing is shown diagrammatically in Figure 4, and Figure 5 shows how the OSI Model copes when the transmission channel is further analysed to show feedback separately.

Figure 4 - The Idealised Communication System (Seven-Layered) - The "OSI Model": This development of Figure 3 recognises seven output layers at Node A, and seven input layers at Node B. The top layer - the Application Layer - is concerned with the true pragmatic meaning of the messages received and sent (for sake of consistency with Figures 2 and 3, we have continued to highlight this in yellow). The next layer - the Presentation Layer - either prepares outgoing messages for sending or incoming messages for higher processing. The Session Layer establishes and manages the communication "session" between participating nodes and performs authentication and some error recovery functions. The Transport Layer carries out further error detection and begins the process of dividing up the message into smaller units known as "packets". The Network Layer then manages the channel, the Data Link Layer forms the information into character strings of the length required by the physical link itself, and the Physical Layer (highlighted in light brown) is that physical link (although this can actually take a variety of forms - see text). Coding introduced at a given transmitting layer is not decoded until the message reaches the equivalent receiving layer, and any control characters and markers added to the substantive message at a given transmitting layer are not removed until the message reaches the equivalent receiving layer. As previously noted, this arrangement is known as "peer-to-peer" communication.

If this diagram fails to load automatically, it may be accessed separately at

http://www.smithsrisca.co.uk/PICshannon-fig4.gif
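The peer-to-peer principle, whereby markers added at a given transmitting layer are only removed at the equivalent receiving layer, can be sketched as a toy in Python. The bracketed marker format and the `send` and `receive` functions are invented for illustration and do not correspond to any real protocol; the sketch simply wraps the message six times on the way down and unwraps it six times on the way up.

```python
# Layers 7 down to 2; Layer 1 (the physical link) merely carries the result.
LAYERS = ["application", "presentation", "session", "transport", "network", "data link"]

def send(message):
    """Pass the message down the sending hierarchy; each layer adds its own markers."""
    frame = message
    for layer in LAYERS:                    # Layer 7 wraps first, Layer 2 wraps last
        frame = f"[{layer}]{frame}[/{layer}]"
    return frame                            # what the physical link actually carries

def receive(frame):
    """Pass the frame up the receiving hierarchy; each layer removes only its peer's markers."""
    for layer in reversed(LAYERS):          # Layer 2 unwraps first, Layer 7 unwraps last
        head, tail = f"[{layer}]", f"[/{layer}]"
        if not (frame.startswith(head) and frame.endswith(tail)):
            raise ValueError(f"{layer} layer cannot find its peer's markers")
        frame = frame[len(head):len(frame) - len(tail)]
    return frame

wire = send("hello")
print(wire)
print(receive(wire))
```

If the two hierarchies have not been co-engineered (say, the receiver expects a different marker at one layer), `receive` fails at exactly that layer, which is the difficulty the protocols are there to prevent.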

Figure 5 - Two Node Flow and Counterflow in the Idealised Communication System: Here is an expansion of Figure 4, to show the flow and counterflow of information in a two-node communication. Node A is currently transmitting, so the flow of pragmatic information proceeds from left to right along the lower link. It is facilitated, however, by the line management counterflow from right to left along the upper link. This counterflow can be initiated from any Node B layer (although layers 2 and 4 are particularly active in this respect). With most face-to-face human conversations, the primary transmission uses the vocal-auditory pathway and the counterflow uses the kinesic-visual pathway (ie. it relies upon facial expression and body language). In turn, this causes major qualitative and quantitative differences in the information being transmitted. The primary transmission has a large symbol repertoire to choose from, so that each transmitted chunk conveys a lot of information. The counterflow, on the other hand, has a lot less to say but must say it very precisely. Its symbol repertoire is a matter of yes-no, stop-start, faster-slower, did-didn't, etc. When sending node processing is challenged in some way by counterflow, the sending peer must put its higher layers on hold until the problem has cleared, and it achieves this by issuing what is known as an "interrupt" upwards within its own processing hierarchy, ultimately and if necessary all the way up to layer 7. A more detailed worked example is given in Figure 8.

If this diagram fails to load automatically, it may be accessed separately at

http://www.smithsrisca.co.uk/PICshannon-fig5.gif
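The interplay of flow and counterflow can be sketched as a simple "send, then wait for acknowledgement" loop in Python. This is a toy simulation under stated assumptions, not a description of any particular protocol: the function name, the `loss_probability` parameter, the retry limit, and the use of a seeded random number generator to stand in for channel noise are all invented for illustration.

```python
import random

def send_with_acknowledgement(packets, loss_probability=0.3, max_attempts=20, seed=0):
    """Send each packet, retransmitting whenever the counterflow says "please repeat"."""
    rng = random.Random(seed)               # seeded so that the simulation is repeatable
    received = []
    transmissions = 0
    for packet in packets:
        for _ in range(max_attempts):
            transmissions += 1
            arrived_cleanly = rng.random() >= loss_probability
            if arrived_cleanly:
                received.append(packet)     # counterflow: "loud and clear", move on
                break
            # counterflow: "please repeat", so go round again and retransmit
        else:
            raise RuntimeError("gave up after repeated failures")
    return received, transmissions

message = ["the", "cat", "sat", "on", "the", "mat"]
received, transmissions = send_with_acknowledgement(message)
print(received)
print(transmissions, "transmissions for", len(message), "packets")
```

Note how little the counterflow needs to say: a single yes-no answer per packet is enough to make the primary flow reliable, at the cost of extra transmissions.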

Figure 5 is intended to reflect the internals of face-to-face biological communication, but is (upon consideration, necessarily) visually similar to the sort of diagrams drawn by engineers to explain the flow of information in non-biological telecommunication. Compare, for example, the work of Gorry Fairhurst at the University of Aberdeen.

4 - Related Concepts - Bandwidth and Information Redundancy

The beauty of the classical Shannonian analysis is that it can be used to interpret all and any real-life instance of communication, without exception. Thus the communication system might be semaphore, where the signal is the light reflected from a moving flag, or it might be Braille, in which the signals are bumps on paper and the receiver is the reader's fingertip, or whatever. Moreover, you can communicate in a variety of languages, using the transmission system of Morse Code, and in so doing you can use a variety of transmission channels, namely wire (the telegraph), radio waves, light waves (flashing your headlights), sound waves (banging on the pipes), or whatever. However, these basic elements are only the beginning, and if we are to do Communication Theory fuller justice we need to look at two of its more advanced concepts, namely bandwidth and redundancy .....

Bandwidth is a measure of a communication system's ability to send more than one message simultaneously. It is how "wide" that system's transmission channel is. The term originated with radio, but the principle is easier to illustrate if optic fibre transmission is considered: someone transmitting pulses of green light, for example, would not interfere with someone transmitting in red, etc, even if they transmitted at precisely the same instant. All that is needed is some way of filtering the competing messages apart at the receiving end. Again, the technical concept can be transported to most other forms of transmission link: the jungle drums could send a bass and treble beat (more or less) simultaneously, the mariner might put two letters into his bottle, etc., etc.

Using this extended set of concepts, Shannon and Weaver were then able to specify the theoretical maximum channel capacity (CCmax) of their idealised communication system. The equation is as follows:

CCmax = W T log₂(1 + P/N)

Where CCmax is the maximum number of bits which can be sent, W is the bandwidth (in hertz), T is the transmission time (in seconds), P is the average signal power, and N is the average noise power.
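The capacity equation can be evaluated with a few lines of Python. This is a sketch under stated assumptions: the function name is invented, the logarithm is taken to base 2 so that the result is in bits, and the figures in the example are made up purely to show the arithmetic.

```python
import math

def max_channel_capacity(bandwidth_hz, time_s, signal_power, noise_power):
    """CCmax = W * T * log2(1 + P/N), in bits (base 2 is an assumption)."""
    if noise_power <= 0:
        raise ValueError("noise power must be positive")
    return bandwidth_hz * time_s * math.log2(1 + signal_power / noise_power)

# Invented figures: a 3000 Hz channel used for one second.
print(max_channel_capacity(3000, 1, signal_power=1000, noise_power=1))   # strong signal
print(max_channel_capacity(3000, 1, signal_power=1, noise_power=1))      # signal no stronger than the noise
```

The second call shows what happens as the signal-to-noise ratio falls: the same bandwidth and the same transmission time carry far fewer bits.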

Redundant information is information which does not actually add anything to the meaning of a message (examples follow). It involves duplications of, or within, a message, and helps thereby to protect that message from accidental corruption. It is sending more than you strictly need to send, in order to guarantee that what you do need does get through. The redundancy (R) of a system is conventionally expressed as a percentage, and is derived by the following equation:

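R = (CCspare/CCmax) x 100

Where CCspare is that part of the maximum channel capacity which is not currently being used, and CCmax is the maximum channel capacity as defined above.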

Note from this equation that if the channel is working close to its theoretical limit, CCspare will accordingly be small, and this means in turn that R will also be small. Conversely, if the channel is working well below its theoretical limit, both CCspare and R will be large. Written language is highly redundant, for example, because it uses more letters than it needs to. Spoken language similarly, because it uses more words, and more simultaneous frequencies in transmitting those words, than it needs to. The phenomenon is actually quite easy to demonstrate. In one early study, Huey (1908) compared reading speeds for a variety of in-some-way-incomplete forms of text. He found that the top half of a line of print conveyed more information than the bottom half, and that the left-most fragment of each word conveyed more information than the right. This is shown in Figure 6.
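Returning for a moment to the redundancy equation, the relationship between channel loading and R can be checked with a short Python sketch. It assumes that CCspare is simply CCmax minus the capacity actually in use; the function name and the sample figures are invented for illustration.

```python
def redundancy_percent(cc_max, cc_used):
    """R = (CCspare / CCmax) x 100, taking CCspare = CCmax - CCused (an assumption)."""
    if cc_max <= 0 or not 0 <= cc_used <= cc_max:
        raise ValueError("need cc_max > 0 and 0 <= cc_used <= cc_max")
    cc_spare = cc_max - cc_used
    return cc_spare / cc_max * 100

print(redundancy_percent(1000, 950))   # channel working close to its limit: small R
print(redundancy_percent(1000, 250))   # channel working well below its limit: large R
```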

Figure 6 - Redundancy in Text: Here is the first verse of the nursery rhyme "Mary Had a Little Lamb". Approximately 25% of the first line has been masked off from below, yet it remains easier to decipher than the second line which has a comparable amount of masking from above. The missing parts of lines which can nevertheless be immediately understood are, by definition, initially unnecessary - they are "redundant". The third line has had each word masked off from the right, and is appreciably harder to make sense of. The fourth line is effectively illegible, save that in the context of the earlier lines it can be guessed at (which means, strictly speaking, that the entire fourth line was redundant anyway).

If this diagram fails to load automatically, it may be accessed separately at

http://www.smithsrisca.co.uk/PICshannon-fig6.gif

Much the same pattern is found with auditory stimuli, where unnecessary frequencies can be artificially filtered out without any immediate loss of intelligibility. Wood (1955), for example, claims only a 2% decrease in the intelligibility of speech despite filtering out all speech sounds below 500 Hz (and despite the fact that this represented an energy loss of 60%). Indeed, the deterioration is only 35% if all sounds below 1500 Hz, or alternatively all sounds above 1500 Hz, are filtered out (energy losses of 90% and 10% respectively). Redundancy is a measure of predictability, therefore. It is a means of making it "increasingly difficult to make an undetectable mistake" (Cherry, 1957:185). And why does all this matter? Because the brain is a very noisy system, and human communication is highly redundant.
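The redundancy equation given earlier in this section can be tried out numerically. The sketch below is illustrative only: the function name is hypothetical, and it assumes that CCspare is simply the unused part of the channel (CCmax minus the capacity actually in use), which is how the discussion above appears to treat it.

```python
def redundancy_percent(cc_used, cc_max):
    """Return R = (CCspare / CCmax) x 100.

    Assumption: CCspare = CCmax - CCused, i.e. the unused capacity.
    """
    if cc_max <= 0:
        raise ValueError("CCmax must be positive")
    if not 0 <= cc_used <= cc_max:
        raise ValueError("used capacity must lie between 0 and CCmax")
    cc_spare = cc_max - cc_used
    return 100 * cc_spare / cc_max


# A channel working close to its limit has little redundancy ...
print(redundancy_percent(27000, 30000))  # 10.0
# ... whereas one working well below its limit has a great deal.
print(redundancy_percent(6000, 30000))   # 80.0
```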

Exercise 3 will help the student apply the redundancy principle to the real world.

Exercise 3 - The Idealised Communication System

1 Consider the specimen communication systems analysed in Exercise 2, and ask yourself what would happen if the recipient failed to understand the message. Which communication systems allow immediate feedback of this problem, and which do not?

2 Replicate Huey's (1908) study for "top half" text. Use a variety of stimulus sets, varying the amount of vertical masking from 25% to 75% in 5% intervals. Plot (a) accuracy and (b) reading speed against percentage masking, and determine the percentage at which there is the greatest rate of deterioration of performance (ie. the point of greatest slope).

3 Repeat (2) but using "bottom half" text.

4 Repeat (2) but using masking from top and bottom simultaneously.

5 Compare results (2), (3), and (4).

5 - Communication Networks

To apply the idealised two-party communication system to more than two communicators, you simply have to create a channel between each pair of nodes. This gives you a communication network, and communication networks, of course, are what you have at the macro level in social groups and organisations and at the micro level in modular processing systems (robotic and biological). As such, their formal study dates from Hartley's era rather than from Shannon's, having been introduced in the 1930s - as "sociograms" - by such workers as Moreno (1934) and Lewin and Lippitt (1938), and having then been further developed by Bavelas (1948). What makes networks troublesome, however, is the fact that it is possible to link up the participating nodes in different ways, as now shown in Figure 7.

Figure 7 - Communication Networks: Here are some specimen networks of different levels of complexity. Each circle (highlighted yellow) represents a communication node, and each straight line represents a "station-to-station" Shannonian channel. Each node contains a finite amount of processing capacity, but communication is a considerable overhead. As a result, part of a node's overall functionality has to be dedicated to nothing more productive than managing the links with other nodes. This line management overhead is shown in black, and can seriously erode the node's ability to perform its primary role. It follows that allowing two-way communication between all the parties in a network is rarely the right thing to do, because the required number of communication channels increases ever more rapidly with the number of nodes in the network. Thus .....

If there are two parties to a communication you need only one channel.

If there are three parties to a communication you need three channels (ie. you add two).

If there are four parties to a communication you need six channels (ie. you add three).

If there are five parties to a communication you need ten channels (ie. you add four).

If there are six parties to a communication you need 15 channels (ie. you add five).

If there are seven parties to a communication you need 21 channels (ie. you add six).

In the two-node network (top left), there is - because there is only one other node available to talk to - only one line management process per node. However, in the four-node network (top right) there are three, in the six-node network (bottom left) there are five, and in a ten-node network (not shown) there would be nine! It follows that nodes should communicate directly to other nodes only when they positively have to. The rest of the time they should either not bother communicating at all, or else should take an indirect route via an intermediate node or two. In the six-node networks shown, indirect routing of this kind allows the overhead to be reduced from five line management processes per node in the fully interconnected system (bottom left) to two per node in the not-fully-interconnected system (bottom right), with consequent improvement in the relative amount of intelligence available for productive work. As we have pointed out elsewhere (Smith, 1991), the same considerations operate to restrict "overmodularisation" of distributed processing systems such as the Central Nervous System.

If this diagram fails to load automatically, it may be accessed separately at

http://www.smithsrisca.co.uk/PICshannon-fig7.gif
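The channel counts listed in the Figure 7 caption follow the familiar pattern n(n-1)/2 for a fully interconnected network of n nodes, with each node managing n-1 links. The short Python sketch below reproduces those figures; the function names are hypothetical and are introduced here purely for illustration.

```python
def channels_needed(n_nodes):
    """Channels required to link every pair of nodes: n(n-1)/2."""
    return n_nodes * (n_nodes - 1) // 2


def links_per_node(n_nodes):
    """Line management processes per node when fully interconnected."""
    return n_nodes - 1


# Reproduce the series given in the caption (two to seven parties).
for n in range(2, 8):
    print(n, channels_needed(n), links_per_node(n))

# The ten-node case mentioned in the caption.
print(10, channels_needed(10), links_per_node(10))  # 10 45 9
```

The loop gives 1, 3, 6, 10, 15, and 21 channels for two to seven parties respectively, which is the series set out in the caption.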

The practical impact of the unrelenting technical demand for information counterflow is that in networks of any size it is simply not a good idea for everyone to be free to talk to everyone else. Despite the fact that "without communication there could be no organisation" (O'Shaughnessy, 1976:209), communication must be strictly "rationed". Indeed, it was presumably in humankind's attempts to work its way around this very contradiction that such things as division of labour, job descriptions, and (more recently) hierarchical organisational structure charts were invented. Divisions of labour allow every node to have a precisely defined job to do, so that it only needs to communicate when it has something specific to say (and that communication will probably be with another precisely defined node). Curiously enough, the mathematical analysis of sociograms has advanced all too slowly since Bavelas's time. For example, O'Shaughnessy (1976) was still using simple unquantified node-and-arrow diagrams while discussing how best to design communication networks for maximum net flow of information.

ASIDE: In fact, honourable mention is due to communications theorist David K. Berlo for his S-M-C-R Model (Berlo, 1960). Berlo has managed, more or less single-handedly, to keep the Shannonian tradition alive in the world of applied psychology.

And why does all this matter to psychologists? Because brains are modular communication networks, and so, too, are the groups of people in whose skulls they reside. In short, because the laws of efficient network communication are also the laws of efficient cognition.

6 - The OSI Model and Modern Psycholinguistics

In Figure 5, we introduced the term "pragmatic flow" to describe the delivery of one party's fully reasoned thoughts during a conversation. We derived this term from "pragmatics", the linguistic science of communicative intent, an area of research which grew from seminal works in the 1960s by the linguistic philosophers John Langshaw Austin (1911-1960) and John Searle (1932-). Austin (1962) and (his student) Searle (1969) devised a valuable new method of communication analysis, based upon their concept of individual competency at the level of the "speech act". They saw speech acts not just as the words which people happen to use, but as units of intentional achievement, and the formalised study of the communication of deep intent has since grown to be a major subscience of both linguistics and cognitive psychology. Figure 8 shows how the psychology, the linguistics, and the communication theory can be put to work simultaneously.

Figure 8 - A Seven-Layered Five-Node Communication Network: Here we use the OSI architectural concepts to trace the flow and counterflow of information in a five-node communication network. Each node represents a person, and each person's mind is shown as a seven-layered cognitive system. The processes which therefore need to be taken into account are A1 to A7, B1 to B7, C1 to C7, D1 to D7, and E1 to E7, and the challenge is to arrange for the five nodes to communicate in "conversational" fashion, that is to say, for them to take part in a mutual exchange of ideas, together with all the turn-taking, acknowledging, questioning, and "repairing" which this naturally involves. The diagram shows a frozen moment in time when Node A has just spoken the sentence shown (in fact, one of the deliberately ambiguous sentences much loved by psycholinguistics lecturers). Nodes B to E have been listening, but with different degrees of success. Node B has understood the utterance but not its deixis and is responding with a request for clarification, Node C has understood the words but cannot make sense of the sentence and is responding with a request for clarification of a different sort, Node D has misheard the sentence because it was somewhat gabbled and is responding with a request for a slower retransmission, and Node E has misheard the third word and is responding with a request for a word retransmission. The four suggested replies are shown. What matters, note, is that these replies are coming from different layers of the cognitive hierarchies in question (highlighted in blue), specifically Layers B7, C6, D4, and E2, respectively. Pragmatics, in other words, while it appears at first sight as a set of entirely "high-level" mental functions, emerges upon inspection as a processing hierarchy, and the implications of this are (a) that the repertoire of speech acts must be similarly layered, and (b) that so too must volition itself be.

If this diagram fails to load automatically, it may be accessed separately at

http://www.smithsrisca.co.uk/PICshannon-fig8.gif

We are not suggesting, of course, that biological communication systems are organised along precisely the same lines as telecommunications systems, merely that there are sufficient similarities to warrant further investigation. In fact, Halsall (1988) makes the telecommunications-biology metaphor quite explicit from the other direction:

"The presentation layer is concerned with the representation (syntax) of the data during transfer between two correspondent application layer protocol entities. [It] thus negotiates and selects the appropriate transfer syntax(es) to be used during a transaction so that the syntax (structure) of the messages being exchanged between two application entities is maintained." (Halsall, 1988:211.)

Halsall goes on to illustrate his argument by asking us to imagine a telephone conversation between French and Spanish monolinguals via interpreters at either end. The interpreters are the ones who are actually using the telephone, and their common language is, say, English. In this example, the correspondents are the application layer entities, and the interpreters are the presentation layer entities; French and Spanish are the local syntaxes, and English the transfer syntax. The interpreters do not, it is assumed, contribute to the semantic content of the exchange. Exercise 4 will help the student explore the sort of problems which might then be encountered.

Exercise 4 - Communication Networks

1 Draw one-layer network diagrams (ie. "sociograms" or Bavelas Diagrams) of the following specimen communication networks:

Lecture; Dinner Party (eight persons); Theatrically Staged Dinner Party (eight persons plus audience); Two-Person Conversation via Interpreter; Televised Political Debate (three politicians, one interviewer, plus studio and home audiences)

2 Working to the format used in Figure 8, draw seven-layer network diagrams of the following specimen communication systems. Identify local and transfer syntaxes separately where these are different, and illustrate as many of the four different levels of conversational repair as possible.

A telephone conversation in English between monolingual English speakers; A semaphore conversation in French between a monolingual French speaker and a bilingual English speaker; A face-to-face conversation in French between a monolingual French physicist and a monolingual English physicist via a bilingual interpreter who knows little about physics.

3 Using different colour highlighting pens, mark all instances in (2) where physically separate backchannels are used for OSI L2/4 and OSI L6/7 feedback.

7 - Some Closing Words on the Transmission Backchannel

This completes our brief introduction to the general relevance of the OSI concept in biological communication. We have seen that minds are frequently described as hierarchically organised modular processors, but that in their haste to get at the psychological or philosophical substantives the models in question frequently overlook the supporting technicalities. They say very little about the control logic which has to be in place for the substantive flows to take place. This is unfortunate, because it is here - in amongst all the biological simplex and half-duplex circuitry, feedback circuits, repeater stations, and interrupts - that the true complexity of the cognitive system undoubtedly lies. Feedback, for instance, is nothing less than a theoretician's nightmare, for it may travel by a variety of routes, long and short, conscious and unconscious, direct and indirect, and possibly - using "antidromic" signalling - duplexed back up the fibre tract the original message arrived down.

Key Concept - Antidromic Transmission: The term antidromic means "flowing back towards the source", and thus the opposite of orthodromic. The term can be applied to information feedback in its many senses (especially if the feedback has been duplexed back up the wire of original transmission), but has also been applied to neuronal activity, where it denotes electrical excitation applied at the distal end of an axon and propagating "up the down escalator" towards the soma. One of the classic works here was Woolsey and Chang (1947), and for a typical modern application see Lee Campbell's work at the Salk Institute.

The need for feedback is everywhere in all communication, biological or otherwise. However, it is vitally important to distinguish between feedback from a distant station and the replies from the distant party, because whilst the former fulfils only the technical requirements for fluency, the latter are part of the ongoing conversation. It follows that feedback does not need to be relayed to the local party unless and until things start to go wrong with the line, whilst replies always need to be relayed. [Readers who are confused at this distinction need only to ask themselves how their telephones know what to do when the number they have called turns out to be engaged - you are "talking" to the distant station, sure enough, but not to the person you rang! Similarly, it is better to get a dialling tone as soon as your remote party hangs up on you, rather than spend the next 20 seconds talking to thin air!] Moreover, feedback works significantly better if it can come back along a different pathway to the outgoing message, meaning that an ostensibly one-way transmission will often require two physical links to be in place. Appendix A shows some of the key terms.

8 - Useful Hyperlinks

For further background on the Shannonian system as applied to modern telecommunications networks, see Tschudin (2000 online).

References

See the Master References List

APPENDIX A - HOW MANY WIRES IS A WIRE?

This glossary was published as a stand-alone web resource in January 2000, and has been imported into this paper with minor alteration. Several important telecommunications concepts are defined. They were developed long ago in the age of wire-based telecommunications, and sensu stricto have long since been left behind by the age of satellites and computers; nevertheless, the underlying concepts retain immense illustrative value, especially for those investigating the principles of biological processing hierarchies.

Simplex Telecommunication

A single wire can be used to send information in either direction between two points, but only if both ends are rigged to act interchangeably as transmitter and receiver, and even then not simultaneously. Since interchangeability is expensive, single wire systems are usually reserved for one-way communication. This means that there can be no line management feedback, of course, and this means in turn that performance will be poor whenever transmission difficulties are encountered. Somewhat counter-intuitively, then, it is usually better to use two wires when sending information in one direction. The forward wire is used to carry the substantive information flow from the transmitting station, and the return wire is used to carry the counterflow of line management feedback from the receiving station. Here are some examples:

SPECIMEN ONE-WIRE ONE-WAY SIMPLEX SYSTEMS = simple key-to-buzzer telegraph; simple cordpull-to-bell chambermaid's telegraph, etc.

SPECIMEN ONE-WIRE TWO-WAY SIMPLEX SYSTEM = children's "cup and string" telephone system.

SPECIMEN TWO-WIRE ONE-WAY SIMPLEX SYSTEM = engine-room telegraph (this is a one-way system to the extent that orders only ever go downwards, even though the return wire allows their safe receipt to be explicitly acknowledged).

For the missing entry, the TWO-WIRE TWO-WAY system, see the half-duplex panel below.

Duplexing and Multiplexing

Due to the cost of laying and maintaining their cables, the telegraph and telephone companies were under pressure from the outset to reuse their resources to the absolute limit. One of the earliest tricks of the trade - and nowadays the mainstay of the entire global telecommunications industry - was that of making one wire behave as if it were many. Here are the key terms .....

Duplex (1): In standard English, this simply means double or two-fold, as with poets who happen also to be critics, or lamps with two wicks (Oxford English Dictionary; earliest instance dated 1817). The word was then borrowed by early telecommunications - see next item.
Duplex (2): In its technical sense, duplexing means sending two messages down a single wire (Oxford English Dictionary) by making it somehow think it is two wires. Nowadays, there are many ways to do this, but they all boil down either to time sharing or some form of selectively "filterable" encoding. The facility is expensive in terms of transmitter and receiver complexity, but it doubles the number of paying subscribers on a given cable. The practice is far from new, with Thomas Edison, for example, taking out several patents for duplexing techniques as long ago as the 1870s. And do you need to stop at only two wires? No, because duplexing is merely the simplest instance of multiplexing .....
Multiplex: ..... which is where a single wire is made to think it is many wires, thus enabling the number of subscribers for a given network investment to be similarly multiplied.
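The "time sharing" approach to multiplexing mentioned above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a description of any real telegraph or telephone equipment: the function names are hypothetical, each message is assumed to be the same length, and each "time slot" carries exactly one character.

```python
def multiplex(messages):
    """Interleave equal-length messages onto one 'wire', one symbol per time slot."""
    if not messages:
        return ""
    length = len(messages[0])
    if any(len(m) != length for m in messages):
        raise ValueError("this sketch assumes messages of equal length")
    # zip(*messages) yields the symbols due in each successive round of slots.
    return "".join("".join(slot) for slot in zip(*messages))


def demultiplex(stream, n_channels):
    """Recover each message by reading every n-th symbol from the shared stream."""
    return [stream[i::n_channels] for i in range(n_channels)]


wire = multiplex(["HELLO", "WORLD"])
print(wire)                  # HWEOLRLLOD
print(demultiplex(wire, 2))  # ['HELLO', 'WORLD']
```

With two messages this is duplexing in the sense defined above; passing three or more messages to the same functions gives the general multiplex case.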

Half-Duplex or Full-Duplex

We are now ready to allow conversational turn-taking into our system, by utilising our second simplex wire (above) to carry the facilitatory backchannel traffic .....

Key Concept - Half-Duplex Transmission: This is where two wires are used to set up a channel capable of sending information in both directions, but not simultaneously. As with the simplex set-up, the backchannel is used to carry the counterflow of line management data from the receiving station. Now, however, the receiving station is free in due course to take over as transmitting station, and to reply conversationally - all that needs to happen is for the lines to be "turned around", that is to say, for both transmission directions to be reversed. The processing overheads under this sort of set-up are heavier than for the simplex set-up, due to the need for both stations to be equipped with both transmitting and receiving functionality.

SPECIMEN SYSTEMS = radio-telephone / Citizens' Band (where the right to transmit alternates between users, typically by concluding each utterance with the tag "over"); e-mail system (where the system keeps you reasonably confident that your message has arrived at the distant station, but where nothing less than an explicit conversational reply can give you 100% certainty that it has eventually been attended to).
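The turn-taking discipline of a half-duplex link can be illustrated with a small Python sketch. Everything here is hypothetical and simplified: the class name is invented for the purpose, the line is "turned around" whenever an utterance ends with the tag "over", and the backchannel of line management data is not modelled at all.

```python
class HalfDuplexLink:
    """Two stations share a link; only the station holding the turn may transmit."""

    def __init__(self, station_a, station_b):
        self.stations = (station_a, station_b)
        self.transmitter = station_a  # station_a is assumed to speak first
        self.log = []

    def send(self, station, text):
        if station != self.transmitter:
            raise RuntimeError(f"{station} must wait for the line to be turned around")
        self.log.append((station, text))
        # The tag "over" hands the right to transmit to the other station.
        if text.strip().lower().endswith("over"):
            self._turn_around()

    def _turn_around(self):
        a, b = self.stations
        self.transmitter = b if self.transmitter == a else a


link = HalfDuplexLink("A", "B")
link.send("A", "Are you receiving me, over")
link.send("B", "Loud and clear, over")
print(link.transmitter)  # A
```

An attempt by either station to transmit out of turn raises an error, which is precisely the restriction that the full-duplex arrangement removes.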

There remains one major problem - natural human communication is full of interruptions, interjections, contradictions, and generally sloppy turn-taking. Both parties need to be able to shout simultaneously, and they cannot do this on a half-duplex system. In fact, they both need half-duplex systems permanently at their disposal, which gives us, of course, a "full-duplex" set-up.

Key Concept - Full-Duplex Telecommunication: This is where four wires are used to set up a channel capable of sending information in both directions fully simultaneously. Unlike the half-duplex set-up, there is no longer any need for the lines to be turned around before the receiving station can transmit its reply. This gives full biological functionality, that is to say, communication which is characterised by the two parties being able to interrupt each other at will (except when routed via satellite, when the occasional half second delay still plays havoc with the fluency of your conversation).

SPECIMEN SYSTEM: The telephone.

For further illustration of the difference between half-duplex and full-duplex linking, see the recent IBM White Paper on switched Ethernet LANs.