Lecturer's Précis - Norris (1991) "The Constraints on Connectionism"
Copyright Notice: This material was written and published in Wales by Derek J. Smith (Chartered Engineer). It forms part of a multifile e-learning resource, and subject only to acknowledging Derek J. Smith's rights under international copyright law to be identified as author may be freely downloaded and printed off in single complete copies solely for the purposes of private study and/or review. Commercial exploitation rights are reserved. The remote hyperlinks have been selected for the academic appropriacy of their contents; they were free of offensive and litigious content when selected, and will be periodically checked to have remained so. Copyright © 2004-2018, Derek J. Smith.
First published online 14:40 GMT 17th March 2004, Copyright Derek J. Smith (Chartered Engineer). This version [2.0 - copyright] 09:00 9th July 2018.

Readers unfamiliar with the concepts of "connectionism" and "neural network" should pre-read our e-handout on "Connectionism".
1 - Introduction
Norris begins by summarising the value of connectionism to psychologists as follows .....
"The
recent revival of connectionism has probably created more excitement than any
other development in the history of cognitive psychology. Part of this
excitement is due to the fact that connectionist models naturally express a
number of characteristics which seem to typify human cognition. For example,
graceful degradation of behaviour following damage, content addressable memory,
and pattern completion are all cited as natural properties of connectionist
networks [.....]. The other source of the appeal of connectionism is that, like
the brain, connectionist networks are built from large numbers of highly
interconnected simple processing units. Therefore connectionism looks like a
good starting point for building brain-like models of cognition." (p293.)
However, connectionism's allure can be misleading, because connectionist nets cannot cope with every problem. Some tasks leave them floundering, and the idea that "you simply train your network" (p293) and then inspect the end result to discover how biological brains have been doing the task all along is flawed, because "connectionist learning algorithms aren't really that smart" (ibid.). The most serious weakness emerges whenever the problem at hand has to be solved in discrete steps, as now discussed.
The specific problem Norris gave his network was how, given any date in a century, to reply with what day of the week it was. It is worth familiarising oneself with what this task involves before proceeding .....
Exercise - What Day of the Week?

(1) Familiarise yourself with this task. What day of the week was/will be .....

- seven days ago
- ten days' time
- 49 days ago
- 48 days ago
- 7000 days' time
- 6999 days' time
- 11th September 2001
- 18th March 2042
- 29th February 2080
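For readers who want to check their exercise answers, the task the network faces can be stated in a few lines of ordinary code. This is a minimal sketch using Python's standard `datetime` library; the function name is our illustrative choice, not anything from Norris's paper. Note how offsets that are multiples of seven leave the weekday unchanged, which is exactly the kind of regularity a solver must exploit.

```python
from datetime import date, timedelta

def weekday_name(d: date) -> str:
    """Return the full name of the day of the week for a given date."""
    return d.strftime("%A")

# Relative questions: any offset that is a multiple of 7 days
# lands on the same weekday as today (7000 = 7 x 1000).
today = date.today()
assert weekday_name(today - timedelta(days=49)) == weekday_name(today)
assert weekday_name(today + timedelta(days=7000)) == weekday_name(today)

# Absolute questions from the exercise:
print(weekday_name(date(2001, 9, 11)))   # Tuesday
print(weekday_name(date(2042, 3, 18)))
print(weekday_name(date(2080, 2, 29)))   # 2080 is a leap year
```

Offsets such as 48 or 6999 days, by contrast, shift the weekday by the remainder modulo 7, which is why the paired items in the exercise differ by one day.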
Norris began by taking the classical connectionist approach. This involved trial-and-error training of a single network with a series of actual date-day pairs, thus .....
"We
started by training a network with a single layer of hidden units on 20 per
cent of the dates in a 50-year period. All it learned was the dates we trained
it on! The network's ability to generalise beyond the dates it was trained on
was negligible." (p295.)
So Norris then turned to the problem itself, and to cut a long story short, it turned out that the best solution was to treat the conversion in three separate steps, as follows .....
Preparatory: Select a single "base month", in which all date-day conversions are known.

Step #1: Given the target date, ignore the month and year digits, and look up the corresponding day digits in the base month.

Step #2: Now compute and apply an "offset" to account for the difference (if any) between the target month and the base month.

Step #3: Now compute and apply another "offset" to account for the difference (if any) between the target year and the base year.
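The decomposition is easiest to see when written out symbolically. The sketch below implements the three steps as plain calendar arithmetic rather than as trained networks; the choice of January 2000 as base month, the day names, and all function names are our illustrative assumptions, not details from Norris's paper.

```python
from datetime import date

# Assumed base month: January 2000. Valid for target years >= BASE_YEAR.
BASE_YEAR, BASE_MONTH = 2000, 1
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Preparatory: a look-up table in which "all date-day conversions are known"
# for the base month (built here from the standard library for convenience).
base_lookup = {d: date(BASE_YEAR, BASE_MONTH, d).weekday()
               for d in range(1, 32)}

def is_leap(y: int) -> bool:
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)

# Day counts for a non-leap year; the leap day is handled below.
MONTH_DAYS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def month_offset(month: int, year: int) -> int:
    # Step #2: days elapsed from the base month to the target month, mod 7.
    off = sum(MONTH_DAYS[:month - 1])
    if month > 2 and is_leap(year):
        off += 1                     # leap day falls between Feb and Mar
    return off % 7

def year_offset(year: int) -> int:
    # Step #3: each ordinary year shifts the weekday by 1, each leap year by 2.
    off = 0
    for y in range(BASE_YEAR, year):
        off += 2 if is_leap(y) else 1
    return off % 7

def day_of_week(day: int, month: int, year: int) -> str:
    # Step #1 (base-month look-up) plus the two offsets.
    idx = (base_lookup[day] + month_offset(month, year)
           + year_offset(year)) % 7
    return DAYS[idx]

print(day_of_week(11, 9, 2001))      # Tuesday
```

Each of the three functions corresponds to one of Norris's sub-networks; the point of the exercise is that each sub-problem has a small, learnable structure, whereas the composite date-to-day mapping does not.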
Norris then took three separate neural networks, allocating one to each element of the superordinate task. Each network was then separately trained up on what it alone had to learn [thus the Step #1 network was trained up on the Step #1 conversion, and so on]. Only when this had been done were they allowed to communicate one to another, whereupon "the whole net performs at about 90 per cent correct, about as good as the best date calculators in the literature" (p295). The secret, in other words, is to break the problem down into its logical components. "Instead of having to solve one large problem," Norris writes, the connectionist network was much cleverer when it "simply had to solve three far smaller problems" (ibid.). Here is Norris' critical observation .....
"We
should not be surprised to find that connectionism offers no easy solutions. [because] the only way we can build interesting connectionist
models is by first understanding the structure of complex tasks like language
understanding or face recognition. Connectionism will not furnish that
understanding for us." (p296; bold emphasis added.)
2 - Evaluation
Norris was one of the first to demonstrate that there exist tasks which require separate neural networks, networks which, moreover, must be connected up in a very precise fashion and trained in a very precise sequence. What this gives you, of course, is a network of neural networks, in which the connections within AND BETWEEN each module are both vitally important. This, for all the technology at its disposal, places connectionism conceptually back with the nineteenth-century diagram makers. As Norris himself concludes, the modularity of the cognitive system is going to have to be deciphered in the first instance by humans, not by machines .....
"If
we knew how to knit together a language processing network in the way I have
done for the date-calculation task we would already know most of the answers to
the really difficult theoretical problems in psycholinguistics. We would have
understood the algorithms and we would know how they fitted together. Connectionism
will just provide the theoretical tools for building the model and testing it
out." (p296.)