Course Handout - The "General Purpose Computer", circa 1950
Copyright Notice: This material was written and published in Wales by Derek J. Smith (Chartered Engineer). It forms part of a multifile e-learning resource, and, subject only to acknowledging Derek J. Smith's rights under international copyright law to be identified as author, may be freely downloaded and printed off in single complete copies solely for the purposes of private study and/or review. Commercial exploitation rights are reserved. The remote hyperlinks have been selected for the academic appropriacy of their contents; they were free of offensive and litigious content when selected, and will be periodically checked to have remained so. Copyright © 2003-2018, Derek J. Smith.
First published online 13:12 BST 7th May 2003. This version [2.0 - copyright] 09:00 BST 5th July 2018.
Although this paper is reasonably self-contained, it is best read as a one-part subordinate file to Part 3 of our six-part review of how successfully the psychological study of biological short-term memory (STM) has incorporated the full range of concepts and metaphors available to it from the computing industry.
1 - Introduction
For a computer to qualify as a "General Purpose Computer" (GPC) (or "Eckert-von Neumann machine"), it has to combine the following qualities: a central processing unit, a main memory holding both instructions and data, input/output devices, and an interconnecting bus - the components examined in Sections 2 to 6 below.
This combination of features took approximately 120 years of gestation, beginning with Charles Babbage's modular Difference Engine (see Part 1) and culminating in the breakthrough technical vision of Princeton University's John von Neumann in the closing years of the Second World War. This conceptual vision was then explained in detail at the 1946 Moore School summer school, further refined on the slow and troubled EDVAC development, and actually made to work on the Eckert-Mauchly BINAC project, the Manchester University Mark 1, the NPL Pilot ACE, the Princeton IAS machine, and the Cambridge EDSAC (and it was EDSAC's Maurice Wilkes who - a half century later - assessed John Presper Eckert as the single most influential of the "hands-on" engineers, and who accordingly suggested the epithet "Eckert-von Neumann machine"). These successive developments shaped the computer as we know it today, and gave us much of our modern computer jargon. In the sections which follow, we look in greater detail at the various components of von Neumann's vision and Eckert's engineering, and show the typical architecture of a "generation zero" computer from the year 1950.
2 - The Central Processing Unit (CPU)
The CPU is where the GPC
"thinks". Specifically, it contains a "Control Unit
(CU)" to administer the sequencing of its thoughts (Babbage's Module
#1), and an "Arithmetic/Logic Unit (ALU)" to act upon those
thoughts in some appropriate way (Babbage's Module #2). Each step within the
overall control sequence is called a "machine instruction",
and a logically complete and coherent sequence of machine instructions
constitutes a "program".
Key Concept - Machine Instruction: Machine instructions tell the Control Unit what to do. They always consist of a relatively short binary code known as an "op code", usually (but not invariably) followed by one or more binary codes to be used in the resulting operation. The additional codes are called "operands",
and may either (a) fully specify a data value (a number or a letter-string,
say) in an absolute sense, or else (b) state the "address"
where that data value may be found elsewhere in the machine. The fully
specified data values are known as "immediate" values, and the
others as "variables". The full repertoire of instructions
available to a given machine is known as its "instruction set"
[we reproduced the entire instruction set for the EDSAC in Part 3, in the section
entitled The Cambridge University EDSAC].
Example: An instruction of the form
<ADD 4,B> is asking the machine to add the
immediate value 4 to the number currently stored in address B (whatever that
should happen to be at that instant), whereas <ADD A,B> is asking the
machine to add the number currently stored in address A (whatever that should
happen to be at that instant) to the number currently stored in address B
(whatever that should happen to be at that instant). The <ADD> part
of the instruction is the op code, and the associated values and/or address(es) are the operands - the
things operated on.
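The distinction between the two ADD forms can be sketched in a few lines of Python. This is purely illustrative - the register names and starting values are invented for the example, not drawn from any real instruction set.

```python
# Hypothetical sketch: modelling the two ADD forms from the example above
# with a simple dictionary standing in for addressable storage.
registers = {"A": 7, "B": 10}

# <ADD 4,B> -- the immediate value 4 is added to the contents of B.
registers["B"] += 4               # B becomes 14

# <ADD A,B> -- the contents of A (a variable) are added to the contents of B.
registers["B"] += registers["A"]  # B becomes 14 + 7 = 21

print(registers["B"])  # -> 21
```

Note that the second instruction's result depends on whatever A happens to hold at that instant, exactly as the text describes.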
The number of operands associated
with a given op code is logically determined by the commonsense nature
of the instruction concerned. Thus the instruction for <STOP RUN>
requires no operands at all, whilst that for <ADD A TO B, AND PUT THE RESULT
IN C> requires three. The number of operands is also physically limited
by the basic wiring of the CPU, and as a result the general rule here is that
the fewer the better. An important trade-off therefore has to take place during the circuit design stage, between the size of the instruction set and the complexity of the individual instructions: in a three-operand design, the three operands can be handled by a single instruction; in a two-operand system, they can be handled by two instructions; and in a single-operand system they need to be handled by three instructions.
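The one-versus-three-operand trade-off can be made concrete by writing the same sum C = A + B both ways. The mnemonics shown in the comments are hypothetical, and the one-operand version's result is stored to a second cell purely so the two styles can be compared side by side.

```python
# Illustrative only: the sum C = A + B in a hypothetical three-operand
# style versus a one-operand (accumulator) style.
memory = {"A": 3, "B": 5, "C": 0, "C2": 0}
accumulator = 0

# Three-operand style: one instruction does everything.
#   ADD A, B, C
memory["C"] = memory["A"] + memory["B"]

# One-operand style: three instructions, with the Accumulator carrying
# the partial result between them.
accumulator = memory["A"]         # LOAD A
accumulator += memory["B"]        # ADD B
memory["C2"] = accumulator        # STORE C (written to a second cell here)

print(memory["C"], memory["C2"])  # -> 8 8
```

Both routes arrive at the same answer; the difference is how many trips around the instruction cycle it takes to get there.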
Most generation zero machines were
single-operand systems, and those which were not (notably the EDVAC and the
Pilot ACE) owed many of their technical difficulties to biting off more than
they could chew. For technical reasons, two-operand systems turned out to offer
little practical advantage over one-operand systems, so the usual choice was between
one- and three-operand architectures, the syntaxes for which are respectively
as follows .....
instruction = <opcode> [<operand1>]
instruction = <opcode> [<operand1> [<operand2> [<operand3>]]]
Key Concept - Mnemonic: In the above
examples, the English words STOP RUN and ADD are used as "mnemonics",
that is to say, they are memory aids intended for human consumption only. The
corresponding op codes remain bit sequences as previously stated. Software
products offering to convert mnemonics into binary for you were developed in
the early 1950s, are known generically as "Assemblers", and
are explained in greater detail in Part 4 and Part 5.
Key Concept - The "Word": The unit
of information storage and transport within a GPC varies from design to design,
but is typically in the range 16 to 64 bits, and is known as a
"word". As a general rule, the combined length of the op code plus
all the associated operands should not exceed that word length.
Key Concept - Instruction Length: The
number of operands is not the only factor affecting instruction length, because
the length of both the op code element and the operand(s) will vary with
the intended capacity of the machine being proposed. We deal with these two
factors separately:
Op Code Length: The
number of bits allocated to the op code field varies with the number of
different operations the machine's designers wish to offer. For example, a
two-bit op code would support only a four-item instruction set (because the
only codes available would be 00, 01, 10, and 11). If the op code was extended
to three bits, then this would support eight instructions (because the codes
then available would be 000, 001, 010, 011, 100, 101, 110, and 111). Similarly,
a four-bit op code would support 16 instructions, five bits 32, and so on, on a
"powers of two" basis. To put this issue into perspective, Wilkes
(EDSAC Project) estimated (a) that "less than six" different
single-address "order codes" were required to do all forms of
mathematics, and (b) that an instruction set of 16 to 32 items was enough for
"an entirely practical instrument" (Wilkes, 1956, p270).
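The "powers of two" rule from the paragraph above is simple enough to verify mechanically. This one-liner-style sketch just tabulates the figures quoted in the text.

```python
# An n-bit op code supports 2**n distinct instructions, as the text states:
# 2 bits -> 4, 3 bits -> 8, 4 bits -> 16, 5 bits -> 32.
for bits in (2, 3, 4, 5):
    print(bits, "bits ->", 2 ** bits, "instructions")
```

On this basis, Wilkes's "entirely practical" 16-to-32-item instruction set needs only a four- or five-bit op code field.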
Operand Length: The
number of bits allocated to an operand varies with the size of the available
Main Memory. This is because the usual purpose of an operand is to specify a
storage location within said memory. If eight bits are allocated to the data
address field, then this only supports 256 different addresses (two to the
power eight), and if one 32-bit data word is available at each address, this
restricts usable Main Memory capacity to 8192 bits (ie.
256 x 32). This is an absolute limitation, because even if you installed a lot
more memory you would be unable to address what was in it! Hence if designers
require more memory they need more bits in their address operands in order to
address it.
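The worked figures in the paragraph above can be checked directly. The arithmetic below reproduces the text's own example of an 8-bit address field and a 32-bit word.

```python
# An 8-bit address field supports 2**8 = 256 locations; with one 32-bit
# word at each address, usable Main Memory is capped at 8192 bits.
address_bits = 8
word_length = 32

locations = 2 ** address_bits          # 256 addressable locations
capacity_bits = locations * word_length

print(locations, capacity_bits)        # -> 256 8192
```

Doubling the address field to 16 bits would raise the ceiling to 65,536 locations - which is why bigger memories demand longer operands.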
ASIDE: In practice, word and operand
lengths varied quite considerably across the early development teams. Ed Thelen's website provides a number of handy comparison
tables showing the word and instruction length for most of the classic
computers. By the 1970s, most GPC
instruction sets contained 100-200 items, having grown with the sophistication
of the technology. However, in true Pareto fashion, 80% of the processing was
being done by 20% of the instructions, so a market gradually developed for specialist
high-performance systems based upon "reduced" instruction sets. The
result was "Reduced Instruction Set Computing (RISC)". Here are some specimen one- and three-operand
instruction lengths:
Manchester Mark 1: 20-bit single-operand
instruction; 26 instructions; 40-bit word.
EDSAC: 18-bit single-operand
instruction; 18 instructions; 36-bit word.
BINAC: 14-bit single-operand instruction; ? instructions; 31-bit word.
Whirlwind: 16-bit single-operand
instruction; ? instructions;
16-bit word.
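A single-operand instruction of the kind listed above packs an op code and an address into one bit field. The layout below - a 5-bit op code over a 13-bit address, making an 18-bit instruction - is invented for illustration and is not the actual format of any of the machines named; the shift-and-mask technique, however, is general.

```python
# Illustrative packing of a single-operand instruction into 18 bits:
# a 5-bit op code followed by a 13-bit address operand.
OPCODE_BITS, ADDRESS_BITS = 5, 13

def pack(opcode, address):
    """Combine op code and address into one instruction word."""
    return (opcode << ADDRESS_BITS) | address

def unpack(instruction):
    """Split an instruction word back into (opcode, address)."""
    return instruction >> ADDRESS_BITS, instruction & ((1 << ADDRESS_BITS) - 1)

word = pack(0b00111, 1234)
print(unpack(word))   # -> (7, 1234)
```

A 13-bit address field limits such a machine to 2**13 = 8192 addressable locations, illustrating again the operand-length constraint discussed above.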
As each program instruction falls
due to be acted upon, it is moved from the program area in the Main Memory Unit
(see next section) and stored momentarily in the "Instruction Register
(IR)", a dedicated area of electronic memory within the Control Unit.
Key Concept - Registers:
Registers are small, fixed purpose, high-speed memory stores, wired integrally into
any one of the major system modules. Like the Instruction Register, registers
serve a number of highly specific purposes. As detailed in Section 4 below, the
"Program Counter", for example, contains the address of the
next instruction to be executed, and is accordingly the first place to look if
anything goes wrong .....
ASIDE: If a program fails, it is
useful to "dump" the contents of memory onto a print-out to
help "debug" the program. The Program Counter is always high
up on the list of things to look at, because it shows exactly where the error
occurred. Dumps also list the contents of all the system registers, so that their contents can also be checked for appropriateness.
Unlike the Instruction Register and
the Program Counter, the machine's "General Registers" will
contain the data words input to, or produced by, the current machine
instruction. As such, they date back conceptually to the digit wheels on the
eighteenth-century calculating machines. There are typically half a dozen
general registers (the numbers grew as computers got bigger during the 1950s)
and they are usually identified by the letters A, B, C, etc. [our earlier
example <ADD A,B> was manipulating registers of
this type]. The "Accumulator" is a register used by one- and
two-operand systems to carry partial results from one instruction to the next.
Most registers will contain either (a) an instruction or a data word, or (b)
the address of an instruction or data word. This is an important
distinction, so here is an example of each:
Examples:
o The Instruction
Register contains an instruction. [It therefore begins with an op code selected
from the machine's instruction set, and this op code renders said instruction
meaningful to the Control Unit as one step within the overall control flow.]
o The Program Counter
contains the address of an instruction. [Other than by coincidence, this does
not begin with an op code, and is only meaningful to the Control Unit insofar
as it identifies a memory location somewhere in the Program Area.]
o General Registers
contain data words.
o Index Registers (a
type of register which points to the currently active data item within an array
of like data items) contain the addresses of data words.
Nowadays there can be several dozen
registers within the CPU alone. The precise number of registers is decided by
the hardware designers, and for maximum efficiency the functionality provided
by each needs to dovetail very precisely with the machine's instruction set,
which needs to dovetail, in turn, with the machine's various logic circuits
(see below). The registers discussed in the remainder of this paper should be
regarded as only the minimum requirement. The basic physical components of
registers are bistable single-bit memory units called
"flip-flops".
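Since registers reduce physically to rows of flip-flops, a behavioural sketch is easy to give. The class below models only the bistable *behaviour* of a set/reset flip-flop, not its electronics, and the eight-bit register built from it is an invented example.

```python
# A flip-flop as the text describes it: a bistable one-bit memory unit.
class FlipFlop:
    def __init__(self):
        self.state = 0      # holds 0 or 1 until explicitly changed

    def set(self):
        self.state = 1

    def reset(self):
        self.state = 0

# A register is then simply a row of flip-flops, one per bit:
register = [FlipFlop() for _ in range(8)]
register[0].set()
register[3].set()
print([ff.state for ff in register])   # -> [1, 0, 0, 1, 0, 0, 0, 0]
```

The key property is stability: each unit keeps its bit, without further attention, until the next control signal arrives.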
Both the Control Unit and the
Arithmetic/Logic Unit are decision making units. This means they have been
designed and built as "logic circuits", circuits whose task in
life is to manipulate one string of binary impulses according to the content of
another string of binary impulses. Binary input is thus "processed"
in a predetermined and orderly fashion to produce binary output. In the Control
Unit, this binary input is the op code, and the corresponding binary outputs
are the electronic signals necessary to activate the Arithmetic/Logic Unit. In
the Arithmetic/Logic Unit, the inputs are the control signals coming out of the
Control Unit, plus any relevant operands, and the output is the required
arithmetical result in the required memory location. Logic circuits were originally
invented by the likes of Charles Wynn-Williams, Konrad Zuse,
George Stibitz, and John Atanasoff (see Part 2), and their
basic physical components are called "logic gates".
Key Concept - Logic Gates: Logic
gates are electronic switches capable of executing Boolean decision making,
that is to say, combinatory binary symbolic logic of the form developed in the
nineteenth century by George Boole and Augustus de Morgan.
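The Boolean decision making a logic gate performs can be stated in three tiny functions. These are the standard AND, OR, and NOT operations on single bits, written out explicitly for clarity.

```python
# The basic Boolean operations implemented by logic gates, on bits 0 and 1.
def AND(a, b):
    return a & b          # 1 only if both inputs are 1

def OR(a, b):
    return a | b          # 1 if either input is 1

def NOT(a):
    return 1 - a          # inverts the input bit

print(AND(1, 1), OR(0, 1), NOT(1))   # -> 1 1 0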
Lewin (1985) describes logic
circuits as "combinational networks", and summarises
their operating principles as follows .....
"A combinational logic circuit is
one in which the output (or outputs) obtained from the circuit is solely
dependent on the present state of the inputs. [] The classical objective of
combinational design is to produce a circuit having the required switching
characteristics but utilising the minimum number of
components [] Switching problems are usually presented to the designer [] specifying
the logical behaviour of the circuit. From this
specification a mathematical statement of the problem can be formulated [and]
simplified where possible. These simplified equations may then be directly
related to a hardware diagram ....." (Lewin, 1985,
pp53-54; emphasis original.)
The upshot is that when the
Arithmetic/Logic Unit receives control signals from the Control Unit it acts
upon them in some very basic way as it has been wired up to do. These actions
may involve setting individual bits or bit patterns, adding binary word to
binary word, shunting a bit pattern to the left or right along its register,
testing the current value of individual bits or bit patterns, etc. Fortunately
for the design team, there are not many of these basic manipulations, and the
burden falls instead on the programming team, who have
to spend considerable time stringing simple manipulations together to do more
complicated things. For early textbook examples of logic circuits, see Berkeley (1949) or Wilkes (1956); for a later textbook account, see Lewin (1985). Note the subtly different circuit designs required for counters, accumulators, serial and parallel adders, shift registers, and product registers.
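The "stringing simple manipulations together" that the paragraph above describes can be demonstrated with the classic example: a full adder built from nothing but the basic gate operations, then chained to add two multi-bit words. The bit-list representation (least significant bit first) is a convenience of the sketch, not a claim about any particular machine.

```python
# A full adder composed from basic gate operations (AND, OR, XOR) ...
def full_adder(a, b, carry_in):
    s = a ^ b ^ carry_in                        # sum bit
    carry_out = (a & b) | (carry_in & (a ^ b))  # carry bit
    return s, carry_out

# ... chained along two equal-length words, least significant bit first.
def add_words(x_bits, y_bits):
    result, carry = [], 0
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        result.append(s)
    return result, carry

# 0101 (5) + 0011 (3) = 1000 (8), written LSB first:
print(add_words([1, 0, 1, 0], [1, 1, 0, 0]))   # -> ([0, 0, 0, 1], 0)
```

Three gates' worth of logic per bit position, repeated along the word, yields binary addition - a fair miniature of how simple manipulations compound into useful arithmetic.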
The speed of a logic circuit is
usually measured in "instructions per second" (nowadays, millions
of instructions per second, or "mips"),
and this is determined in turn by the "clock rate" of the system .....
Key Concept - Clock Rate: The
logic gates which make up a computer's logic circuits, and the flip-flops which
make up its registers, are electronically "pulsed" at a very precise
rate, and any resulting changes of bit pattern - the fundamental activity of
data processing - take place at a particular point in this pulsing cycle. The
rate is expressed in "Hertz" and abbreviated Hz. One cycle per second
(cps) is one Hz, one thousand cps is one kilohertz (kHz), one million cps is
one Megahertz (MHz), and one billion cps is a Gigahertz (GHz).
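The unit relationships just listed are simple powers of ten, shown here for completeness. The 500 kHz figure is an arbitrary round number chosen for the example, not the clock rate of any specific generation-zero machine.

```python
# Clock rate units from the text: 1 kHz = 10**3 cps, 1 MHz = 10**6 cps,
# 1 GHz = 10**9 cps.
Hz, kHz, MHz, GHz = 1, 10**3, 10**6, 10**9

# A machine clocked at, say, 500 kHz completes this many pulse cycles
# every second:
print(500 * kHz)   # -> 500000
```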
3 - Main (Or "Primary") Memory (Or "Store")
USEFUL QUOTATION: "The Store may be
considered as the place of deposit in which the numbers and quantities given by
the conditions of the question are originally placed, in which all the
intermediate results are provisionally preserved, and in which at the
termination all the required results are found" (Babbage, 1837).
The Main Memory Unit (MMU) is the
GPC's general purpose high-speed memory resource. It may be distinguished from
the memory provided by the CPU registers (a) qualitatively (because it is
general purpose rather than fixed purpose), and (b) quantitatively (because
there is much more of it).
Now one of the cleverest aspects of
von Neumann's conceptual GPC was that the MMU should contain both instructions
and data, leaving it to the control system to work out which was which [it is
axiomatic within the software industry that all programmers will have
unwittingly tried executing data at least once in their careers - and it fails
every time]. The data itself can be (a) derived from input, (b) set up as a
constant when a program is loaded, (c) created as the intermediate result of a
calculation, or (d) created as output records in readiness for output. In fact,
it helps to imagine the MMU as being divided into a "program area"
(for the instructions), a general purpose "data area" (for the
constants and intermediate results), and an "input-output area"
(for the inputs after they have been read in, and for the outputs before they
have been written out), but these lines of demarcation are notional rather than
physical, that is to say, they are "executed in software", and apply
for the duration of a given run only. Communication between the CPU and its MMU
will therefore consist of an entirely logical (but at first sight apparently
chaotic) series of memory accesses, during which instructions are constantly
being retrieved from the program area and data words are constantly being
retrieved and restored to the data and input-output areas. This scattershot
jumping around soon attracted the descriptor "random access", and by
the mid-1950s it had become common to refer generically to Main Memory as Random
Access Memory (RAM). The term Working Memory (WM) came into use
around the same time (see Part 5) to refer
more exclusively to the general purpose data area characterised
above. The basic physical components of RAM are the same sort of
flip-flops as are used in registers, but in much greater numbers, and organised both logically and physically into large arrays.
Following their perfection in 1953, these arrays were usually built up from the
sort of ferrite toroid matrices invented by Wang and developed by the Whirlwind
team.
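The notional, software-only subdivision of Main Memory described above can be sketched as one flat array with conventional boundaries. All the addresses, area sizes, and stored values below are invented for illustration.

```python
# Main Memory as one undifferentiated store, with purely notional areas.
MEMORY_SIZE = 256
memory = [0] * MEMORY_SIZE

PROGRAM_AREA = range(0, 64)     # instructions
DATA_AREA = range(64, 192)      # constants and intermediate results
IO_AREA = range(192, 256)       # input and output records

# Nothing physical enforces these boundaries: instructions and data share
# the same store, and only the running program keeps them apart.
memory[0] = 0b00111_0000001000  # an instruction word (5-bit op code, 10-bit address)
memory[70] = 42                 # a data word in the data area

print(memory[0] >> 10, memory[70])   # -> 7 42
```

This is also why the "executing data by mistake" failure mentioned above is possible at all: a data word read from the wrong area looks, to the hardware, exactly like an instruction.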
Here is a diagram of the CPU and
MMU, showing how the components described so far fit together:
Figure 1 - The Central Processing and Main Memory
Units of a GPC: This is how the CPU
and MMU elements described so far are assembled. The Control Unit is shown
top left, the Arithmetic/Logic Unit bottom left, and the RAM matrix -
suitably subdivided - is shown to the right. The relative size of the
program, record, and data areas depends upon the precise structure of the
program in question. If data tables need to be processed, this can result in
a large data area. Only six CPU-registers - the Instruction Register
(highlighted in yellow), the Program Counter (gold), three General
Registers (pink), and the Accumulator (light blue) - have been
shown at this stage. The machine is presumed to work on single-operand logic,
and so the Instruction Register is shown as containing the <opcode>
[<operand1>] of the currently executing instruction. It has already
made the op code available to the Control Unit (red arrow, top centre). The
Control Unit then passes the appropriate control signals to the
Arithmetic/Logic Unit (red arrow, left centre), instructing the latter's
logic circuits what logical operation(s) to perform. At the same time, the
Arithmetic/Logic Unit needs access to the operand (pink arrow). When
performing arithmetic on operands, the Arithmetic/Logic Unit needs to know
whether it is dealing with an immediate value, where the relevant data
value is already (ie. "immediately")
available, or with the address of a data word, where the
relevant value must come either from a specified register (mid-blue arrow) or
direct from an address in the Main Memory Unit (light green two-way arrow).
The end result can be sent (a) back into the registers (along the mid-green
arrow), or (b) back to an address in the Main Memory Unit (light green
two-way arrow).
Copyright © 2003, Derek J. Smith.
4 - The FETCH and EXECUTE Cycle
The FETCH and EXECUTE cycle is the
mainspring of the entire Eckert-von Neumann architecture. It is the master
control sequence for the cycle of lesser electronic operations by which a
single machine instruction is executed, and as with all good cyclical processes
it returns every one of its elements to a state of readiness to begin again.
Figure 2 shows how the CPU and MMU modules interact cooperatively during the
fetch and execute cycle.
Figure 2 - The FETCH and EXECUTE Cycle of a GPC: This is how the CPU and the MMU work together to produce a functioning GPC. The stages of the FETCH and EXECUTE cycle are as now enumerated, and may be cross-referenced by number to the actions indicated on the diagram (which, note, run roughly counter-clockwise). FETCH actions are highlighted in red, and EXECUTE actions in blue. All actions are initiated by the Control Unit, and driven by the clock rate pulsing of the system. Note how the Program Counter points to the instruction after the currently executing instruction from early on in the cycle.
This is what then happens in the FETCH phase of the cycle [simply follow the numbered RED arrows around the diagram] .....
1. The Control Unit notes the contents of the Program Counter, and .....
2. ..... the instruction thereby addressed is read from that address in Main Memory, written onto the bus, and stored, upon arrival, in the Instruction Register.
3. The Program Counter is incremented by one, so that it henceforth addresses the instruction one word further up in Main Memory than the one just obtained (that is to say, the address of the next instruction to be executed).
4. The op code element of the instruction is passed to the Control Unit for interpretation [should it, due to an error having occurred, fall outside the list of values allowed by the instruction set, then the Control Unit will be unable to interpret it, and processing will fail].
5. An instruction-specific set of control signals is generated by the Control Unit, and .....
Things are now ready for the EXECUTE phase of the cycle [simply follow the numbered BLUE arrows around the diagram] .....
6. ..... then transmitted to the Arithmetic/Logic Unit.
7. If the op code specifies an immediate value as input, then that is copied into the Arithmetic/Logic Unit directly from the Operand #1 position in the Instruction Register.
8. Alternatively, if the op code specifies a Main Memory address as input, then the word at that address is copied into the Arithmetic/Logic Unit via the data bus.
9. Alternatively, if the op code specifies a register as input, then the word in that register is copied directly across into the Arithmetic/Logic Unit.
10. The essential bit manipulation (add, shift, or whatever) takes place, giving an arithmetical or logical result (the sum of two numbers, say, or the setting of a true/false flag bit). As this takes place, it may be necessary .....
11. ..... to store intermediate results in the Accumulator while the processing runs to end.
12. If the op code specifies a Main Memory address as output, then the result is copied into it from the Arithmetic/Logic Unit via the data bus.
13. Alternatively, if the op code specifies a register as output, then the result is copied directly across to it.
Copyright © 2003, Derek J. Smith.
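The FETCH and EXECUTE cycle lends itself to a compact simulation. The sketch below models a hypothetical single-operand machine: the four op codes (LOAD, ADD, STORE, HALT), the memory layout, and the data values are all invented for the example, but the loop structure - fetch, increment the Program Counter, decode, execute - follows the numbered steps above.

```python
# A minimal fetch-and-execute loop for a hypothetical single-operand machine.
LOAD, ADD, STORE, HALT = 0, 1, 2, 3

# Program area (addresses 0-3): load the word at address 8, add the word
# at address 9, store the result at address 10, then stop.
memory = [(LOAD, 8), (ADD, 9), (STORE, 10), (HALT, 0),
          0, 0, 0, 0,          # unused
          20, 22, 0]           # data area: addresses 8, 9, 10

program_counter = 0
accumulator = 0
running = True

while running:
    # FETCH: read the instruction addressed by the Program Counter ...
    instruction_register = memory[program_counter]
    # ... and increment the Program Counter straight away, so that it
    # already points at the *next* instruction (step 3 above).
    program_counter += 1
    opcode, operand = instruction_register

    # EXECUTE: the Control Unit's decoded "control signals", rendered
    # here as simple branches.
    if opcode == LOAD:
        accumulator = memory[operand]
    elif opcode == ADD:
        accumulator += memory[operand]
    elif opcode == STORE:
        memory[operand] = accumulator
    elif opcode == HALT:
        running = False

print(memory[10])   # -> 42
```

Note how naturally the stored-program principle falls out of the sketch: program and data sit in the same `memory` list, and only the Program Counter's position distinguishes them.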
5 - Input/Output Devices
These, as their name suggests, are
the electronic brain's sensory and motor systems (Babbage's Module #3). Such devices
are often referred to collectively as "peripherals", and the
process of getting data from them (ie. "input") or passing data to them (ie. "output") is
often abbreviated to "I/O". It is also convenient under this
heading to introduce the topic of "backing store". This is the
generic term for high capacity permanent memory devices such as magnetic tapes
and disks. These are memory resources which spend most of their time "off
line", until the programs or data they contain are again required.
Key Concept - Backing Store: The
traditional weakness with RAM is that it is "volatile". If the power
fails, then the flip-flops fail with it, and their contents are irrevocably
lost. This would be fatal to the type of "brought-forward/carried
forward" processing common in business batch computing, also to databases
and the long processing runs often required in scientific computing
applications. Instead of risking expensive data loss, therefore, data are
written out to a more permanent form of storage at regular intervals.
Here are the commonest types of I/O
device:
Input Devices: These are mechanisms by which
programs and/or data are loaded into the machine. In the generation zero
machines, these were typically an operator switching board, and both paper tape
and card readers. Improvements during the 1950s gradually added magnetic tape
and magnetic disk drives.
Output Devices: These are mechanisms by which
data are taken out of the machine. In the generation zero machines, these were
typically an operator lighting panel and/or display screen, printer, and both
paper tape and card perforators. Improvements during the 1950s gradually added
magnetic tape and magnetic disk drives.
Now the point about I/O devices is
that they, too, have to be told what to do and when to do it. In fact, the
design of the CPU would be impossibly complex were it not for the fact that the
fine and final control of each peripheral is traditionally dealt with by a separate
"Device Controller (DC)". When the program
issues a READ or WRITE instruction, the Control Unit directs its control
signals not to the Arithmetic/Logic Unit but to the appropriate subordinate DC
via the bus (see next section). We showed this as the leftmost red arrow in
Figure 1, and gave an example of just such an instruction in Part 3, in the
section entitled The Cambridge University EDSAC. It read
.....
O n = Print the character now set up on the
teletypewriter and replace it with the character represented by the five most
significant digits in storage location n.
I/O operations are actually quite
significant tasks in their own right, culminating as they must in the physical
activation of the appropriate external device. This is where highly miniaturised logic circuits suddenly have to generate the
much larger voltages and currents required to activate physical mechanisms (and
even something as small as a disk read-write head is "heavy" by
comparison with the electronics we have been talking about so far).
6 - The Wiring Loom, or "Bus"
The "bus" is the
wiring loom which connects all the aforementioned components together
(Babbage's Module #4). The speed of address, data, or control flow within the
bus depends primarily upon whether the machine is built to a "serial"
or a "parallel" architecture. In a serial bus, the individual
bits are transmitted from point to point one at a time along a single
wire. In a parallel bus, however, several wires run side by side, and each
wire takes one bit. This allows entire bit strings to arrive at their
destination simultaneously. In fact, all but the earliest computers were
parallel. Figure 3 shows the final Eckert-von Neumann architecture, ca 1950
vintage, expressed diagrammatically.
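The serial/parallel distinction reduces to a simple count of clock pulses per word. The figures below are schematic - an 8-bit word and an 8-wire bus chosen for illustration - not measurements of any real machine.

```python
# Serial versus parallel transmission, counted in clock pulses.
word = [1, 0, 1, 1, 0, 0, 1, 0]   # an 8-bit word to transmit

serial_pulses = len(word)          # one wire: one bit per pulse
parallel_pulses = 1                # eight wires: the whole word in one pulse

print(serial_pulses, parallel_pulses)   # -> 8 1
```

For a fixed clock rate, the parallel bus therefore moves words roughly eight times faster, at the cost of eight times the wiring - which is why parallelism won out as soon as the engineering allowed.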
Figure 3 - The Fully Operational "Generation
Zero" GPC, ca 1950: Here are
the components identified in Figures 1 and 2, now supported by a five-hole paper
tape reader, a five-hole paper tape perforator, a printer, and an operating
console. Bus interconnections are shown in red. Note how the
CPU-registers are "closer" to the ALU than is the main body of
primary memory, making them the fastest form of memory available.
Copyright © 2003, Derek J. Smith.
References
See Main Menu File.