

# **SYST 27198**

## **CPU Architecture and Assembly**

**Dr. Rachel Jiang**

Sheridan Institute of Technologies and  
Advanced Learning

Winter 2022



# Acknowledgement

These notes are based on  
Professor Victor Ralevich's  
course notes for "**Structured  
Computer Organization**"

**In memory of** Victor Ralevich (1952 - 2016)

Updated by Rachel Jiang

**Jan. 2017**

# **Chapter One**

# **Processor Organization**

# Learning objectives

- This lecture answers the following questions:
  - What is **the von Neumann architecture**?
    - Who was von Neumann, and when **was the architecture invented?**
    - **What components make up the architecture?**
  - Has the **von Neumann** architecture changed since it was introduced?
    - The evolution of computer architecture
    - Trends in computer architecture (if time allows)
  - Which computers are built on the **von Neumann model (architecture)?**

# The von Neumann Model

- Most computers use the stored-program concept designed by mathematician John von Neumann (1946)



- The user **stores** programs and data in a **slow-to-access storage medium** (such as a hard disk) and **works on** them in a **fast-access, volatile storage medium (RAM)**.
  - The concept has a bottleneck:
    - it was designed to process instructions one after another rather than in parallel, which would be faster.

- As a 6-year-old, von Neumann
  - could divide two 8-digit numbers in his head
  - was fluent in Latin and ancient Greek
  - *was able to exchange jokes with his father in classical Greek*
- By the age of 8
  - he was familiar with differential and integral calculus.
- Get answers from ChatGPT: “Who is von Neumann?”

- A von Neumann architecture computer has five parts: an **arithmetic-logic unit (ALU)**, a **control unit (CU)**, a **memory**, some form of **input/output** and a **system bus** that provides a data path between these parts



- Very few computers have a **pure von Neumann** architecture
  - Most computers add another step to check for **interrupts**, electronic events that could occur at any time.
  - Interrupts let a computer do other things while it waits for events





Use of the 8259A interrupt controller

## A logical pinout of a generic CPU



# von Neumann Architecture computer

A von Neumann architecture computer performs (or emulates) the following sequence of steps:



1. Fetch the next instruction from memory at the address in the program counter.
2. Add 1 to the program counter.
3. Decode the instruction using the control unit.
4. The control unit commands the rest of the computer to perform (execute) some operation.
5. Store results if necessary
6. Go back to step 1.
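The six steps above can be sketched as a small simulator. This is only an illustration: the instruction format (an opcode plus one operand address), the opcode names `LOAD`, `ADD`, `STORE`, `HALT`, and the single accumulator register are hypothetical choices, not a real instruction set. Note that instructions and data sit in the same memory, which is the stored-program idea.

```python
# Toy stored-program machine: instructions and data share one memory list.
# The opcodes and the (opcode, address) format are hypothetical.

def run(memory):
    pc = 0    # program counter
    acc = 0   # a single accumulator register
    while True:
        instruction = memory[pc]         # 1. fetch the instruction at the PC
        pc += 1                          # 2. add 1 to the program counter
        opcode, address = instruction    # 3. decode
        if opcode == "LOAD":             # 4. execute
            acc = memory[address]
        elif opcode == "ADD":
            acc += memory[address]
        elif opcode == "STORE":          # 5. store the result if necessary
            memory[address] = acc
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        # 6. go back to step 1 (next loop iteration)


program = [
    ("LOAD", 4),    # address 0: acc = memory[4]
    ("ADD", 5),     # address 1: acc = acc + memory[5]
    ("STORE", 6),   # address 2: memory[6] = acc
    ("HALT", 0),    # address 3: stop
    2,              # address 4: first number
    3,              # address 5: second number
    0,              # address 6: the sum is stored here
]

result = run(list(program))
print(result[6])  # 5
```

A real CPU has no `HALT`-returns-to-caller convenience and usually also checks for interrupts once per loop iteration, as noted earlier.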

# Multilevel Machines

Multilevel computer hierarchy, from high level to low level:

- User Level: Application Programs
- High Level Languages
- Assembly Language / Machine Code
- Microprogrammed / Hardwired Control
- Functional Units (Memory, ALU, etc.)
- Logic Gates
- Transistors and Wires

# Levels of interpretation



High-level language program (C):

temp = v[k];  
v[k] = v[k+1];  
v[k+1] = temp;

Assembly language program (MIPS):

lw \$15, 0(\$2)  
lw \$16, 4(\$2)  
sw \$16, 0(\$2)  
sw \$15, 4(\$2)

Binary machine language program:

0000 1001 1100 0110 1010 1111 0101 1000  
1010 1111 0101 1000 0000 1001 1100 0110  
1100 0110 1010 1111 0101 1000 0000 1001  
0101 1000 0000 1001 1100 0110 1010 1111

Control signal specification:

ALUOP[0:3] <= InstReg[9:11] & MASK
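To see why the four load/store instructions implement the three-line swap, the sketch below replays them on a tiny simulated memory. It assumes that register `$2` holds the byte address of `v[k]` and that each array element is a 4-byte word; the dictionary-based memory and the function name are illustrative only.

```python
# Replays the lw/sw sequence on a simulated memory.
# Assumptions: $2 holds the address of v[k]; elements are 4-byte words.

def swap_with_loads_and_stores(memory, base):
    regs = {}
    regs[15] = memory[base + 0]   # lw $15, 0($2)  -> $15 = v[k]
    regs[16] = memory[base + 4]   # lw $16, 4($2)  -> $16 = v[k+1]
    memory[base + 0] = regs[16]   # sw $16, 0($2)  -> v[k]   = old v[k+1]
    memory[base + 4] = regs[15]   # sw $15, 4($2)  -> v[k+1] = old v[k]
    return memory


memory = {100: 7, 104: 9}         # v[k] at address 100, v[k+1] at address 104
swap_with_loads_and_stores(memory, 100)
print(memory)  # {100: 9, 104: 7}
```

The C version needs a `temp` variable; at the assembly level the two registers play that role.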

# System Bus Model (Refined von Neumann's)



- **Von Neumann computers**
  - spend a lot of time moving data to and from memory, which slows the computer down.
- **Engineers often separate the bus into two or more buses**
  - for example, one for instructions, another for data, and a third for memory addresses.

# ARM

## Architecture

- One of the simplest 32-bit microprocessors (the figures below match the early ARM2 design).
- ARM features a 32-bit data bus.
- It has sixteen 32-bit registers.
- Only about 30,000 transistors.
- Uses a 3-stage pipeline (fetch, decode, execute).
- There is no cache.


# Placement of Cache in a Computer System



# Speed improvement with Cache memory

- Assume:
  - Main memory access time (MAT) = 2.1 ns
  - Cache memory access time (CAT) = 0.1 ns
  - Hit ratio $H = 0.89$, so the miss ratio is $1 - H = 0.11$
- New average access time:
  $AT_{avg} = CAT + (1 - H) \times MAT = 0.1 + 0.11 \times 2.1 = 0.331\ \text{ns}$
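The calculation above can be checked with a few lines of code. This is a minimal sketch of the same model (every access pays the cache time, and a miss additionally pays the main memory time); the function and parameter names are illustrative.

```python
def average_access_time(cache_time, memory_time, hit_ratio):
    """Average access time = cache time + miss ratio * main memory time."""
    miss_ratio = 1.0 - hit_ratio
    return cache_time + miss_ratio * memory_time


# Values from the slide: CAT = 0.1 ns, MAT = 2.1 ns, hit ratio = 0.89
avg = average_access_time(cache_time=0.1, memory_time=2.1, hit_ratio=0.89)
print(round(avg, 3))  # 0.331
```

With a hit ratio of 1.0 the result equals the cache time, and with a hit ratio of 0 it equals the cache time plus the main memory time, which are useful sanity checks when working the exercise by hand.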

# Cache Exercise

- Assume the main memory access time (MAT) = 1.2 ns
- Cache access time (CAT) = 0.01 ns
- Hit ratio H = 0.91
- What is the average access time after adding cache memory?

# Typical Computer Components

- **Central Processing Unit (CPU)**
  - (ALU and Control Unit)
- **Memory (RAM and ROM)**
- **Input/Output (I/O)**
- **System bus**
  - by which the CPU communicates with memory and I/O devices

# System bus consists of three subbuses

- **Data Bus**
  - used to transport data for reads and writes
- **Address Bus**
  - used to transmit the address of the memory location or I/O port being accessed
- **Control Bus**
  - carries the control signals (such as memory read and memory write) that coordinate the use of the other buses
- **System Clock**
  - the clock generator for the processor, the buffer chips, and all other components
  - Different instructions require different numbers of clock cycles.

- The width of the **data bus** is usually used to classify a microprocessor
  - e.g., a 16-bit microprocessor has a 16-bit data bus
- The width of the data bus determines how much data the processor can read or write in one memory or I/O cycle
  - The width of the CPU's internal data bus often differs from that of its external data bus
    - For example, Pentium processors have a 64-bit external data bus but a 32-bit internal data bus
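As a rough illustration of how data bus width limits what moves in one cycle, the sketch below counts the bus cycles needed to transfer an operand. It assumes each cycle moves at most one bus-width of data and ignores alignment and wait states; the function name is illustrative.

```python
def bus_cycles(operand_bits, data_bus_bits):
    """Number of bus cycles needed to move one operand (ceiling division)."""
    return -(-operand_bits // data_bus_bits)


print(bus_cycles(16, 16))  # 1: a 16-bit word on a 16-bit data bus
print(bus_cycles(16, 8))   # 2: the same word on an 8-bit data bus
print(bus_cycles(64, 16))  # 4
```

The second case is why a processor with 16-bit registers but an 8-bit external data bus needs two consecutive cycles to read or write a 16-bit value.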



source: [www.teach-ict.com](http://www.teach-ict.com)



- **The address bus**
  - is used to identify the memory location or I/O device (port) the processor intends to communicate with.
  - Its width ranges across CPUs from 20 bits (Intel 8086 and 8088) to 36 bits (Pentium II and III) and more
- Each time the processor outputs an address to the address bus
  - it also activates one of four control bus signals:
    - **Memory read,**
    - **Memory write,**
    - **I/O read,**
    - **I/O write.**
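The width of the address bus fixes how many distinct locations the processor can name: n address lines give 2^n addresses. The sketch below assumes byte-addressable memory (one byte per address); the helper names are illustrative.

```python
def address_space_bytes(address_bits):
    """Number of distinct byte addresses reachable with this many address lines."""
    return 2 ** address_bits


def human_readable(num_bytes):
    """Format a power-of-two byte count using binary units (1 KB = 1024 B)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    index = 0
    while num_bytes >= 1024 and index < len(units) - 1:
        num_bytes //= 1024
        index += 1
    return f"{num_bytes} {units[index]}"


for bits in (16, 20, 24, 32, 36):
    print(bits, "address bits ->", human_readable(address_space_bytes(bits)))
# 16 address bits -> 64 KB
# 20 address bits -> 1 MB
# 24 address bits -> 16 MB
# 32 address bits -> 4 GB
# 36 address bits -> 64 GB
```

Each extra address line doubles the address space, so four more lines multiply it by 16.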





# Brief History

## Pascal's Calculating Machine

Performs basic arithmetic operations (early to mid 1642). Does not have what may be considered the basic parts of a computer.





# 1642

At age 19, Blaise Pascal invents the first calculator: the pascaline.



# 1832

Charles Babbage invents the Analytical Engine.



# Analytical Engine

- The Analytical Engine was intended to use loops to control an automatic calculator, which could make decisions based on the results of previous computations.
  - This machine was also intended to employ several features subsequently used in modern computers, including sequential control, branching, and looping.
- **Charles Babbage (UK)** is also known as the “Father of Computing”.

# 1832

## Babbage's machine



1832



1991



**Ada Augusta Byron, Countess of Lovelace, is the first to design programs using punched cards for Babbage's machine.**

# 1752

Benjamin Franklin's kite experiment  
shows that lightning is electrical.



# 1804

**Joseph-Marie Jacquard builds the first automatic weaving loom programmed with punched cards.**



# 1854



**George Boole (UK)**  
creates the algebra  
that bears his name  
today (Boolean algebra).

| AND   | TRUE  | FALSE |
|-------|-------|-------|
| TRUE  | TRUE  | FALSE |
| FALSE | FALSE | FALSE |

| OR    | TRUE | FALSE |
|-------|------|-------|
| TRUE  | TRUE | TRUE  |
| FALSE | TRUE | FALSE |
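The two tables above can be regenerated directly from the Boolean operators, which is a quick way to check them. This is a small sketch using Python's built-in `and` and `or`; the dictionary names are illustrative.

```python
from itertools import product

# Build both truth tables for every combination of two Boolean inputs.
and_table = {(a, b): (a and b) for a, b in product([True, False], repeat=2)}
or_table = {(a, b): (a or b) for a, b in product([True, False], repeat=2)}

for (a, b), value in and_table.items():
    print(f"{a} AND {b} = {value}")
for (a, b), value in or_table.items():
    print(f"{a} OR {b} = {value}")
```

AND is true only when both inputs are true; OR is false only when both inputs are false.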

# 1937



Claude Shannon publishes *A Symbolic Analysis of Relay and Switching Circuits*, in which he shows that Boole's algebra can be applied to electrical circuits.

# 1937



**Alan Mathison Turing (UK)** defines the notion of an algorithm and introduces the concept of a universal machine, now known as the “Turing machine”.

# 1939-1945

During WWII, Germany used the Enigma machine to encode its transmissions.



# 1943

A team working for the British Government Code and Cypher School builds a machine to help decode intercepted messages. It was called **Colossus**.



# Colossus

- electronic digital computer, built during WWII.
- used to break the codes of the German Lorenz SZ-40 cipher machine
  - used by the German High Command
- used thermionic valves (vacuum tubes) to perform Boolean and counting operations.
  - is regarded as the world's first programmable, electronic, digital computer
  - programmed by switches and plugs and not by a stored program.

<https://en.wikipedia.org/wiki/Colossus_computer>

# Colossus

- Alan Turing's use of probability in cryptanalysis contributed to its design
- The use of these machines allowed the Allies to obtain a vast amount of high-level military intelligence from intercepted radiotelegraphy messages between the German High Command (OKW) and their army commands throughout occupied Europe.

# Banburismus

- a cryptanalytic process developed by Alan Turing at Bletchley Park in Britain during the Second World War. It was used by Bletchley Park's Hut 8 to help break German Kriegsmarine (naval) messages enciphered on Enigma machines
  - <https://en.wikipedia.org/wiki/Banburismus>

# Milestones in Computer Architecture History

| Year | Name              | Made by         | Comments                                |
|------|-------------------|-----------------|-----------------------------------------|
| 1834 | Analytical engine | C.Babbage       | First attempt to build digital computer |
| 1936 | Z1                | Zuse            | First working relay calc machine        |
| 1943 | COLOSSUS          | Brit.Gov.       | First electronic computing machine      |
| 1944 | MARK I            | Aiken           | First American computer                 |
| 1946 | ENIAC I           | Eckert/Mauchly  | First modern type computer              |
| 1949 | EDSAC             | Wilkes          | First stored-program computer           |
| 1951 | Whirlwind I       | M.I.T.          | First real-time computer                |
| 1952 | IAS               | von Neumann     | Modern computer design                  |
| 1960 | PDP-1             | DEC             | First minicomputer                      |
| 1964 | 360               | IBM             | First computer product line             |
| 1964 | 6600              | CDC             | First scientific supercomputer          |
| 1970 | PDP-11            | DEC             | Dominant minicomputer in the 1970s      |
| 1974 | 8080              | Intel           | First general-purpose 8-bit CPU chip    |
| 1974 | CRAY-1            | Cray            | First vector supercomputer              |
| 1978 | VAX               | DEC             | First 32-bit superminicomputer          |
| 1981 | IBM PC            | IBM             | Started the modern PC era               |
| 1985 | MIPS              | MIPS            | First commercial RISC computer          |
| 1987 | SPARC             | Sun             | First SPARC-based RISC workstation      |
| 1990 | RS6000            | IBM             | First superscalar computer              |

# Computer Generations

## The Zeroth Generation – Mechanical Computers (1642 – 1945)



Charles Babbage designed the first computer, the Difference Engine, starting in 1823.

## The First Generation – Vacuum Tubes (1945 – 1955)



The ENIAC weighed 30 tons!

It was arranged in a U shape about 6 meters wide by 12 meters long.





Replacing a bad tube meant checking among ENIAC's 19,000 possibilities.

# Energy consumption: 200 kW!



It is said that the lights of the whole city of Philadelphia dimmed when it was turned on!

2023-09-05


# 1947

The first bug! Found on Sep 9<sup>th</sup> at 15:45 by Grace Hopper, then working on the Mark II computer at Harvard University.

| Time | Logbook entry (9/9)                                                                                              |
|------|------------------------------------------------------------------------------------------------------------------|
| 0800 | Anton started                                                                                                    |
| 1000 | - stopped - anton ✓                                                                                              |
| 1300 | MP - MC<br>(033) PRO ✓<br>correct                                                                                |
|      | 1.2700 9.037 847 025<br>1.1681747000 9.037 846 995 correct<br>2.130476415(-2) 4.615 925 059(-2)<br>2.130676415   |
|      | Relays 6-2 in 033 failed special speed test<br>in relay 11,000 test.                                             |
| 1100 | Started Cosine Tape (Sine check)                                                                                 |
| 1525 | Started Multi Adder Test.                                                                                        |
| 1545 | Relay #70 Panel F<br>(Moth) in relay.                                                                            |
| 1600 | First actual case of bug being found.<br>Anton started.                                                          |
| 1700 | closed down.                                                                                                     |


Grace Hopper was the highest-ranking woman in the Navy of her time (Rear Admiral) and a role model to thousands of young women. The bug now resides at the National Museum of American History in Washington, DC.



## ENIAC I ('Programming')

ENIAC (Electronic Numerical Integrator and Computer) was the first operational general-purpose machine built using vacuum tubes.



IBM 604 tubes and computer (1948)



## The Second Generation – Transistors (1955 – 1965)



# Transistor Operation of Inverter



# Assignments of 0 and 1 to Voltages



(At the output)



(At the input)





# Transistor AND and OR Gates



AND GATE



OR GATE



## The Third Generation – Integrated Circuits (1965 – 1980)

From Computer Desktop Encyclopedia  
Reproduced with permission.  
© 2000 Texas Instruments, Inc.






## The Fourth Generation – Very Large Scale Integration (VLSI) (1980 – )





# Moore's Law

Gordon Moore, cofounder and Chairman of Intel

- **Moore's Law (1965), in its popular form: computing power doubles roughly every 18 months for the same price.**



## Microprocessor Transistor Counts 1971-2011 & Moore's Law



<https://en.wikipedia.org/wiki/Moore's_law#/media/File:Transistor_Count_and_Moore%27s_Law_-_2011.svg>

# Moore's Law

- In 1965, Moore observed that the number of components (transistors, resistors, diodes, or capacitors) in a dense integrated circuit had doubled approximately every year.
- In 1975, he revised the forecast rate to approximately every two years.
  - The prediction has become a target for miniaturization in the semiconductor industry and has had widespread impact in many areas of technological change.
    - <https://en.wikipedia.org/wiki/Gordon_Moore>
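A doubling rule like this is easy to turn into a projection. The sketch below assumes a fixed doubling period (two years by default, following the 1975 revision) and uses a starting point of about 2,300 transistors for the Intel 4004 in 1971; that transistor count is a commonly cited figure and is not taken from these notes. The function name is illustrative.

```python
def projected_count(start_count, start_year, year, doubling_period_years=2.0):
    """Project a component count assuming it doubles every fixed period."""
    doublings = (year - start_year) / doubling_period_years
    return start_count * 2 ** doublings


# Assumed starting point: roughly 2,300 transistors (Intel 4004, 1971).
for year in (1971, 1981, 1991, 2001, 2011):
    print(year, round(projected_count(2300, 1971, year)))
```

Ten years at one doubling every two years is five doublings, a factor of 32; changing `doubling_period_years` to 1.5 shows how much faster the popular 18-month version grows.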

# Intel Series of Processors

| Processor   | Date | Data Register Width | Address Bus | Data Bus  | Address Space | Clock Frequency |
|-------------|------|---------------------|-------------|-----------|---------------|-----------------|
| 4004        | 1971 | 4 bits              | 4 bits      | 4 bits    | 4 KB          | 740 kHz         |
| 4040        | 1972 | 4 bits              | 4 bits      | 4 bits    | 8 KB          | 740 kHz         |
| 8008        | 1972 | 8 bits              | 8 bits      | 8 bits    | 16 KB         | 500/800 kHz     |
| 8080        | 1974 | 8 bits              | 16 bits     | 8 bits    | 64 KB         | 2 MHz           |
| 8086        | 1978 | 16 bits             | 20 bits     | 16 bits   | 1 MB          | 5-10 MHz        |
| 8088        | 1979 | 16 bits             | 20 bits     | 8/16 bits | 1 MB          | 5-10 MHz        |
| 80186       | 1982 | 16 bits             | 20 bits     | 16 bits   | 1 MB          | 10 MHz          |
| 80286       | 1982 | 16 bits             | 24 bits     | 16 bits   | 16 MB         | 8-12 MHz        |
| 80386       | 1985 | 32 bits             | 32 bits     | 32 bits   | 4 GB          | 16-33 MHz       |
| 80486       | 1989 | 32 bits             | 32 bits     | 32 bits   | 4 GB          | 25-100 MHz      |
| Pentium     | 1993 | 32 bits             | 32 bits     | 64 bits   | 4 GB          | 60-233 MHz      |
| Pentium Pro | 1995 | 32 bits             | 36 bits     | 64 bits   | 64 GB         | 150-200 MHz     |
| Pentium II  | 1997 | 32 bits             | 36 bits     | 64 bits   | 64 GB         | 233-450 MHz     |
| Pentium III | 1999 | 32 bits             | 36 bits     | 64 bits   | 64 GB         | 450 MHz – 1 GHz |
| Pentium 4   | 2000 | 32 bits             | 36 bits     | 64 bits   | 64 GB         | 1.3 – 2.8 GHz   |

# 4004/4040

- The first microprocessor in history, Intel 4004 was a 4-bit CPU designed for usage in calculators, or, as we say now, designed for "embedded applications".
- Clocked at 740 kHz, the 4004 executed up to 92,000 single-word instructions per second and could access 4 KB of program memory and 640 bytes of RAM.

# 4004/4040 cont'd

- The Intel 4004 was not very suitable for microcomputer use because of its limited architecture. It lacked interrupt support, had only a 3-level-deep stack, and used a complicated method of accessing RAM.

- Intel 4040 is an enhanced version of Intel 4004 microprocessor. Running at the same speed as the 4004 and being fully object compatible with it, the 4040 doubled maximum program memory, increased stack size, added an extra bank of eight 4-bit registers and included interrupt support.



Intel C4004



Intel P4040

# 8008/8080

- The first 8-bit integrated microprocessor
  - Intel 8008 was released in 1972, only 5 months after Intel 4004 microprocessor.
  - The 8008 was available in two speed grades, 500 kHz and 800 kHz, but even the faster version (8008-1) ran a bit slower, in instructions per second, than the 4004.
    - Nevertheless, overall performance of the 8008 was greater due to 8-bit architecture



Intel D8008-1

- General-purpose CPU model 8080 was introduced in 1974.
- The Intel 8080 was source-code compatible with the 8008. Interrupt processing logic did not change from the 8008.

- The 8080 processor included many enhancements:
  - Maximum memory size was expanded to 64 KB.
  - Stack size was no longer limited to 7 levels.
  - The number of I/O ports was increased to 256.
  - The processor included many new instructions and direct addressing mode.



# 8086/ 8088

- In 1978/9 Intel introduced the 8086 and 8088 microprocessors as extensions to the 8080 product line.
  - The 8086 and 8088 were binary compatible, but not pin-compatible.
  - Both chips implemented a CISC design methodology.
- x86 has gone through six generations and become the most successful processor family in history.
  - Clone industry: AMD, Cyrix, IBM, TI, UMC, Siemens, NEC, Harris, and others.

- Both have 20 address pins.
- The 8086 has a 16-bit data bus; the 8088 has only an 8-bit external data bus.
- The Intel 8088 is almost identical to the Intel 8086 except for the external data bus.
  - The external data bus width of the 8088 was reduced to 8 bits, and the instruction queue size and prefetching algorithms were changed.

- The Intel 8088 used two consecutive bus cycles to read or write 16-bit data, instead of one cycle for the 8086.
- This made the processor run slower, but on the plus side the hardware changes in the 8088 made it compatible with 8080/8085 peripherals.
- The architecture of the 8088 stayed the same as the 8086: 16-bit registers, a 16-bit internal data bus, and a 20-bit address bus, which allowed the processor to address up to 1 MB of memory.

- The 8088 had the same segmented memory addressing as the 8086: the processor could address 64 KB of memory directly, and to address more than 64 KB one of the special segment registers had to be updated.
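The slides do not spell out how a segment register and a 16-bit offset combine into a 20-bit address. On the 8086/8088 in real mode, the standard rule is that the segment value is shifted left by 4 bits (multiplied by 16) and added to the offset. The sketch below illustrates that rule; the function name is illustrative.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shift the segment left by 4 bits, add the offset, and keep 20 bits,
    # because the address bus has only 20 lines.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0x1000, 0xFFFF)))  # 0x1ffff, last byte of that 64 KB segment
```

With the segment register fixed, varying the offset from 0x0000 to 0xFFFF covers exactly 64 KB, which is why reaching memory beyond that window requires loading a different segment value.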



# **80186/ 80188**

- New instructions, new fault tolerance protection.
- They are still used in disk drives and disk controllers.
- Program, data, and stack memories occupy the same memory space.
- The total addressable memory size is 1 MB.
  - Because most of the processor's instructions use 16-bit pointers, the processor can effectively address only 64 KB of memory at a time.

- To access memory outside of 64 KB, the CPU uses special segment registers to specify where the code, stack, and data 64 KB segments are positioned within the 1 MB of memory.



# 80286

- In 1982 Intel produced the 80286 chip. There were some significant extensions: extended instruction set, four more address lines and a new operating mode called “protected mode”.
- The new 16-bit protected mode allowed the processor to access 16 MB of memory.
- To set up protected mode, new instructions and registers were added to the 80286.

# 80286

- Real mode was still limited to the one-megabyte program addressing of the 8086. Protected mode could not run real-mode (DOS) programs.

- Execution time of many instructions was reduced.
- The 80286 microprocessor was produced at speeds ranging from 4 MHz to 25 MHz.



# 80386

- In 1985, Intel introduced the 80386. Protected mode was enhanced. Address bus was extended to 32 bits allowing addressing of up to 4 GB of memory.
- Another new operating mode was introduced to allow DOS programs to execute within the protected mode environment.
- It was not used in any computer system for some time after its introduction. Compaq was the first to use it in PCs.

- The 80386 SX was introduced in 1986, and the 80386 was renamed the 80386 DX.
  - The SX was a cost-reduced 80386 with a 16-bit data bus and a 24-bit address bus.
  - SX and DX were otherwise software compatible with each other.
- Intel also introduced 80376 as part of the 80386 family. The 80376 was an 80386 SX that exclusively ran in protected mode.
  - The memory cache was introduced for the first time.



Intel Ng80386 SX



## Growth of an address bus over time

# 80486

- Integration of the 80387 math-coprocessor into the 80486 core logic.
  - Intel introduced 80486 SX and 80486 DX.
  - They were not pin-compatible, nor 100% software compatible with each other.
- The 80486 SX had the same full data bus and address bus as the 80486 DX, but the math coprocessor logic was removed. Instead, the 80487 SX “math coprocessor” for the 80486 SX was introduced.

- The 80487 SX was a full 80486 DX with two pins relocated.
- The 80486 DX2 and 80486 DX4 doubled and tripled the core clock frequency, respectively.



- **Pentium**
  - Departure from all past x86 processors.
  - Lots of enhancements and new instructions.
  - MMX – Multi-Media eXtensions
- **Pentium PRO**
  - Introduced in November 1995 as Intel's 6th generation x86 design.
  - Minor design enhancements, four more address lines (could access up to 64 GB of main memory), and a large 2nd level cache.

- **Pentium II**
  - Pentium II is a Pentium Pro with MMX extensions. The new package is called “Slot 1”.
  - Pentium II can address up to 64 GB of memory, but has cache limitations preventing memory use above 512 MB.
- **Celeron**
  - Celeron is a stripped-down Pentium II, offered without any 2nd-level cache.

## Pentium III

- Performance: Clock speeds of up to 1 GHz.
- New Features: 70 new instructions.
- Manageability: A new processor serial number feature can enhance system security and asset tracking.
- 512 KB second-level cache.
- Level 2 cache runs at the full processor speed.

## Pentium 4

- The Pentium 4 processor family supporting Hyper-Threading Technology delivers Intel's advanced, powerful processors for desktop PCs and entry-level workstations, which are based on the Intel NetBurst® microarchitecture.
- The Pentium 4 processor 6xx and 3.73 GHz Extreme Edition provide flexibility for operating systems and future software with Intel Extended Memory 64 Technology (Intel® EM64T) support for 64-bit computing.

# XEON



# SPARC Processors

- 1982 – Sun Microsystems Founded
- SUN (Stanford University Network)
- SUN 1, SUN 2 and SUN 3 – used Motorola 68020 CPU, were equipped with an Ethernet connection and TCP/IP software.
- 1987 – SUN – 4 with newly designed CPU called SPARC (**S**calable **P**rocessor **A**rchitecture), based on RISC II architecture.

- MicroSPARC, HyperSPARC, SuperSPARC, and TurboSPARC CPUs, produced by different vendors, were all binary compatible and ran the same user programs without modification.
- The initial SPARC was a full 32-bit machine, running at 36 MHz, with only 55 basic instructions and an additional 14 for the floating-point unit.
- In 1995 the UltraSPARC I workstation was introduced as the first workstation using a SPARC V9 CPU, with a full 64-bit architecture, 64-bit addresses, and 64-bit registers.
- The address space for UltraSPARC I, II and III processors is up to 2 TB.

# Topics discussed:

- What is **the von Neumann** model/architecture?
- Who was von Neumann?
- When was the von Neumann **architecture** invented?
- Has the **von Neumann** architecture changed since it was introduced?
  - The evolution of computer architecture
  - Trends in computer architecture (if time allows)
- Which computers are built on **the von Neumann model**?

# Questions?

- ???

# In-class exercise

- Run the assembly language program in emu8086 (see lab2 part 1)
  - The program adds two numbers stored in two variables named x and y.