

Task Takis  
for Lab  
Manual

### Week 1: Digital Hardware Review

- pages 1-5
- early release!

### Week 2: Digital Hardware Review, page 6

#### Digital Systems Review

- "combinational logic"
- pages 1-5

#### LAB:

- VHDL Tutorial, pages 1-7 → emphasize structural implementation
- early release
- introduce software (Web Edition Xilinx Simulator for Ubuntu) → get details from Takis
- ask them to Start Lab 1: Full Adder

### Week 3:

#### Digital Logic Design Simplifying Strategies

- $(A+A)B$
- multi-level combinational logic
- bubble pushing
- pages 6, 7
- begin (officially) Lab 1

### Week 4

#### Digital Systems Cont'd

- floating value (tri-state logic)
- K-Maps
- multiplexors, multiplexer logic
- pages 8-11
- Lab 1 turned in, Lab 2 begin

Work 4 be  
accept  
root this  
time

Week 5Timing Issues and Sequential Logic

- $t_{pd}$ ,  $t_{cd}$
- g latches
- SR - Latch
- D - Latch
- D - Flip Flop
- pages 12 - 14

LAB: Lab 2 continues  
 give 2 weeks → Lab 3  
 - cover implementation  
 of registers in VHDL

Week 6Sequential / Synchronous Logic Design.

- race conditions, critical paths
- key attributes
- pages 15 - 16

- intro to Finite State Machines
- pages 1 - 4

LAB: Lab 3 begins, Lab 2 turned in

Week 7:Finite State Machines Cont'd

- implementation of FSM (page 5)
- Moore vs. Mealy
- Robot Snail Examples, pages 6 - 8

LAB: Lab 3 continues

(3)

Week 8: • introduce embedded hardware  
kits

→ Beaglebone Kit

- device tree
- GPIOs
- device tree overlays
- ask Takis for GPIO Lab

LAB: Lab 3 is turned in, start  
GPIO Lab on Beaglebone

Week 9: • Continue with Embedded Linux  
hardware

= ADC accessing on Beaglebone using sysfs

LAB: Finish up GPIO, start  
triple axis accelerometer lab,  
ask Takis for ADC Lab

Week 10: • Stored Program Computer Architectures

→ page 9, 10 TEST #1 in Lecture (10%)  
— Methods of Integration: pages 10, 11

Lab: finish ADC Lab, start "Lab 4"

Week 11:

Memories

• Overview: arrays, bit cells, DRAM, SRAM

• ALU (page 4) → pages 4, 5 so they can do Lab 5

Lab: turn in Lab 4, begin Lab 5

Week 12:Memories Cont'd

- register file, ROM; pages 5-8
- Programmed I/O

- LAB

- turn in 5 - Pages 1-3

Start Lab 6

Week 13:ARM 32-bit Micro Architecture & Architecture

- Intro - pages 5, 6
- Architecture (Assembly Language)
  - pages 7-9
- LAB: turn in Lab 6, start Lab 7

Week 14:ARM Architecture Cont'd

- pages 9 -
- programming 11 -
- LAB: finish up

Week 15:LAB/Review Session

- final test 15%

## SUBMISSIONS

- 4 page max (single sided) — two-column
- hardcopy

## PROCEDURES

- ask them to tidy up

# Lambton College

School of Technology, Energy &  
Apprenticeship

International

## Course Outline – Winter 2017

Course Code: ESE 2005

Course Title: Embedded Systems Architecture I

Prepared By: Jay Nadeau

Date: April 2016

Revised By: [Click here to enter name.](#)

Date: [Click here to enter a date.](#)

Approved By: [Click here to enter name.](#)

Prerequisite: ESE 1005, ESE 1014

Corequisite: [Click here to enter corequisites or type None.](#)

Prerequisite for: ESE 3005

## **1. Course Description**

Students are introduced to the Von Neumann and Harvard processor architectures. Students will recognize the differences between Microprocessors, Microcontrollers and System-on-Chip/System-on-Module, and the differences between Intel and ARM processors. Students are introduced to the building blocks of microcomputers and microprocessors such as memory, I/O, registers, the ALU and the control unit. The design of computer instruction sets and CPUs are reviewed with students designing systems that use peripherals such as real-time clocks, analog-to-digital converters, digital-to-analog converters and interfacing with GPIO pins. Students will write applications that use the ARM and Thumb instruction sets.

## **2. General Education and Essential Employability Skills**

This course provides the following provincial Essential Employability Skills:

- #1: Communication
- #2: Numeracy
- #3: Critical Thinking and Problem Solving
- #4: Information Management
- #5: Interpersonal
- #6: Personal

### **Is this course approved as a General Education course?**

No

Yes

Students should refer to their program's restricted General Education courses for final determination.

## **3. Learning Outcomes**

Upon successful completion of this course, the student will be able to:

1. Describe differences between Von Neumann and Harvard computer architectures.
2. List differences between microprocessors, microcontrollers and SoCs/SoMs.
3. Describe the different types of computer memory.
4. Explain the functions of the I/O bus, the ALU, the control unit and the register file within the CPU.
5. Discuss differences between microcode, machine language, assembly language and high-level languages.
6. Explain what a RISC computer is.
7. List differences between Intel and ARM processors.
8. Describe and design systems that use peripherals, GPIOs, buses, interrupts, DACs/ADCs and PWM signals.
9. Write an assembly language program using ARM and Thumb instruction sets.
10. Design a simple 8-bit Harvard architecture CPU using VHDL.

11. Design an instruction set for use on an 8-bit Harvard architecture CPU.

#### 4. Course Objectives

Learning  
Outcome  
Reference  
Number

- Unit 1 **Microcomputers, Microcontrollers and SoCs**
- 1.1 Describe the Von Neumann architecture.
  - 1.2 Describe the Harvard architecture.
  - 1.3 List benefits and drawbacks of both the Von Neumann and Harvard architectures.
  - 1.4 Explain differences between a microprocessor and a microcontroller.
  - 1.5 Describe the features of a SoC/SoM.

[1]  
[1]  
[1]  
[2]  
[2]

class begins around  
Chapter 6 of Harris  
Harris

#### Unit 2 **Computer Architecture**

- 2.1 Describe the differences between RAM and ROM.
- 2.2 Describe the differences between SRAM and DRAM.
- 2.3 Describe the differences between EPROM, EEPROM, Flash, Masked ROM and PROM. ~v1.2
- 2.4 Describe programmed I/O, interrupt I/O and direct memory access.
- 2.5 List the different CPU registers. 494-495
- 2.6 Describe the functions of the ALU.
- 2.7 Explain the main purpose of the control unit.
- 2.8 Describe the purpose of an instruction set.
- 2.9 Explain the differences between microcode and assembly language.
- 2.10 Explain the relationship between machine language and assembly language.
- 2.11 Explain the relationship between a high-level programming language and assembly language.
- 2.12 Explain what a reduced instruction set computer is and the philosophy behind a RISC.
- 2.13 List advantages and disadvantages of a RISC.
- 2.14 Describe the Intel x86 CPU architecture.
- 2.15 Describe the ARM CPU architecture.

[3]  
[3]  
[3]  
  
[4]  
  
[4]  
[4]  
[5]  
[5]  
  
[5]

Week 6

[5]  
[6]  
[6]  
[7]  
[7]

Week 7

] Chapter 5, 5.5

] Week 5

Commented [CP1]: Similar as comment made above for "the difference".

#### Unit 3 **Peripheral Interfacing**

- |      |                                                                                      |     |                                                                                                      |
|------|--------------------------------------------------------------------------------------|-----|------------------------------------------------------------------------------------------------------|
| 3.1  | <i>List some examples of common computer peripherals.</i>                            | [8] | <i>- Week 3</i><br> |
| 3.2  | <i>Explain what a GPIO is.</i>                                                       | [8] |                                                                                                      |
| 3.3  | <i>Explain how GPIOs are used.</i>                                                   | [8] |                                                                                                      |
| 3.4  | <i>List the different types of system buses.</i>                                     | [8] |                                                                                                      |
| 3.5  | <i>Explain what an interrupt is.</i>                                                 | [8] |                                                                                                      |
| 3.6  | <i>Explain the difference between a hardware interrupt and a software interrupt.</i> | [8] |                                                                                                      |
| 3.7  | <i>Write an interrupt service routine.</i>                                           | [8] |                                                                                                      |
| 3.8  | <i>Describe the function of DACs.</i>                                                | [8] |                                                                                                      |
| 3.9  | <i>Describe the function of ADCs.</i>                                                | [8] |                                                                                                      |
| 3.10 | <i>Explain what the sampling rate of an ADC is.</i>                                  | [8] |                                                                                                      |
| 3.11 | <i>Explain the term quantization with respect to ADC.</i>                            | [8] |                                                                                                      |
| 3.12 | <i>Explain what pulse width modulation is.</i>                                       | [8] |                                                                                                      |
| 3.13 | <i>Explain the term duty-cycle with respect to PWM.</i>                              | [8] |                                                                                                      |

#### **Unit 4 Programming**

- |     |                                                                                              |     |                                                                                                         |
|-----|----------------------------------------------------------------------------------------------|-----|---------------------------------------------------------------------------------------------------------|
| 4.1 | <i>Write an assembly language program that uses ARM instructions.</i>                        | [9] | <i>- Week 10</i><br> |
| 4.2 | <i>Write an assembly language program that uses Thumb instructions.</i>                      | [9] |                                                                                                         |
| 4.3 | <i>Explain the difference between the ARM instruction set and the Thumb instruction set.</i> | [9] |                                                                                                         |

#### **Unit 5 CPU and Instruction Set Design**

- |     |                                                                                                                     |      |                                                                                                                                 |
|-----|---------------------------------------------------------------------------------------------------------------------|------|---------------------------------------------------------------------------------------------------------------------------------|
| 5.1 | <i>Design and implement an ALU using VHDL.</i>                                                                      | [10] | <br><i>- Week 11</i><br><br><i>- Week 12</i> |
| 5.2 | <i>Design and implement a register file using VHDL.</i>                                                             | [10] |                                                                                                                                 |
| 5.3 | <i>Design and implement a control unit using VHDL.</i>                                                              | [10] |                                                                                                                                 |
| 5.4 | <i>Design a simple 8-bit Harvard architecture CPU using the previously designed ALU, register and control unit.</i> | [10] |                                                                                                                                 |
| 5.5 | <i>Design an instruction set that makes use of the simple 8-bit Harvard microprogrammed CPU.</i>                    | [11] |                                                                                                                                 |

### **5. Resources and Supplies**

#### **a. Required**

Digital Design and Computer Architecture: ARM Edition by Sarah Harris and David Harris; Morgan Kaufmann, (May 2016).

**b. Supplemental**

None

**6. Methodology**

*The course is designed to provide an emphasis on hands on experience via laboratory sessions and assignments. The labs will make use of a modern embedded IDE to implement laboratory programs on an ARM based processor.*

**7. Student Evaluation**

The following elements will determine the student's final grade:

| Evaluation Method                  | Weight      | Course Learning Outcome |
|------------------------------------|-------------|-------------------------|
| Tests (#1@10%, #2@15%)             | 25%         | 1-11                    |
| Assignments (3 @ 5% each)          | 15%         | 1-11                    |
| Laboratory Sessions (10 @ 6% each) | 60%         | 1-11                    |
| <b>Total</b>                       | <b>100%</b> |                         |

The round off mathematical principle will be used. Percentages are converted to letter grades and grade points as follows:

| Mark (%) | Grade | Grade Point | Mark (%) | Grade | Grade Point |
|----------|-------|-------------|----------|-------|-------------|
| 94-100   | A+    | 4.0         | 67-69    | C+    | 2.3         |
| 87-93    | A     | 3.7         | 63-66    | C     | 2.0         |
| 80-86    | A-    | 3.5         | 60-62    | C-    | 1.7         |
| 77-79    | B+    | 3.2         | 50-59    | D     | 1.0         |
| 73-76    | B     | 3.0         | 0-49     | F     | 0.0         |
| 70-72    | B-    | 2.7         |          |       |             |

**8. Academic Integrity**

Lambton College is committed to high ethical standards in all academic activities within the College, including research, reporting and learning assessment (e.g. tests, lab reports, essays).

The cornerstone of academic integrity and professional reputation is principled conduct. All scholastic and academic activity must be free of all forms of academic dishonesty, including copying, plagiarism and cheating.

Lambton College will not tolerate any academic dishonesty, a position reflected in Lambton College policy. Students should make themselves familiar with the Students Rights and Responsibilities Policy, located on the MyLambton website for details concerning academic dishonesty and the penalties for dishonesty and unethical conduct.

Questions regarding this policy, or requests for additional clarification, should be directed to the Lambton College Centre for Academic Integrity

## **9. Related Items**

### ***Students with Disabilities***

If you are a student with a disability please identify your needs to the professor and/or the Accessibility Centre so that support services can be arranged for you. You can do this by making an appointment at the Accessibility Centre, Room L103 ext.3427 or by arranging a personal interview with the professor to discuss your needs.

### ***Student Rights and Responsibility Policy***

*Acceptable behaviour in class is established by the instructor and is expected by all students. Any form of harassment or violence will not be tolerated. Action will be taken as outlined in Lambton College policy.*

*Cheating and plagiarism are serious academic offences subject to disciplinary action. It is the student's responsibility to be aware of the cheating policy as described in the Lambton College Student Rights and Responsibilities policy. For further information on all of these policies, links may be found on the Lambton College website.*

### ***Prior Learning Assessment Statement***

*This course is eligible for Prior Learning Assessment*

Yes       No

*If yes has been selected, you may choose to contact the Counselling Department for advice on Prior Learning Assessment.*

### ***Date of Withdrawal without Academic Penalty***

*Please consult the Academic Regulations and Registrar's published dates.*

### ***Waiver of Responsibility***

*Every attempt has been made to ensure the accuracy of this information as of the date of publication. The content may be modified, without notice, as deemed appropriate by the College.*

*Note: It is the student's responsibility to retain course outlines for possible future use to support applications for transfer of credit to other educational institutions.*

## DIGITAL HARDWARE REVIEW

- our goal is to go from theory to the architecture of modern embedded processors
- assuming this is not the first time you've seen digital logic presented  
→ informal survey/histogram



- KEY ASPECT → A WAY TO DIGITAL DESIGN IS ABSTRACTION



THIS CLASS

SCOPE OF EMBEDDED PROGRAM

## DIGITAL HARDWARE REVIEW

- Much of this program what makes it of cutting edge is the foregrounding of Linux as the context of embedded design
  - is Linux the only solution?
  - what does running Linux demand of the hardware?
  - not every embedded processor is capable of running Linux - they lack other resources (memory, processor speed)
  - you should be aware that you don't need Linux or a powerful processor, but studying Linux systems should prepare you to handle complex embedded applications
  - looking at ARM NRM processors

also allows you to your  
skills in  
the domain

Cortex-M      Cortex-R      Cortex-A

performance

power consumption

memory intensive

- why is Linux a great solution, possibly the future

## DIGITAL HARDWARE REVIEW

- in addition to abstraction digital hardware designers employ "discipline":
  - intentionally restricting their design choices so that they can work more efficiently at a higher level of abstraction
  - e.g. digital circuits are much easier to design than analog circuits, if we simple and by restricting ourselves to simple digital numbers, we can more easily design systems
  - great for production, perhaps not so great for innovation
- the three "y's"
  - hierarchy - subdivision of system into modules
  - modularity - defining modules and interfaces for easy interconnection
  - regularity - striving for uniformity of the modules
- digital representation itself reflects some of this thinking — it gives us a disciplined means of working with numbers (not as accurate as other real numbers) that leads to some degree of modularity and regularity in the computing hardware

| DECIMAL | BINARY  |
|---------|---------|
| 0       | 0 0 0 0 |
| 1       | 0 0 0 1 |
| 2       | 0 0 1 0 |
| 3       | 0 0 1 1 |
| 4       | 0 1 0 0 |
| 5       | 0 1 0 1 |
| 6       | 0 1 1 0 |
| 7       | 0 1 1 1 |
| 8       | 1 0 0 0 |
| 9       | 1 0 0 1 |
| 10      | 1 0 1 0 |
| 11      | 1 0 1 1 |
| 12      | 1 1 0 0 |
| 13      | 1 1 0 1 |
| 14      | 1 1 1 0 |
| 15      | 1 1 1 1 |

## HEX

|   |
|---|
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| A |
| B |
| C |
| D |
| E |
| F |

a great numbering system which complements binary

(since groups of 4 bits are commonly used in computing what about different groupings like 3-bits at a time?)



32-bit word

7  
divisible by 9

|       |   |
|-------|---|
| 0 0 0 | 0 |
| 0 0 1 | 1 |
| 0 1 0 | 2 |
| 0 1 1 | 3 |
| 1 0 0 | 4 |
| 1 0 1 | 5 |
| 1 1 0 | 6 |
| 1 1 1 | 7 |

→ octal!

0 0  
HEX → BINARY  
or  
BINARY → HEX  
is easy





### older convention

k → "kilo" →  $10^3$   
 M → "mega" →  $10^6$

, but kbyte is not 1000 bytes, it's 1024 bytes =  $2^{10}$   
 similarly, megabyte is not  $10^6$  bytes, etc.  
 $2^{20} = 1048576$  bytes

sometimes you'll see

the prefixes "mebi" -  $2^{20}$  and "kibi" -  $2^{10}$ , which respectively

are

and

gibi -  $2^{30}$

tebi -  $2^{40}$

pebi -  $2^{50}$

- microprocessors handle chunks of data  
in portions called words

8-bit words } small microcontroller  
16-bit words } these days for  
32-bit words } embedded applications  
64-bit words }

- signed numbers
    - sign bits
    - two's complement
- ) review yourself

- talk about fixed-point, floating point, trade offs

### LOGIC GATES

- do NOT, Buffer, AND, OR, XOR, { up to 2-inputs  
NAND, NOR, XNOR }

$$Y = A \oplus B$$

$$Y = A \oplus B$$

- multi-input gates:

extensions: XNOR :  $Y = A_1 \oplus A_2 \oplus A_3 \oplus \dots \oplus A_N$

- beneath the abstraction:



# Digital Systems Review

## Combinational and Sequential Logic

- digital circuits are either combinational or sequential

Combinational: outputs depend only on the current values of the inputs (memoryless)

Sequential: outputs depend on both current and previous states of the inputs (have memory)

- there are both timing and functional specifications
  - for comb. logic, we note the lower and upper bounds of the overall delay from input to output
  - truth table characterization
- we begin with functional specification

combinational logic

(follows  
the rule of  
OR gate)



$$Y = F(A, B) = A + B$$

} a generalized  
two-input, single output  
comb. logic system/circuit

- may be implemented in the cost of same implementation

$$F(A, B) = A + B$$



- a full adder example: with multiple outputs



$$S = A \oplus B \oplus C_{in}$$

$$C_{out} = AB + AC_{in} + BC_{in}$$

will learn in Chapter 5

(2)

to simplify  
note slash notation:



multiple lines of input or output are called "buses"  
(singular: "bus")

- we can interconnect combinational logic circuits to create new combinational logic circuits, bearing in mind that there can be no cyclic paths, meaning every path through the circuit visits each circuit node at most once.

### examples:



NOT COMBINATIONAL



COMBINATIONAL



NOT COMBINATIONAL

watch for algebraic loops in systems — every loop must contain some delay (either implicitly or explicitly).

### BOOLEAN EQUATIONS

#### terminology

$\bar{A}$

- the complement or inverse of A
- the variable or its complement are called literals

$\bar{A}B, A\bar{B}\bar{C},$   
 $B, \bar{B}$

- the AND of one or more literals is called a product or implicant

$A+B, A+\bar{C},$   
 $\bar{A}+B, \bar{A}+\bar{C}$

- the OR of one or more literals is called a sum.

the "true form" of A

$A\bar{B}\bar{C}$ ,  $\bar{A}BC$ ,  
 $A\bar{B}C$   
(3-input function)

a minterm is a product involving all inputs of the function

$A+\bar{B}+C$ ,  $A+B+C$ , a maxterm is a sum involving all inputs to the function  
 $C+A+\bar{B}$   
(3 - input function)

A truth table with  $N$  inputs contains  $2^N$  rows.

\* - we can associate a minterm with each row that is true \*  
 $\rightarrow$  ~~Sum~~ sum-of-products form

### EXAMPLE

Suppose for:



we have

| $A \setminus B$ | $Y$ | min term         | name  |
|-----------------|-----|------------------|-------|
| 0 0             | 0   | $\bar{A}\bar{B}$ | $m_0$ |
| 0 1             | 1   | $\bar{A}B$       | $m_1$ |
| 1 0             | 0   | $A\bar{B}$       | $m_2$ |
| 1 1             | 0   | $AB$             | $m_3$ |

thus,  

$$Y = \bar{A}B$$

if the truth table is amended so that  $AB$  also yields  $Y=1$ , then we have

$$Y = \underbrace{\bar{A}B}_{m_1} + \underbrace{AB}_{m_3}$$

$\times F(A, B) = \underbrace{\sum (m_1, m_3)}_{\text{Sum-of-products form}} \text{ or } \underbrace{\sum (1, 3)}_{AB + A\bar{B}}$

Sum-of-products express

EXAMPLE Random 3 input - truth table

| A | B | C | Y |
|---|---|---|---|
| 0 | 0 | 0 | 1 |
| 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 0 |
| 1 | 0 | 0 | 1 |
| 1 | 0 | 1 | 1 |
| 1 | 1 | 0 | 0 |
| 1 | 1 | 1 | 0 |

$m_0$   
 $m_1$   
 $m_2$   
 $m_3$   
 $m_4$   
 $m_5$   
 $m_6$   
 $m_7$

$$F(A, B, C) = \sum(0, 4, 5)$$

but for  $Y = (ABC) + (\bar{A}BC) + (\bar{A}\bar{B}C)$

- \* we can also associate a max term that is false with each row that is false \*

### EXAMPLE

| A | B | Y | max term                                  |
|---|---|---|-------------------------------------------|
| 0 | 0 | 0 | <del><math>A + B</math></del>             |
| 0 | 1 | 1 | <del><math>\bar{A} + B</math></del>       |
| 1 | 0 | 0 | <del><math>A + \bar{B}</math></del>       |
| 1 | 1 | 1 | <del><math>\bar{A} + \bar{B}</math></del> |

| Max term            |
|---------------------|
| $A + B$             |
| $\bar{A} + B$       |
| $A + \bar{B}$       |
| $\bar{A} + \bar{B}$ |

| Max term name |
|---------------|
| $M_0$         |
| $M_1$         |
| $M_2$         |
| $M_3$         |

check  $\therefore Y = (A+B)(\bar{A}+B) = \prod(M_0, M_2) = \prod(0, 2)$

✓

### BOOLEAN ALGEBRA

- we are interested in using Boolean algebra to simplify the expressions we will using sum-of-products or product-of-sums rules.

product of sums

### AXIOMS

| AXIOM |                              |
|-------|------------------------------|
| A1:   | $B = 0 \text{ if } B \neq 1$ |
| A2:   | $\bar{0} = 1$                |
| A3:   | $0 \cdot 0 = 0$              |
| A4:   | $1 \cdot 1 = 1$              |
| A5:   | $0 \cdot 1 = 1 \cdot 0 = 0$  |

### DUAL

|      |                              |
|------|------------------------------|
| A1': | $B = 1 \text{ if } B \neq 0$ |
| A2': | $\bar{1} = 0$                |
| A3': | $1 + 1 = 1$                  |
| A4': | $0 + 0 = 0$                  |
| A5': | $1 + 0 = 0 + 1 = 1$          |

NAME  
binary field  
NOT  
AND/OR  
AND/OR  
AND/OPR

we use the  $p^n m^q$  rule

to derive the dual of a statement

# THEOREMS OF ONE VARIABLE

## THEOREM

- T1:  $B \cdot 1 = B$
- T2:  $B \cdot 0 = 0$
- T3:  $B \cdot B = B$
- T4:  $B \cdot \overline{B} = 0$
- T5:  $B \cdot \overline{B} = 0$

$$\overline{\overline{B}} = B$$

## DUAL

- T1':  $B + 0 = B$
- T2':  $B + 1 = 1$
- T3':  $B + B = B$
- T5':  $B + \overline{B} = 1$

## NAME

identity  
null element  
idempotency  
involution  
complements

## APPLICATIONS:



? two possible realizations of a buffer

also :



# THEOREMS OF SEVERAL VARIABLES

## THEOREM

- T6:  $B \cdot C = C \cdot B$
- T7:  $(B \cdot C) \cdot D = B \cdot (C \cdot D)$
- T8:  $BC + BD = B(C+D)$
- T9:  $B(B+C) = B$
- T10:  $(BC) + (B\bar{C}) = B$
- T11:  $BC + \overline{B}D + CD = BC + BD$

$$T12: \overline{B_0 B_1 B_2 \dots} = \overline{B_0} + \overline{B_1} + \overline{B_2} + \dots$$

## DUAL

- T6':  $B + C = C + B$
- T7':  $(B+C)+D = B+(C+D)$
- T8':  $(B+C)(B+D) = B+(CD)$
- T9':  $B+(BC) = B$
- T10':  $(B+C)(B+\bar{C}) = B$
- T11':  $(B+C)(\bar{B}+D)(C+D) = (B+C)(B+D)$
- T12':  $\overline{B_0 + B_1 + B_2 + \dots} = (\overline{B_0}, \overline{B_1}, \overline{B_2}, \dots)$

## NAME

commutativity  
associativity  
distributivity  
covering  
combining  
consensus

De Morgan's Theorem

two-input realization of De Morgan's Theorem

$$Y = \overline{AB} = \overline{A} + \overline{B}$$

$$Y = \overline{A+B} = \overline{A} \cdot \overline{B}$$



Nand gate is equivalent to

OR gate with inverted inputs

NOR gate is equivalent to  
NAND gate with inverted inputs

- can use the theorem to move between sum-of-products and product-of-sums

Theorems, unlike axioms, can be proven.

Especially in the case of a finite set of variables one can prove a theorem using a truth table — such a proof is called "perfect induction".

## SIMPLIFYING STRATEGIES

Ex.

$$Y = \overline{A}B + AB$$

$$\begin{aligned} &= (\underbrace{\overline{A} + A}_1) B \\ &\quad (\text{complements}) \\ &= B \end{aligned}$$



→ the basic approach to simplifying sum-of-products is to use the relationship  $PA + \overline{P}A = P$ , where  $P$  may be any implicant

ex.  $Y = \overline{A} \overline{B} \overline{C} + A \overline{B} \overline{C} + A \overline{B} C$

(1) Distributivity  $= (\overline{A} + A) \overline{B} \overline{C} + A \overline{B} C$

(2) Boolean algebra  $= \overline{B} \overline{C} + A \overline{B} C$

~~$= (\overline{B} + B) \overline{C} + A \overline{B} C$~~

~~$= (\overline{B} \overline{C} + A \overline{B} C)$~~

~~$= \overline{B} \overline{C} + A \overline{B} C$~~

(re produce terms)

we can actually replicate terms (idempotency) ... thus

$$Y = \overline{A} \overline{B} \overline{C} + A \overline{B} \overline{C} + A \overline{B} C + A \overline{B} C$$

~~$= (\overline{A} + A) \overline{B} \overline{C} + (\overline{C} + C) A \overline{B}$~~

~~$= \overline{B} \overline{C} + A \overline{B}$~~

(which has one fewer AND operation)

Complement  
next page

Note: in CMOS logic ANDS and ORS are preferred to NANDs and NORs.

### MULTI-LEVEL COMB. LOGIC

logic in sum-of-products form is called two-level logic.

because it consists of literals connected to a level of AND gates.



### BUBBLE PUSHING

- for also CL conversion useful in analyzing Multi-level circuits.

to NAND/NOR



next page example

for a level of OR gates  
connected to a level of AND gates.

EX. 3-INPUT NOR IS

A COMPLICATED CIRCUIT  
WITH 14 GATES TO IMPLEMENT  
AS SUM OF PRODUCTS...

IT CAN BE SIMPLIFIED  
BY FIRST USING ASSOCIATIVITY:

$$Y = A \oplus B \oplus C$$

$$= (A \oplus B) \oplus C$$



to NAND/NOR

analyzing Multi-level

bubbles and gates added which do not change the logic of logic circuit. (black original)

## ILLEGAL VALUES : X

(11)



circuit w/ contention,  
an ERROR  
that must be resolved.  
SOLVED.

## FLOATING VALUE : Z

- a node is neither being driven high nor low — it is said to be floating or in a high-impedance state.
- not necessarily an error, but there must be resolution of the node value once circuit operation is required.

## EX. TRI-STATE BUFFER

(3)



| E | A | Y |
|---|---|---|
| 0 | 0 | Z |
| 0 | 1 | Z |
| 1 | 0 | 0 |
| 1 | 1 | 1 |

} only "works"  
in the logic sense  
when E is HIGH.

→ can be used to resolve contentions



a multiplexor

## KARNAUGH MAPS

- another innovation from Bell Labs (circa 1950s)
- based on the idea that  
 $PA + P\bar{A} = P$  ( $A$  is eliminated)

- rules: ① adjacent squares must only differ by a single variable — need to use Gray codes, e.g., rather than

00, 01, 10, 11

use:

~~00, 01, 11, 10~~

or for 3-bit sequence:

000, 001, 010, 011, 100, 101, 110, 111

use: 000, 001, 011, 010, 110, ~~100~~, 111, 101, 100

(in other words try to make the minimum collection of ones instead of many ones)

(generalizes)

- circle groups of adjacent ones, using the fewest circles needed to cover all the ones
  - make circles as large as possible, with only ones inside
  - circles must be of sizes which are based on powers of 2 (e.g., 1x2, 1x4, 2x2, 4x4, etc.)
- circles may wrap around the edges
- a one may be circled multiple times
- read off the implicant corresponding to each circle, ignoring the literals whose values change

Ex. ↴

5



$$Y = \bar{A}\bar{B} + \bar{A}\bar{C}\bar{D} + \bar{A}C\bar{D} + \bar{B}\bar{C}$$

\* DON'T CARES CAN BE TREATED AS ONES, ALLOWING FOR LARGER CIRCLES (AND FEWER CIRCLES) → RESULTING IN GREATER LOGIC MINIMIZATION.

### MUXES

- multiplexor also called a "mux"
- can be realized using combinational logic.

with

- 2 data inputs
- 1 select input
- 1 output Y

MUX is based on  
2 data inputs:

$$\begin{cases} S=0, Y=D_0 \\ S=1, Y=D_1 \end{cases}$$



| S | $D_1$ | $D_0$ | Y |
|---|-------|-------|---|
| 0 | 0     | 0     | 0 |
| 0 | 0     | 1     | 1 |
| 0 | 1     | 0     | 0 |
| 0 | 1     | 1     | 1 |
| 1 | 0     | 0     | 0 |
| 1 | 0     | 1     | 1 |
| 1 | 1     | 0     | 1 |
| 1 | 1     | 1     | 1 |

$$\therefore Y = SD_0 + \bar{S}D_1$$



- can also be implemented using tristate logic

## MULTIPLEXOR LOGIC

- in general, a  $2^N$ -input multiplexor can be programmed to perform any  $N$ -input logic function ...

EX. using an 8:1 mux, implement the following function:

$$Y = AB + \bar{B}\bar{C} + \bar{A}\bar{B}C$$



→ called Alyssa's implementation

truth table:

| A | B | C | Y |
|---|---|---|---|
| 0 | 0 | 0 | 1 |
| 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 1 |
| 1 | 0 | 0 | 0 |
| 1 | 0 | 1 | 1 |
| 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 0 |

✓

~~you can also incorporate one of literals as an input, allowing the size of the multiplexor to be cut in half.~~

DECODER essentially

(8) - essentially renumbers the minterms... using a vector-valued output...



→ can use an OR gate to realize logic based on sum-of-products...

|  | $A_1$ | $A_0$ | $Y_1$ | $Y_2$ | $Y_3$ | $Y_4$ | minterm               |
|--|-------|-------|-------|-------|-------|-------|-----------------------|
|  | 0     | 0     | 1     | 0     | 0     | 0     | $A_1 A_0$             |
|  | 0     | 1     | 0     | 1     | 0     | 0     | $A_1 \bar{A}_0$       |
|  | 1     | 0     | 0     | 0     | 1     | 0     | $\bar{A}_1 A_0$       |
|  | 1     | 1     | 1     | 0     | 0     | 0     | $\bar{A}_1 \bar{A}_0$ |

$A_1$      $A_0$   
 $\bar{A}_1$      $\bar{A}_0$   
 $A_1$      $\bar{A}_0$   
 $\bar{A}_1$      $A_0$

## TIMING



(Combinational logic is characterized by its propagation delay & contamination delay.)

- delays exist owing to gate-capacitance charge times, and the speed of charge propagation in the circuit, generally on the order of nano-seconds ( $10^{-9}$ ) or pico-seconds ( $10^{-12}$ )

$t_{pd}$ : (maximum) time from when an input changes until the outputs reach their final value

$t_{cd}$ : (minimum) time from when an input changes value until any output starts to change its value

- signals propagate according to paths, which can vary in length owing mostly to the number of intervening gates

EX.



- propagation delay is often associated with the critical path

- contamination delay is often associated with the short path

→ GLITCHES are the result of differences in path lengths, which can cause the output to change state; essentially, a single variable crosses the boundary between two implicants in the K-map; we can introduce additional logic to produce redundant implicants and eliminate the glitch if necessary (at the expense of extra hardware) - see 2.9 of text

## SEQUENTIAL LOGIC DESIGN

- logic with "states" or memory
- a fundamental building block of memory is the bistable element, a circuit with two stable states.

SR Latch



can analyze as a series of cases, as shown in your text, section 3.2.1, summarized as

| S | R | Q | $\bar{Q}$ | Notes               |
|---|---|---|-----------|---------------------|
| 0 | 0 | 0 | 1         | nothing changes     |
| 1 | 0 | 1 | 0         | Previous state lost |
| 1 | 1 | 0 | 0         | Normal use          |
| 0 | 1 | 0 | 1         |                     |

symbolically:



D Latch - separates the ideas of "what" and "when"



| CLK | D | Q | $\bar{Q}$ |
|-----|---|---|-----------|
| 0   | X | 0 | 1         |
| 1   | 0 | 1 | 0         |
| 1   | 1 | 0 | 1         |

X - don't care state

We avoid the awkward  $S=1, R=1$  case

nothing  
changes  
unless  
CLK is  
HIGH

latch is said to be  
"opaque" - it blocks new data



## D Flip-Flop - edge-triggered flip-flop



symbolically:



not



The D flip-flop copies D to Q on the rising edge of the clock, and remembers its state at all other times.

### REGISTER



- all bits are updated at the same time

- also written as:



### OTHER FLIP FLOPS:



enabled flip-flop  
enabled.



(synchronously)  
resettable  
flip-flop

(synchronously)  
resettable

## Timing considerations in sequential logic...

- Considerations
- race conditions can occur when the behaviour of the circuit depends on which path through a logic circuit is the fastest
- delay-dependent behaviour is undesirable and are considered to be malfunctions.
- cyclic paths can be challenging to design correctly! due to the fact that outputs are fed directly back into inputs.
- to AVOID, designers often insert registers into paths, effectively creating a collection of combinational circuits and registers — simpler to analyse!

Synchronous Sequential Circuits  
 If the clock is sufficiently slow so that all inputs to the registers settle before the next clock edge, ALL RACES ARE ELIMINATED



## SYNCHRONOUS SEQUENTIAL CIRCUITS

### key attributes:

- has a finite set of discrete states  $\{S_0, S_1, \dots, S_{n-1}\}$

- has a clock input, whose rising edges indicate a sequence of times at which state transitions can occur

-  $t_{pq}$ : time of propagation from CLK to Q

$t_{cq}$ : time of contamination from CLK to Q

### RULES:

- every cyclic path contains at least one register
- all registers receive the same clock signal
- every cct. element is either a register or CL
- at least one cct. element is a register
- somewhat conservative.  
(Somewhat conservative)

(16)

Ex. the simplest  
synchronous sequential,  
etc.



a flip-flop is a simplest synchronous  
it has one input, D, one clock, one output, Q, and two  
states {0,1}

# UNIT 1

## FINITE STATE MACHINES

(2)

- an approach for synchronous sequential logic
- we consider two forms:

a) Moore:

(3)

inputs  $M$

next state  
 $C$

CLK

state  
 $C$

output  
 $f$

$N$

outputs

b) Mealy:

(4)

inputs  $M$

next state  
 $C$

CLK

state  
 $C$

output  
 $f$

$N$

outputs

→ founder of Intel

- both Moore (not Gordon, but Edward) and Mealy developed Automata Theory while working at Bell Labs in the 1950s.

- Design Example (from text):

(5)

ACADEMIC



AVE.

black Box View of Finite State machine

2



- the RESET signal initializes the system to GREEN on ACADEMIC AVE and RED on BRAVADO BLVD.
- as long as traffic is present on AA, the light remains GREEN. If traffic stops on AA, the light follows by YELLOW followed by RED.
- similarly, the lights on BB remain GREEN as long as traffic is present.

### STATE TRANSITION DIAGRAM



R : RED  
G : GREEN  
Y : YELLOW

CLOCK PERIOD: 5 s.

### STATE TRANSITION TABLE

(6)

| Current state, $s$ | inputs |       | next state, $s'$ |
|--------------------|--------|-------|------------------|
|                    | $T_A$  | $T_B$ |                  |
| $s_0$              | 1      | X     | $s_0$            |
| $s_0$              | 0      | X     | $s_1$            |
| $s_1$              | X      | X     | $s_2$            |
| $s_2$              | X      | 1     | $s_2$            |
| $s_2$              | X      | 0     | $s_3$            |
| $s_3$              | X      | X     | $s_0$            |

whenever the next state  
 (X) doesn't depend  
 on a particular  
 input  
 "don't care"

binary encodings

$$S_{1:0} = \{s_0, s_1\} = \{00, 01, 10, 11\}$$

$$L_{1:0} = \{ \text{GREEN}, \text{YELLOW}, \text{RED} \}$$

TABLE WITH BINARY ENCODINGS:

| $S_1$ | $S_0$ | $T_A$ | $T_B$ | $S'_1$ | $S'_0$ |
|-------|-------|-------|-------|--------|--------|
| 0     | 0     | 1     | X     | 0      | 0      |
| 0     | 0     | 0     | X     | 0      | 1      |
| 0     | 1     | X     | X     | 1      | 0      |
| 1     | 0     | X     | 1     | 1      | 1      |
| 1     | 0     | X     | 0     | 1      | 1      |
| 1     | 1     | X     | X     | 0      | 0      |

sum of products for next state logic:

$$S'_1 = \overline{s}_1 s_0 + s_1 \overline{s}_0 T_B + s_1 \overline{s}_0 \overline{T}_B = \overline{s}_1 s_0 + s_1 \overline{s}_0 = S_1 \oplus S_0$$

(by inspection)

$$S'_0 = \overline{s}_1 \overline{s}_0 \overline{T}_A + s_1 \overline{s}_0 T_B$$

## OUTPUT TABLE

• State  $\rightarrow$  outputs

| $S_1$ | $S_0$ | $L_{A_1}$ | $L_{A_0}$ | $L_B$ | $L_{B_0}$ |
|-------|-------|-----------|-----------|-------|-----------|
| 0     | 0     | 0         | 0         | 0     | 0         |
| 0     | 1     | 0         | 1         | 0     | 0         |
| 1     | 0     | 1         | 0         | 0     | 1         |
| 1     | 1     | 1         | 0         | 0     | 0         |

Therefore:

$$L_{A_1} = S_1$$

$$L_{A_0} = \overline{S_1}S_0 + \overline{S_1}S_0 = S_1$$

$$L_{B_1} = S_1S_0$$

$$L_{B_0} = \overline{S_1}S_0$$

## REGISTER



} on the rising edge  
of each clock,  
it copies  $S_1S_0$  to  
 $S_1S_0$  so that the  
current state becomes  
the next state

## state-machine circuit:



- note that implementation depends on the choice of state and output encodings. — therefore, therefore some encodings may produce simpler (and more cost-effective) implementations than others.

→ there is no general approach to find the best encodings, but we can use inspection and CAD tools to help us optimize.

## Moore vs. Mealy

Explained in an example)

- consider a robotic snail that is reading a binary signal painted on the floor — each time it encounters a "0" it "smiles" and it crawls along, reading in a new bit each cycle.

(2)

$s_1 s_0$

Crawl

5

# Robot Snail Example, from Textbook

input:  $A = \text{current bit read}$

output: smile / nod smile

Mode: RESET



read 0 / "not smiling"

each arc  
is labeled as A/Y

### Moore State Transition Table

| Current State<br>S | Input<br>A | Next State<br>S' |
|--------------------|------------|------------------|
| S <sub>0</sub>     | 0          | S <sub>1</sub>   |
| S <sub>0</sub>     | 1          | S <sub>0</sub>   |
| S <sub>1</sub>     | 0          | S <sub>1</sub>   |
| S <sub>1</sub>     | 1          | S <sub>2</sub>   |
| S <sub>2</sub>     | 0          | S <sub>1</sub>   |
| S <sub>2</sub>     | 1          | S <sub>0</sub>   |

with  
binary  
encoding

| S <sub>1</sub> , S <sub>0</sub> | A | S' <sub>1</sub> , S' <sub>0</sub> |
|---------------------------------|---|-----------------------------------|
| 0 0                             | 0 | 0 1                               |
| 0 0                             | 1 | 0 0                               |
| 0 1                             | 0 | 1 0                               |
| 0 1                             | 1 | 1 1                               |
| 1 0                             | 0 | 0 1                               |
| 1 0                             | 1 | 0 0                               |

$$\begin{aligned} S_0 &= 00 \\ S_1 &= 01 \\ S_2 &= 10 \end{aligned}$$

$$S'_1 = \bar{S}_1 S_0 A$$

$$\begin{aligned} S'_0 &= \bar{S}_1 S_0 \bar{A} + S_1 \bar{S}_0 \bar{A} + S_1 S_0 A \\ &= \bar{A} (\bar{S}_1 S_0 + S_1 \bar{S}_0) + S_1 S_0 A \end{aligned}$$

### Moore Output Table

| Current State<br>S | Output<br>Y |
|--------------------|-------------|
| S <sub>0</sub>     | 0           |
| S <sub>1</sub>     | 0           |
| S <sub>2</sub>     | 1           |

| S                                      | Y |
|----------------------------------------|---|
| S <sub>1</sub> , S <sub>0</sub><br>0 0 | 0 |
| 0 1                                    | 0 |
| 1 0                                    | 1 |

$$Y = S_1$$

### Mealy State Transition and Output Table

| S              | A | S'             | Y |
|----------------|---|----------------|---|
| S <sub>0</sub> | 0 | S <sub>1</sub> | 0 |
| S <sub>0</sub> | 1 | S <sub>0</sub> | 0 |
| S <sub>1</sub> | 0 | S <sub>1</sub> | 0 |
| S <sub>1</sub> | 1 | S <sub>0</sub> | 1 |

binary encodings:

| S <sub>0</sub> | A | S' <sub>0</sub> | Y |
|----------------|---|-----------------|---|
| 0              | 0 | 1               | 0 |
| 0              | 1 | 0               | 0 |
| 1              | 0 | 0               | 0 |
| 1              | 1 | 1               | 1 |

$$\begin{aligned} S'_0 &= \bar{A} \\ Y &= S_0 A \end{aligned}$$

$$(S_0 A + S_0 \bar{A}) \\ (\bar{A})(S_0 + S_0) \\ T_1$$

- additional topics:

- **FACtoring FSMs** - breaking a problem down into smaller pieces
- **breaking a problem down into simpler pieces**

- **DRAWING FSMs FROM SCHEMATICS**

- **THE DYNAMIC DISCIPLINE**

- **aperture** — (aperture)

the setup time (before the clock edge) and the hold time (after the clock edge), during which the digital signal remains stable until it has settled.

(Cycle time)

$$T_C \geq t_{PCQ} + t_{PD} + t_{Setup}$$

to satisfy the setup time of  $R_2$ .  
 $\rightarrow D_2$  must

$\uparrow$   
 clock to  
 propagation  
 delay  
 (first stage)

stable



# UNIT 1

## STORED PROGRAM COMPUTER ARCHITECTURES

two kinds we consider:

- von Neumann (Princeton) architecture
- Harvard architecture
- original programmable computer architecture (so possibly Turing's "universal computing machine", now known as the Universal Turing Machine)
- stored program computer

### ① Von Neumann



- instruction fetch and a data operation (read/write) cannot occur simultaneously because they share a common bus

- it is a simplified design compared to Harvard architecture, but this "von Neumann bottleneck" limits the speed of the machine.

## (2) Harvard architecture

- separate and independently accessible instruction and data storage



- (2) has  
• bus sizes for instruction and data memories  
can differ (as can associated word lengths)

elements of Harvard Architecture are used effectively in modern DSP and microcontrollers by companies like TI, Microchip and Atmel.

## Integration Methods



often  
flash ROM  
in a small  
amount of  
RAM

the microcontroller or MCU is an example  
of a system-on-a-chip (SoC) → a system  
integrated on a single silicon die.

- often these components are reduced in size and complexity compared with the components associated with powerful microprocessors, but can offer considerable functionality at lower cost requirements.

## Unit 2

### Memory Arrays

- building blocks that can efficiently store large amounts of data
- three types:
  - ① Dynamic Random Access Memory (DRAM)
  - ② Static Random Access Memory (SRAM)
  - ③ Read-Only Memory (ROM)

### Overview

- o a generic memory array



$2^N$  M-bit words are contained in this (B)

- memory is organized as a two-dimensional array of "memory cells"
- we read or write one row of this array, and rows are identified by an address.
- The value read or written is called data

Ex.

consists of:

2 address bits

3 data bits

2<sup>2</sup> = 4 Rows  
3 Columns



M

| Data |   |   |   |
|------|---|---|---|
| 0    | 1 | 0 |   |
| 1    | 0 | 1 | 0 |
| 0    | 1 | 0 | 0 |
| 0    | 0 | 1 | 1 |

depth

Some possible  
Content of the  
Memory Array

width

## Bit Cells

(4)

wordline

bitline

stored bit

- to read a bit, bit line is left floating, initially.
- then the wordline is turned ON, allowing the stored value to drive the bitline to 0 or 1.
- to write a bit, the bitline is DRIVEN to the desired value.
  - the wordline is then turned ON, allowing the bitline (driven strongly) to overpower the contents of the cell, writing the desired value to the stored bit.

bit cell array:

do (5)

bitline<sub>1</sub>

bitline<sub>6</sub>

bitline<sub>2</sub>

001 wordline<sub>2</sub>

0

1

0

010 wordline<sub>1</sub>

0

0

1

100 wordline<sub>0</sub>

1

1

0

"one hot" encoding

↓

need a decoder

Data<sub>2</sub>

Data<sub>1</sub>

Data<sub>0</sub>

## Memory Ports → Ports

- each port gives read / write access to one memory address — previous examples were single-ported memory arrays.
- single-ported
- an example of a memory w/ two read ports and one write port:



read data from address A1 onto the read data output  
RD1

PORT 1 : A1 + RD1

PORT 2 : A2 + RD2

PORT 3 : A3 - WD3

(writes on rising edge  
of CLK if  
WE<sup>3</sup> is asserted)

Classification:  
memory classification:

Digital  
Memory

RAM  
(random  
access  
memory)

"random access"  
means that the  
delay associated  
with accessing  
any element is  
constant

Example:

ROM  
(read-only  
memory)

In modern ROMs  
are also random  
access and  
can be written!

major  
distinction is  
that RAM is volatile (needs power)  
and ROM is non-volatile

DRAM (de=RAM) Dynamic Random access memory.

- bit value is stored on a capacitor

DRAM bit cell:



- reading destroys stored value on capacitor, so the word must be rewritten after each read, because of leakage, the charge on the capacitor must be restored every few milliseconds.



- static RAM because stored bits do not need to be refreshed — the cross coupled inverters are regenerative.

- each cell has two inputs/outputs, bitline and anti-bitline

## Volatile Memorials

- include DRAM, SRAM and type flip-flop
- |      |                    |              |
|------|--------------------|--------------|
| SRAM | xstrs per bit cell | $\approx 20$ |
| DRAM | 6                  |              |

Rig flop S,  
FF S  
latency fast  
medium  
slow



## Register File ①

- usually an SRAM array with multiple ports
- more compact than an array of flip-flops

ex. 16 register  $\times$  32 bit three ported register file



two read ports,  
one write port

## ROM

- stores a bit as the presence or absence of a transistor
- to read the cell, the bit line is weakly pulled high, then the word line is turned ON
- if a transistor is present, it pulls the bit line LOW; if absent, the bit line remains HIGH.

schematics:

bit cell containing 0:

bit cell containing 1:



we can represent the contents of a ROM using dot notation (Content)

Ex. 4-word by 3-bit ROM



| address | contents (data) |
|---------|-----------------|
| 0 0     | 0 1 1           |
| 0 1     | 1 1 0           |
| 1 0     | 1 0 0           |
| 1 1     | 0 1 0           |

ROMs can be used to perform combinational logic functions using this technique.



- also, a ROM can be realized using combinational logic
- we can build the above 4-word 3-bit example as



## PROM

### (4) - programmable ROM

- a one-time programmable ROM uses fuses which are blown (permanently opened) using a high strength signal:



### - other variations:

- EEPROM - can use UV light to turn off transistors (N<sup>+</sup> EEPROM)  
(electrically erasable)
- EEPROM and Flash memories use on-chip circuitry for programming bit cells.

Home work: 5.55, 5.57

Flash memory (unlike EEPROMs) does not have to be completely erased before being rewritten

NAND flash -  
 memory is read and written in blocks (which are much smaller than the total memory)  
 (Corrosion X)  
 not a random access memory,  
 it is more like a mass storage device  
 - high density

NOR flash -  
 allows a single word to be read or written independently  
 (but has long erase and write times)

- good when rare updates are needed (as in the case of BIOS or boot loader code)  
 boot loader

don't write X - all flash has finite durability from as few as 100 up to 1000 000 erase cycles

→ flash memory must be managed to spread out erasure of blocks evenly in order to maximize the lifetime of the device.

(4)

\*do this before  
Programmed  
I/O

for Lab 5

# UNIT 3 The ALU (Arithmetic/Logical Unit)



| ALU Control <sub>50</sub> | Function |
|---------------------------|----------|
| 00                        | Add      |
| 01                        | Subtract |
| 10                        | AND      |
| 11                        | OR       |

the list of typical functions that  
ALU can perform:

Implementation of the ALU:



Contains:

- N-bit adder
- N two input AND & OR gate
- inverters
- multiplexer to invert inputs when the ALU Control is asserted.

(A 4-to-1 multiplexer chooses the desired function based on ALU Control)

- ② some ALUs produce output flags, such as

o Overflow  
Carry  
Negative  
Zero

conditions detected using combinational logic circuits added to the basic  $N$ -bit ALU design, above.

carry  
overflows  
becomes  
a flag



- ARM 32-bit Micro Architecture - a big FSM (or collection of FSMS!)
- ① we begin with state elements — notice how there are separate instruction and data memories



(6)

- the processor microarchitecture is built to realize various instructions

## UNIT 2

### ARM Architecture

- we're going to jump up a couple levels in the hierarchy to examine architecture hierarchy
- the architecture is the "programmer's view" of the computing machine
- each processor has a native language comprised of words we call instructions which are themselves composed of bits (ones and zeros), like words in a language are composed of letters.
- the instruction set is the complete set of instructions used by an architecture, and in the computing world we have two major categories of architecture, distinguished by their instruction sets, as :

① CISC - complex instruction set computer

② RISC - reduced instruction set computer

our focus

- each of these represents a distinct philosophy, and the goal ultimately is allowing programs to execute as quickly as possible sometimes CISC seems right, other times RISC leads the way ...
- the instructions can be expressed in binary form, a what is known as machine language; alternatively, because the binary form is challenging for humans to remember, we can use mnemonics, which are simple English-word versions of each instruction

- Therefore, assembly language is simply the human-readable version of machine language.

### EXAMPLES

#### High-level Code

$a = b + c;$

$a = b - c;$

$a = b + c - d; // \text{single-line comment}$

Register operands.

$a = b + c;$

High-level code.

$a = b - c;$

$f = (g + h) - (i + j);$

Immediate operands.

$a = a + 4;$

$b = a - 12;$

if (apples == oranges)

$f = i + 1;$

$f = f - i;$

#### ARM Assembly Code

ADD a, b, c

SUB a, b, c

ADD t, b, c  
SUB a, t, d

$t = b + c$   
 $a = t - d$   
↑  
single-line  
comment

; R0 = a, R1 = b, R2 = c

ADD R0, R1, R2; a = b + c

; R0 = a, R1 = b, R2 = c  
R3 = f, R4 = g, R5 = h

; R6 = i, R7 = j

SUB R0, R1, R2; a = b - c  
ADD R8, R4, R5; RB = g + h  
ADD R9, R6, R7; RQ = i + j  
SUB R3, R8, R9; R3 = f

; R7 = a, R8 = b  
ADD R7, R7, #4; a = a + 4  
SUB R8, R7, #0x0; C = 5 - a - 12

; R0 = apples, R1 = oranges  
R2 = f, R3 = i

CMP R0, R1; if apples == oranges?  
BNE L1; if NO, skip next  
ADD R2, R3, #1; f = i + 1

L1 SUB R2, R2, R3; f = f - i

- while CISC machines tend to make it easier to do more at the machine language level, RISC machines are based on a different philosophy:

- ① regularity supports simplicity → ARM: consistent number of operands (three - one destination and two source operands)
- ② make the common case fast
- ③ smaller is faster → ARM: the fewer registers we specify, the less they can be accessed
- ④ good design demands software compromises.

ARM:  
united  
only single,  
commonly  
used in structures

READ: 6.1 → 6.3

## Operands: Registers, Memory, and Constants

- 16 ARM registers

R0 - R12 : variable storage

R13 : SP (stack pointer)

R14 : LR (link register)

R15 : PC (program counter)

- constants or immediates also allowed; values come directly from the instruction itself.

Ex.

$i = 0$   
 $x = 4080$

; R4 = i, R5 = x

→ MOV R4, #0 ; i = 0  
MOV R5, #0x5f0; x = 4080

more instruction

- ARM provides a load register, LDR, to read a data word from memory into a register

High-Level Code

$a = \text{mem}[2];$

ARM Assembly Code

;  $R7 = a$

MOV R5, #0; base address = 0  
LDR R7, [R5, #8]; ;

$\Rightarrow R7 \leftarrow \text{data at memory address } (R5+8)$

Q: why +8?

- ARM uses the store register, STR instruction, to write a data word from

HCL

$\text{mem}[5] = 42;$

AAC

MOV R1, #0; base add=0  
MOV R9, #42  
STR R9, [R1, #0x14]; ;

$\Rightarrow \text{value stored at mem. add. } (R1+20) = 42$

- Word-alignment for LDR and STR instructions as required by ARM → this means that the word address is divisible by 4 (there are also LDRB and STRB for "load byte" and "store byte" which do not need to be word-aligned)

→ to access bytes of individual words, two forms of addressing are used:

"big endian" and

from Jonathan

"little endian"

taken from Swift's Gulliver's Travels novel.

→ system on module.  
an SoM generally has higher performance than  
one SoC because it is like a dedicated (CPU)  
design but rather than made in a PCB, <sup>Printed circuit</sup>  
the components are integrated into a common,  
small package.

- can reduce cost in one sense by  
eliminating the main PCB (printed circuit board)  
*(short)*
- short length interconnects allow for both  
high performance - the benefits of both  
SoC and dedicated MP designs.

# UNIT 2 / UNIT 3

①

## PROGRAMMED I/O

### Memory-mapped I/O

- a portion of the processor's physical address space is dedicated to I/O devices rather than memory

(rather)

- to realize memory-mapped I/O, we make use of an address decoder, registers and a multiplexer.

EX



- software that allows communication with an I/O device is called a device driver; obtaining / writing device drivers is an important aspect of embedded systems design.
- many kinds of peripherals appear as memory-mapped I/O from communication ports, displays, A/D and D/A converters, and general-purpose I/O

## Interrupt I/O

External interrupts allow the development of event-driven systems - when an event occurs, it triggers circuitry to send an interrupt signal to a microcontroller / microprocessor.

- the microcontroller stops whatever it is doing to service the interrupt service.
- the interrupt is mapped to a location in the computer's memory which contains the interrupt service routine which is dedicated to meeting the demands of the event.

Ex. many x86 (Intel systems) have multiple interrupt lines on a separate chip

The 8259 Programmable Interrupt Controller usually programmed in edge-triggered mode

| IRQ   | Line | Description               | Port |
|-------|------|---------------------------|------|
| 0     | -    | system timer              |      |
| 1     | -    | keyboard controller       |      |
| 3/4   | -    | serial port controllers   |      |
| 8     | -    | RTC (real-time clock)     |      |
| 10/11 | -    | left open for peripherals |      |
| 12    | -    | PS/2 mouse                |      |

→ the 8259 has been surpassed by the APIC (Advanced programmable interrupt controller) on newer Intel systems

→ under Linux, execute

\$ cat /proc/interrupts

to view the interrupt mapping on your PC

## Direct-Memory Access (DMA)

D<sub>irect</sub>

send receive

- a technology that allows an I/O device to send or receive data to or from the computer memory, without the intervention of the CPU.
- a process often managed by a DMA controller chip (n DMAc) <sup>controller</sup>
- the idea is to free up the CPU to handle other tasks while an I/O device functions; The DMAC issues an interrupt when data transfers are completed.
- there may be multiple DMA channels in a computer system, with data transfer rates dependent on the system specifications <sup>uses</sup> <sub>bytes are</sub>



Ex.

- therefore, second-LSB of the second memory word has the address

- $4 + 2 = 6$  in BE convention
- $4 + 1 = 5$  " LE

- the second-MSB of the memory word at address C has the address:

- $C + 2 = E$  in LE
- $C + 1 = D$  in BE

## PROGRAMMING

- while not as accessible as high-level programming there are ways to translate high-level language constructs into assembly language.

## DATA PROCESSING INSTRUCTIONS

### logical instructions



→ three operands : R<sub>dest</sub>, R<sub>source1</sub>, R<sub>source2/I<sub>2</sub></sub>  
 ↑                   ↑                   ↑  
 a register   a register   a register or immediate

Ex.

|     |     |      |     |
|-----|-----|------|-----|
| AND | R3, | R1,  | R2  |
| ORR | R4, | R1,  | R2  |
| EOR | R5, | R1,  | #73 |
| MVN | R7, | R2   |     |
| MVN | R7, | #537 |     |

### shift instructions

L SL  
L SR  
A SR  
R OR

not ROL  
since an RL  
can be performed  
by ROR using  
the complementary  
amount

(logical shift left) → LSBs are filled with zeros  
" " right) → MSBs " "  
{ " " shift right) → preserves sign bit  
(rotate right)

Ex.

|      |     |     |     |
|------|-----|-----|-----|
| L SL | R0, | R5, | #7  |
| L SR | R1, | R5, | #17 |
| A SR | R2, | R5, | #3  |
| L SL | R4, | R8, | R6  |
| R OR | R5, | R8, | R6  |

### multiply instructions

- 32 bit operands yield a product of up to 64 bits

$$\text{SEE: } (2^{32} \times 2^{32}) = 2^{64}$$

MUL

← discards the MSBs

$$\text{or } (2^{17} \times 2^{16}) = 2^{33}$$

produce  
64-bit  
results

U MULL  
S MULL

(unsigned multiply long)  
(signed " "

Ex.

UMULL R1, R2, R3, R4 ; R3 × R4 → R2 | R1

product is  
stored in two  
registers

MLA  
SMLAL  
UMLAL } multiply-accumulate variants, or DSP-like instructions  
(product is added to a running sum)

## CONDITION FLAGS

- we use a special register, the current program status register (CPSR)
- we can affect program flow using this register



CMP R4, R5  
ADD EQ R1, R2, R3

only executes if Z is set

CONDITION MNEMONICS → appended to regular instructions

| cond    | mnemonic | name                  | Cond EX                     |
|---------|----------|-----------------------|-----------------------------|
| 0 0 0 0 | EQ       | equal                 | $\frac{Z}{Z}$               |
| 0 0 0 1 | NE       | not equal             |                             |
| :       |          |                       |                             |
| 0 1 1 0 | VS       | overflow/overflow set | $\frac{V}{V}$               |
| 0 1 1 1 | VC       | no overflow           |                             |
| :       |          |                       |                             |
| 1 1 0 0 | GT       | signed greater than   | $\frac{Z(N\oplus V)}{Z}$    |
| 1 1 0 1 | LE       | " less than or equal  | $Z \text{ OR } (N\oplus V)$ |

→ see Table 6.3 (for full list) in Harris & Harris text.

## BRANCHING

→ branch instruction:

B THERE ;  
B EQ THERE1 ;  
B GT THERE2 ;  
B VS THERE3 ;

go to THERE

branch if equal  
" if (signed) greater

" if overflow

## CONDITIONAL STATEMENTS

### • if/else

#### HIGH LEVEL CODE:

```
if (X == Y)
    f = i + 1;
else
    f = f - i;
```

#### ARM ASSEMBLY CODE:

; R0 = X , R1 = Y , R2 = f , R3 = i  
 CMP R0, R1 ; if  $X == Y$  ?  
 BNE L1 ; if not equal "skip if block"  
 ADD R2, R3, #1 ; if block :  $f = f + 1$   
 B L2 ; skip else block

L1 SUB R2, R2, R3 ; else block :  $f = f - i$

L2

### • switch/case

#### HIGH-LEVEL CODE

|            |                    |
|------------|--------------------|
| switch (X) | $\Sigma$           |
| case 1 :   | $Y = 20$ ; break;  |
| case 2 :   | $Y = 50$ ; break;  |
| case 3 :   | $Y = 100$ ; break; |
| default :  | $Y = 0$ ;          |

#### ARM ASSEMBLY CODE

; R0 = X , R1 = Y  
 CMP R0, #1 ; is case X = 1 ?  
 MOVEQ R1, #20 ; Y = 20  
 BEQ DONE ; break  
 CMP R0, #2 ; is case X = 2 ?  
 MOVEQ R1, #50 ; Y = 50  
 BEQ DONE ; break

make assignment #2 due next week

Homework : 6.1 - 6.5

... cont'd

CMP R0, #3  
MOVEQ RI, #100  
BEQ DONE  
MOV RI, #0

DONE

i. is case  
Y = 100  
break  
default case, Y = 0

X = 3?

## LOOPS

- while loops

HLC:

```
int pow = 1;
int x = 0;
while (pow != 128) {
    pow = pow * 2;
    x++;
}
```

→ finds power of  
2 that yields  
128

AAL:

```
; R0 = pow, RI = x
MOV R0, #1 ; pow = 1
MOV RI, #0 ; x = 0
```

WHILE

```
CMP R0, #128 ; pow == 128?
BEQ DONE ; if so, exit
LSL R0, R0, #1 ; pow = pow * 2
ADD R1, R1, #1 ; x = x +
B WHILE ; unconditional branch to continue loop
```

DONE

- for loops

HLC: adds the numbers from 0 to 9

```
int i;
int sum = 0;
for (i=0; i<10; i=i+1) {
    sum = sum + i;
```

3

AAC:

; RD = i, RI = sum

MOV RI, #0 ; sum = 0  
MOV RD, #0 ; i = 0

loop initialization

FOR

CMP RD, #10 ; i < 10?  
BGE DONE ; check opposite condition, i ≥ 10, to exit  
ADD RI, RI, RD ; sum = sum + i  
ADD RD, RD, #1 ; increment index  
B FOR ; repeat (unconditional branch)

DONE

## MEMORY

- one of the simplest data structures  
- implemented as the array which  
is composed of memory words stored  
at consecutive addresses...  
e.g. consider an array scores [ ] with 200 elements  
memory address | MAIN MEMORY

|                 |            |
|-----------------|------------|
| 1 4 0 0 0 3 1 C | scores[99] |
| 1 4 0 0 0 3 1 B | scores[98] |
| :               | :          |
| 1 4 0 0 0 0 0 8 | scores[2]  |
| 1 4 0 0 0 0 0 4 | scores[1]  |
|                 | scores[1]  |

(17)

HLC:

```

int i;
int scores [200];
...
for (i=0; i<200; i=i+1)
    scores [i] = scores[i] + 10;

```

AAC:

; R0 = array base address , R1 = i

; initialization code ...

MOV R0, #0x14000000 ; R0 = base address  
 MOV R1, #0 ; i=0

LOOP

CMP R1, #200 ; i < 200 ?  
 BGE L3 ; if  $i \geq 200$ , exit  
 LSL R2, R1, #2 ; R2 =  $i \times 4$  (32-bit addressing)  
 LDR R3, [R0, R2] ; R3 = scores[i]  
 ADD R3, R3, #10 ; R3 = scores[i] + 10,  
 STR R3, [R0, R2] ; scores[i] = scores[i] + 10  
 ADD R1, R1, #1 ; i = i + 1  
 B LOOP ; repeat

L3

; program comes here when loop is complete

## ARM INDEXING MODES

mode

ARM ASSEMBLY

ADDRESS

BASE REGISTER  
(R1)

Offset

LDR R0, [R1, R2]

R1 + R2

unchanged

Pre-Index

LDR R0, [R1, R2]!

R1 + R2

R1 = R1 + R2

Post-Index

LDR R0, [R1], R2

R1

R1 = R1 + R2

- see example in text → it's possible to simplify program further  
 - read about ASCII codes

## Byte-level addressing



## FUNCTION CALLS

- ARM architecture uses the branch and link instruction (BL) to call a function and moves the link register to the program counter (PC) (MOV PC, LR) to return from a function

HLC:

```
int main() {
    simple();
    .
    .
}

void simple() {
    return; // void means no return value
}
```

AAC:

MAIN ... ; main code begins here

:

BL SIMPLE ; call the simple() function

SIMPLE

MOV PC, LR ; return

LR  $\leftrightarrow$  R14  
PC  $\leftrightarrow$  R15

recall more complex example:

HLC:

int main () {

int : yi

y = diff of sums (2,3,4,5);

}

int diff of sums (int f, int g, int h, int i) {

int result  
result = (f+g) - (h+i);  
return result;

}

AAC:

; R4 = y

MAIN :

MOV R0, #2 ; argument  
 MOV R1, #3 ;  
 MOV R2, #4 ;  
 MOV R3, #5 ;  
 BL DIFFOFSUMS ; call function  
 MOV R4, R0 ; y = returned value

; R4 = result

DIFFOFSUMS:

ADD R8, R0, R1 ; R8 = f + g  
 ADD R9, R2, R3 ; R9 = h + i  
 SUB R4, R8, R9 ; result = (f + g) - (h + i)  
 MOV R0, R4 ; put return value in R0  
 MOV PC, LR ; return to caller

# VHDL TUTORIAL — largely taken from Jan Van der Spiegel's on-line VHDL tutorial...

## 1) Introduction

- developed in mid-1980s to facilitate the design of very high speed digital integrated circuits
- created jointly by the IEEE and the U.S. DoD
- other widely used Hardware Description Language is Verilog (perhaps more widely used) — we'll also cover some Verilog
- VHDL stands for Very high speed IC Hardware Description Language
- although HDLs bear some resemblance to most conventional programming languages, there are differences, namely:
  - HDLs are inherently parallel
  - HDLs mimic the behaviour of a physical digital system
  - allows incorporation of timing specs and the ability to describe a system as an interconnection of components (like Simulink/Matlab if you have the experience with that)

Q: how do you feel about much of technology's roots in militarism?

## 2) Representation and Abstraction

- a recurring theme in digital design: levels of abstraction
  - allows us to manage complexity





→ with VHDL, both behavioural and structural descriptions of a system are possible

DATA FLOW → concurrent statements (parallel execution)

ALGORITHMIC → sequential processes/statements

### 3) VHDL File Structure

BASIC BUILDING BLOCK → VHDL ENTITY

- declaration:

```

entity NAME_OF_ENTITY is
  generic (generic_declarations);
  port ( signal_names_1 : mode_type;
         signal_names_2 : mode_type;
         ⋮
         signal_names_K : mode_type);
end [NAME_OF_ENTITY]; 
```

user-selected

optional

one of:  
in  
out  
buffer  
inout

one of several pre-defined signal types  
see next page

THINK OF YOUR VHDL FILE AS:



NOTE:

- VHDL:
  - has reserved key words that cannot be used as names for signals or entities;  
→ keywords and user-defined identifiers are case insensitive
  - employs comment lines by ---;
  - ignores line breaks and extra spaces
  - is a strongly typed language which employs type declarations of every signal, constant or variable

- regarding mode keywords:

in - indicates that the signal is an input

out - indicates that the signal is an output  
which can be read only by other entities

buffer - an output which can be read by the entity itself

inout - a signal which can be either an input or an output

- regarding signal types:

bit - can have the values 0 or 1

bit-vector - a vector of bit values

std-logic

std-logic

std-logic

std-logic-vector

std-logic-vector

can have 9 values corresponding to signal strength

→ preferred over corresponding bit and bit-vector types

boolean - can have the value, TRUE and FALSE  
 integer - can have a range of integer values  
 real - can have a range of real values  
 character - any printing character  
 time - to indicate time

- generic declarations are optional
  - provide local constants used for timing and sizing (bus widths) for the entity

- syntax:

```
generic (
  constant-name-1: type [:= value-1];
  constant-name-2: type [:= value-2];
  :
  constant-name-K: type [:= value-K]);
```

## Some Examples:



-- buzzer example  
entity BUZZER is

```
port (DOOR, IGNITION, SBELT : in std-logic;
      WARNING : out std-logic);
end BUZZER;
```



-- four-to-one multiplexor for 8-bit words  
entity mux4\_to\_1 is

```
port (I0, I1, I2, I3 : in std-logic-vector (7 downto 0);
      SEL : in std-logic-vector (1 downto 0);
      OUT1 : out std-logic-vector (7 downto 0));
end mux4-to-1;
```

■ -- D-flip-flop with set and reset  
entity dff\_sr is  
port (D, CLK, S, R : in std\_logic;  
Q, Qnot : out std\_logic);  
end dff\_sr;



## Architecture Body

- specifies how the circuit operates and how it is implemented
- syntax:

```
architecture architecture-name of name-of-entity is
-- declarations (components, signals, constants, etc)
begin
-- statements
:
end architecture-name;
```

- examples:

■ Entity AND2 is  
port (in1, in2 : in std\_logic;  
out1 : out std\_logic);  
end AND2;

architecture behavioural\_2 of AND2 is  
begin
out1 <= in1 and in2;
end behavioural\_2;

\* ■ for BUZZER example above:

architecture behavioural-buzzer of BUZZER is  
begin

    WARNING ( $\leq$ ) (not DOOR and IGNITION) or  
    (not SBELT and IGNITION);

end behavioural-buzzer;

→ assignment operator

another example:

entity AND2 is

```
port (in1, in2 : in std_logic;
      out1 : out std_logic);
```

end AND2;

architecture behavioural\_2 of AND2 is

begin

```
out1 <= in1 and in2;
```

end behavioural\_2;

entity XNOR2 is

```
port (A, B : in std_logic;
      Z : out std_logic);
```

end XNOR2;

architecture behavioural\_xnor of XNOR2 is

-- signal declaration (of internal signals X, Y)  
signal X, Y : std\_logic;

begin

```
X <= A and B;
Y <= (not A) and (not B);
Z <= X or Y;
```

end behavioural\_xnor;

logic operators: and, or, nand, nor, xor, xnor, not

## Concurrency

The statements in the Architecture Body

begin

```

X <= A and B;
Y <= (not A) and (not B);
Z <= X or Y;
end behavioral;

```

are concurrent statements  $\Rightarrow$  they are executed when one or more of the signals on the RHS change value (i.e., an "event" occurs on one of the signals).

\* There may be (must be?) a propagation delay associated with this change ~~cannot have algebraic loops!~~

## Structural Description

Here is a structural version of BUZZER:

architecture structural\_buzz of BUZZER is

```

-- declarations
component AND2
    port (in1, in2: in std_logic;
          out1: out std_logic);
end component;
component OR2
    port (in1, in2: in std_logic;
          out1: out std_logic);
end component;
component NOT1
    port (in1: in std_logic;
          out1: out std_logic);
end component;
-- signals for interconnect
signal DOOR_NOT, SBELT_NOT, B1, B2 : std_logic;
begin
    U0: NOT1 port map (DOOR, DOOR_NOT);
    U1: NOT1 port map (SBELT, SBELT_NOT);
    U2: AND2 port map ((IGNITION, DOOR_NOT), B1);
    U3: AND2 port map ((IGNITION, SBELT_NOT), B2);
    U4: OR2 port map (B1, B2, WARNING);
end structural;

```