

---

# **Lecture 16: Instruction Level Parallelism**

## **-- Dynamic Scheduling (OOO) via Tomasulo's Approach**

**CSCE 513 Computer Architecture**

**Department of Computer Science and Engineering**

**Yonghong Yan**

**[yanyh@cse.sc.edu](mailto:yanyh@cse.sc.edu)**

**<https://passlab.github.io/CSCE513>**

---

# Topics for Instruction Level Parallelism

---

- 5-stage Pipeline Extension, ILP Introduction, Compiler Techniques, and Branch Prediction
  - C.5, C.6
  - 3.1, 3.2
  - ~~Branch Prediction, C.2, 3.3~~
- **Dynamic Scheduling (OOO)**
  - 3.4, 3.5
- **Hardware Speculation and Static Superscalar/VLIW**
  - 3.6, 3.7
- **Dynamic Superscalar, Advanced Techniques, ARM Cortex-A53, and Intel Core i7**
  - 3.8, 3.9, 3.12
- **SMT: Exploiting Thread-Level Parallelism to Improve Uniprocessor Throughput**
  - 3.11

# Acknowledge and Copyright

---

- Slides adapted from
  - UC Berkeley course “Computer Science 252: Graduate Computer Architecture” of David E. Culler Copyright(C) 2005 UCB
  - UC Berkeley course Computer Science 252, Graduate Computer Architecture Spring 2012 of John Kubiatowicz Copyright(C) 2012 UCB
  - Computer Science 152: Computer Architecture and Engineering, Spring 2016 by Dr. George Michelogiannakis from UC Berkeley
- <https://passlab.github.io/CSCE513/copyrightack.html>

## 3.4 Overcoming Data Hazards With Dynamic Scheduling

---

- **Data Hazards**
- **Control Hazards**

# Types of Data Hazards

Consider executing a sequence of

$$r_k \leq r_i \text{ op } r_j$$

type of instructions

**Data-dependence**

$$\begin{array}{l} r_3 \leq r_1 \text{ op } r_2 \\ r_5 \leq r_3 \text{ op } r_4 \end{array}$$

**Read-after-Write  
(RAW) hazard**

**Anti-dependence**

$$\begin{array}{l} r_3 \leq r_1 \text{ op } r_2 \\ r_1 \leq r_4 \text{ op } r_5 \end{array}$$

**Write-after-Read  
(WAR) hazard**

**Output-dependence**

$$\begin{array}{l} r_3 \leq r_1 \text{ op } r_2 \\ r_3 \leq r_6 \text{ op } r_7 \end{array}$$

**Write-after-Write  
(WAW) hazard**

# Register vs. Memory Dependence

---

**Data hazards due to register operands can be determined at the decode stage, but data hazards due to memory operands can be determined only after computing the effective address**

**Store:**  $M[r1 + \text{disp1}] \leq r2$

**Load:**  $r3 \leq M[r4 + \text{disp2}]$

**Does  $(r1 + \text{disp1}) = (r4 + \text{disp2})$  ?**

# Data Hazards: An Example

|       |        |
|-------|--------|
| $I_1$ | FDIV.D |
| $I_2$ | FLD    |
| $I_3$ | FMUL.D |
| $I_4$ | FDIV.D |
| $I_5$ | FSUB.D |
| $I_6$ | FADD.D |



**RAW Hazards**  
**WAR Hazards**  
**WAW Hazards**

# Instruction Scheduling



*Valid orderings:*

*in-order*       $I_1$        $I_2$        $I_3$        $I_4$        $I_5$

*out-of-order*       $I_2$        $I_1$        $I_3$        $I_4$        $I_5$

*out-of-order*       $I_1$        $I_2$        $I_3$        $I_5$        $I_4$



# Out-of-order Completion

## *In-order Issue*

|                          |         |                                                     |        |     |         | <i>Latency</i> |
|--------------------------|---------|-----------------------------------------------------|--------|-----|---------|----------------|
| $I_1$                    | FDIV.D  | f6,                                                 | f6,    | f4  |         | 4              |
| $I_2$                    | FLD     | f2,                                                 | 45(x3) |     |         | 1              |
| $I_3$                    | FMULT.D | f0,                                                 | f2,    | f4  |         | 3              |
| $I_4$                    | FDIV.D  | f8,                                                 | f6,    | f2  |         | 4              |
| $I_5$                    | FSUB.D  | f10,                                                | f0,    | f6  |         | 1              |
| $I_6$                    | FADD.D  | f6,                                                 | f8,    | f2  |         | 1              |
| <i>in-order comp</i>     |         | 1 2                                                 | 1 2    | 3 4 | 3 5 4 6 | 5 6            |
| <i>out-of-order comp</i> |         | 1 2 <u>2</u> 3 <u>1</u> 4 3 5 <u>5</u> 4 6 <u>6</u> |        |     |         |                |

Underlines are completes

# Register Renaming to Eliminate WAR and WAW Hazards

---

- Example:

|       |            |                                  |
|-------|------------|----------------------------------|
| DIV.D | F0,F2,F4   | Anti-dependence (WAR) on F8      |
| ADD.D | F6,F0,F8   | Also on F6 between S.D and MUL.D |
| S.D   | F6,0(R1)   |                                  |
| SUB.D | F8,F10,F14 | Output dependence (WAW) on F6    |
| MUL.D | F6,F10,F8  |                                  |

+ name dependence with F6 and F8

*No real data sharing between two instructions.*

# Register Renaming

---

- Example:

DIV.D F0,F2,F4

ADD.D F6,F0,F8

S.D F6,0(R1)

SUB.D T2,F10,F14

MUL.D T1,F10,T2



DIV.D F0,F2,F4

ADD.D F6,F0,F8

S.D F6,0(R1)

SUB.D F8,F10,F14

MUL.D F6,F10,F8

- Now only RAW hazards remain, which can be strictly ordered

# Dynamic Scheduling

---

- Rearrange order of instructions to reduce stalls while maintaining data flow
  - *Minimize impacts by RAW Hazards*
  - *Minimize WAW and WAR hazards via Register Renaming*
  - Between registers and memory hazards
- Advantages:
  - Compiler doesn't need to have knowledge of microarchitecture
  - Handles cases where dependencies are unknown at compile time
- Disadvantages:
  - Substantial increase in hardware complexity
  - Complicates exceptions

# Dynamic Scheduling

---

- **Dynamic scheduling implies:**
  - **Out-of-order execution**
  - **Out-of-order completion**
- **May create more WAR and WAW hazards**
  - **Since execution are out-of-order**
- **Scoreboard: C.6**
  - **CDC6600 in 1963**
- **Tomasulo's Approach**
  - **Tracks when operands are available**
  - **Introduces register renaming in hardware**
    - » **Minimizes WAW and WAR hazards**

# Scoreboard Architecture (CDC 6600)



# Tomasulo Algorithm

---

- For IBM 360/91 about 3 years after CDC 6600 (1966)
- Goal: High Performance without special compilers
- Differences between IBM 360 & CDC 6600 ISA
  - IBM has only 2 register specifiers/instr vs. 3 in CDC 6600
  - IBM has 4 FP registers vs. 8 in CDC 6600
  - IBM has memory-register ops
- Why Study? lead to Alpha 21264, HP 8000, MIPS 10000, Pentium II, PowerPC 604, ..., Today's processors

# Organizations of Tomasulo's Algorithm

- **Load/Store buffer**
- **Reservation station**
- **Common data bus**



# Tomasulo Algorithm vs. Scoreboard

---

- Control & buffers **distributed** with Function Units (FU) vs. centralized in scoreboard;
  - FU buffers called “**reservation stations**”; have pending operands
- Registers in instructions replaced by values or pointers to reservation stations(RS); → **register renaming** ;
  - Avoids WAR, WAW hazards
  - More reservation stations than registers, so can do optimizations compilers can't
- Results to FU from RS, **not through registers**, over **Common Data Bus** that broadcasts results to all FUs
- Load and Stores treated as FUs with RSs as well
- Integer instructions can go past branches, allowing FP ops beyond basic block in FP queue

# Register Renaming

- **Register renaming by reservation stations (RS)**

- Each entry contains:
  - » The instruction
  - » Buffered operand values (when available)
  - » Reservation station number of instruction providing the operand values
- RS fetches and buffers an operand as soon as it becomes available (not necessarily involving register file)
- Pending instructions designate the RS to which they will send their output
  - » Result values broadcast on the common data bus (CDB)
- Only the last output updates the register file
- As instructions are issued, the register specifiers are renamed with the reservation station
- May be more reservation stations than registers



# Reservation Station Components

---

**Op:** Operation to perform in the unit (e.g., + or -)

**V<sub>j</sub>, V<sub>k</sub>:** **Value** of Source operands

- Store buffers has V field, result to be stored

**Q<sub>j</sub>, Q<sub>k</sub>:** Reservation stations producing source registers (value to be written)

- Note: No ready flags as in Scoreboard; Q<sub>j</sub>, Q<sub>k</sub>=0 => ready in V<sub>j</sub> or V<sub>k</sub>
- Store buffers only have Q<sub>i</sub> for RS producing result

**Busy:** Indicates reservation station or FU is busy

**Q<sub>i</sub>:** **Register result status**—Indicates which functional unit will write each register, if one exists. Blank when no pending instructions that will write that register.

# Three Stages of Tomasulo Algorithm

---

## 1. Issue—get instruction from FP Op Queue

If reservation station free (no structural hazard),  
control issues instr & sends operands (renames registers).

## 2. Execution—operate on operands (EX)

When both operands ready then execute;  
if not ready, watch Common Data Bus for result

## 3. Write result—finish execution (WB)

Write on Common Data Bus to all awaiting units;  
mark reservation station available

- Normal data bus: data + destination (“go to” bus)
- Common data bus: data + source (“come from” bus)
  - 64 bits of data + 4 bits of Functional Unit source address
  - Write if matches expected Functional Unit (produces result)
  - Does the broadcast

# Tomasulo Organization for the Example



# Tomasulo Example

*Instruction status:*

| Instruction | j   | k   | Exec Write |      |        |
|-------------|-----|-----|------------|------|--------|
|             |     |     | Issue      | Comp | Result |
| LD          | F6  | 34+ | R2         |      |        |
| LD          | F2  | 45+ | R3         |      |        |
| MULTD       | F0  | F2  | F4         |      |        |
| SUBD        | F8  | F6  | F2         |      |        |
| DIVD        | F10 | F0  | F6         |      |        |
| ADDD        | F6  | F8  | F2         |      |        |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 | No   |         |

*Reservation Stations:*

| Time | Name  | Busy | S1 |    | RS |    |    |
|------|-------|------|----|----|----|----|----|
|      |       |      | Op | Vj | Vk | Qj | Qk |
|      | Add1  | No   |    |    |    |    |    |
|      | Add2  | No   |    |    |    |    |    |
|      | Add3  | No   |    |    |    |    |    |
|      | Mult1 | No   |    |    |    |    |    |
|      | Mult2 | No   |    |    |    |    |    |



*Register result status:*

| Clock | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|----|----|----|----|-----|-----|-----|-----|
| 0     | FU |    |    |    |    |     |     |     |     |

# Tomasulo Example Cycle 1

*Instruction status:*

| Instruction |     | <i>j</i> | <i>k</i> | <i>Issue</i> | <i>Exec</i> | <i>Write</i>  |
|-------------|-----|----------|----------|--------------|-------------|---------------|
|             |     |          |          |              | <i>Comp</i> | <i>Result</i> |
| LD          | F6  | 34+      | R2       | 1            |             |               |
| LD          | F2  | 45+      | R3       |              |             |               |
| MULTD       | F0  | F2       | F4       |              |             |               |
| SUBD        | F8  | F6       | F2       |              |             |               |
| DIVD        | F10 | F0       | F6       |              |             |               |
| ADDD        | F6  | F8       | F2       |              |             |               |

|       | <i>Busy</i> | <i>Address</i> |
|-------|-------------|----------------|
| Load1 | Yes         | 34+R2          |
| Load2 | No          |                |
| Load3 | No          |                |

*Reservation Stations:*

| <i>Time</i> | <i>Name</i> | <i>Busy</i> | <i>Op</i> | <i>S1</i>            | <i>S2</i>            | <i>RS</i>            | <i>RS</i>            |
|-------------|-------------|-------------|-----------|----------------------|----------------------|----------------------|----------------------|
|             |             |             |           | <i>V<sub>j</sub></i> | <i>V<sub>k</sub></i> | <i>Q<sub>j</sub></i> | <i>Q<sub>k</sub></i> |
|             | Add1        | No          |           |                      |                      |                      |                      |
|             | Add2        | No          |           |                      |                      |                      |                      |
|             | Add3        | No          |           |                      |                      |                      |                      |
|             | Mult1       | No          |           |                      |                      |                      |                      |
|             | Mult2       | No          |           |                      |                      |                      |                      |



*Register result status:*

| Clock | <i>F0</i> | <i>F2</i> | <i>F4</i> | <i>F6</i> | <i>F8</i> | <i>F10</i> | <i>F12</i> | ... | <i>F30</i> |
|-------|-----------|-----------|-----------|-----------|-----------|------------|------------|-----|------------|
| 1     |           |           |           |           | Load1     |            |            |     |            |

# Tomasulo Example Cycle 2

## Instruction status:

| Instruction | j   | k   | Exec Write |      |        |
|-------------|-----|-----|------------|------|--------|
|             |     |     | Issue      | Comp | Result |
| LD          | F6  | 34+ | R2         | 1    |        |
| LD          | F2  | 45+ | R3         | 2    |        |
| MULTD       | F0  | F2  | F4         |      |        |
| SUBD        | F8  | F6  | F2         |      |        |
| DIVD        | F10 | F0  | F6         |      |        |
| ADDD        | F6  | F8  | F2         |      |        |

|       | Busy | Address |
|-------|------|---------|
| Load1 | Yes  | 34+R2   |
| Load2 | Yes  | 45+R3   |
| Load3 | No   |         |

## Reservation Stations:

| Time | Name  | Busy | S1 |    | RS |    |    |
|------|-------|------|----|----|----|----|----|
|      |       |      | Op | Vj | Vk | Qj | Qk |
|      | Add1  | No   |    |    |    |    |    |
|      | Add2  | No   |    |    |    |    |    |
|      | Add3  | No   |    |    |    |    |    |
|      | Mult1 | No   |    |    |    |    |    |
|      | Mult2 | No   |    |    |    |    |    |



## Register result status:

| Clock | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|-------|----|-------|----|-----|-----|-----|-----|
| 2     |    | Load2 |    | Load1 |    |     |     |     |     |

Allow multiple outstanding loads

# Tomasulo Example Cycle 3

## Instruction status:

| Instruction | j   | k   | Exec  |      |        | Write |           |
|-------------|-----|-----|-------|------|--------|-------|-----------|
|             |     |     | Issue | Comp | Result | Busy  | Address   |
| LD          | F6  | 34+ | R2    | 1    | 3      | Load1 | Yes 34+R2 |
| LD          | F2  | 45+ | R3    | 2    |        | Load2 | Yes 45+R3 |
| MULTD       | F0  | F2  | F4    | 3    |        | Load3 | No        |
| SUBD        | F8  | F6  | F2    |      |        |       |           |
| DIVD        | F10 | F0  | F6    |      |        |       |           |
| ADDD        | F6  | F8  | F2    |      |        |       |           |

## Reservation Stations:

| Time | Name  | Busy | S1    |    | RS    |       |    |
|------|-------|------|-------|----|-------|-------|----|
|      |       |      | Op    | Vj | Vk    | Qj    | Qk |
|      | Add1  | No   |       |    |       |       |    |
|      | Add2  | No   |       |    |       |       |    |
|      | Add3  | No   |       |    |       |       |    |
|      | Mult1 | Yes  | MULTD |    | R(F4) | Load2 |    |
|      | Mult2 | No   |       |    |       |       |    |



## Register result status:

| Clock | F0 | F2    | F4    | F6 | F8    | F10 | F12 | ... | F30 |
|-------|----|-------|-------|----|-------|-----|-----|-----|-----|
| 3     | FU | Mult1 | Load2 |    | Load1 |     |     |     |     |

- Note: registers names are removed ("renamed") in Reservation Stations; MULT issued vs. scoreboard
- Load1 completing; what is waiting for Load1?

# Tomasulo Example Cycle 4

## Instruction status:

| Instruction | j   | k   | Exec  |      |        | Write |
|-------------|-----|-----|-------|------|--------|-------|
|             |     |     | Issue | Comp | Result |       |
| LD          | F6  | 34+ | R2    | 1    | 3      | 4     |
| LD          | F2  | 45+ | R3    | 2    | 4      |       |
| MULTD       | F0  | F2  | F4    | 3    |        |       |
| SUBD        | F8  | F6  | F2    | 4    |        |       |
| DIVD        | F10 | F0  | F6    |      |        |       |
| ADDD        | F6  | F8  | F2    |      |        |       |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | Yes  | 45+R3   |
| Load3 | No   |         |

Waiting for data from memory by the instruction originally in Load1

## Reservation Stations:

| Time | Name  | Busy | Op    | $V_j$ | $V_k$ | $Q_j$ | $Q_k$ |
|------|-------|------|-------|-------|-------|-------|-------|
|      | Add1  | Yes  | SUBD  | M(A1) |       |       | Load2 |
|      | Add2  | No   |       |       |       |       | ←     |
|      | Add3  | No   |       |       |       |       |       |
|      | Mult1 | Yes  | MULTD |       | R(F4) | Load2 |       |
|      | Mult2 | No   |       |       |       |       |       |

## Register result status:

| Clock | F0 | F2    | F4    | F6    | F8   | F10 | F12 | ... | F30 |
|-------|----|-------|-------|-------|------|-----|-----|-----|-----|
| 4     | FU | Mult1 | Load2 | M(A1) | Add1 |     |     |     |     |

- Load2 completing; what is waiting for Load2?

# Tomasulo Example Cycle 5

## Instruction status:

| Instruction | j   | k   | Issue | Exec | Write  | Busy | Address |
|-------------|-----|-----|-------|------|--------|------|---------|
|             |     |     |       | Comp | Result |      |         |
| LD          | F6  | 34+ | R2    | 1    | 3      | 4    | Load1   |
| LD          | F2  | 45+ | R3    | 2    | 4      | 5    | Load2   |
| MULTD       | F0  | F2  | F4    | 3    |        |      | Load3   |
| SUBD        | F8  | F6  | F2    | 4    |        |      |         |
| DIVD        | F10 | F0  | F6    | 5    |        |      |         |
| ADDD        | F6  | F8  | F2    |      |        |      |         |

## Reservation Stations:

| Time | Name  | Busy | Op    | S1             | S2             | RS             | RS             |
|------|-------|------|-------|----------------|----------------|----------------|----------------|
|      |       |      |       | V <sub>j</sub> | V <sub>k</sub> | Q <sub>j</sub> | Q <sub>k</sub> |
| 2    | Add1  | Yes  | SUBD  | M(A1)          | M(A2)          |                |                |
|      | Add2  | No   |       |                |                |                |                |
| 10   | Add3  | No   |       |                |                |                |                |
|      | Mult1 | Yes  | MULTD | M(A2)          | R(F4)          |                |                |
|      | Mult2 | Yes  | DIVD  |                |                | M(A1)          | Mult1          |

Waiting for data from memory by the instruction originally in Load2

## Register result status:

| Clock | F0 | F2    | F4    | F6 | F8    | F10  | F12   | ... | F30 |
|-------|----|-------|-------|----|-------|------|-------|-----|-----|
| 5     | FU | Mult1 | M(A2) |    | M(A1) | Add1 | Mult2 |     |     |

# Tomasulo Example Cycle 6

## Instruction status:

| Instruction | j   | k   | Exec Write |      |        |
|-------------|-----|-----|------------|------|--------|
|             |     |     | Issue      | Comp | Result |
| LD          | F6  | 34+ | R2         | 1    | 3      |
| LD          | F2  | 45+ | R3         | 2    | 4      |
| MULTD       | F0  | F2  | F4         | 3    |        |
| SUBD        | F8  | F6  | F2         | 4    |        |
| DIVD        | F10 | F0  | F6         | 5    |        |
| ADDD        | F6  | F8  | F2         | 6    |        |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

## Reservation Stations:

| Time | Name  | Busy | S1    |       | S2    |       | RS   |    | RS |    |
|------|-------|------|-------|-------|-------|-------|------|----|----|----|
|      |       |      | Op    | Vj    | Vk    | Qj    | Qk   | Qj | Qk | Qj |
| 1    | Add1  | Yes  | SUBD  | M(A1) | M(A2) |       |      |    |    |    |
|      | Add2  | Yes  | ADDD  |       |       | M(A2) | Add1 |    |    |    |
|      | Add3  | No   |       |       |       |       |      |    |    |    |
| 9    | Mult1 | Yes  | MULTD | M(A2) | R(F4) |       |      |    |    |    |
|      | Mult2 | Yes  | DIVD  |       | M(A1) | Mult1 |      |    |    |    |



## Register result status:

| Clock | F0 | F2    | F4    | F6 | F8   | F10  | F12   | ... | F30 |
|-------|----|-------|-------|----|------|------|-------|-----|-----|
| 6     | FU | Mult1 | M(A2) |    | Add2 | Add1 | Mult2 |     |     |

- Issue ADDD here vs. scoreboard?

# Tomasulo Example Cycle 7

## Instruction status:

| Instruction | j   | k   | Exec Write |      |        |
|-------------|-----|-----|------------|------|--------|
|             |     |     | Issue      | Comp | Result |
| LD          | F6  | 34+ | R2         | 1    | 3      |
| LD          | F2  | 45+ | R3         | 2    | 4      |
| MULTD       | F0  | F2  | F4         | 3    |        |
| SUBD        | F8  | F6  | F2         | 4    | 7      |
| DIVD        | F10 | F0  | F6         | 5    |        |
| ADDD        | F6  | F8  | F2         | 6    |        |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

## Reservation Stations:

| Time | Name  | Busy | S1    |       | S2    |       | RS   |    | RS |    |
|------|-------|------|-------|-------|-------|-------|------|----|----|----|
|      |       |      | Op    | Vj    | Vk    | Qj    | Qk   | Qj | Qk | Qj |
| 0    | Add1  | Yes  | SUBD  | M(A1) | M(A2) |       |      |    |    |    |
|      | Add2  | Yes  | ADDD  |       |       | M(A2) | Add1 |    |    |    |
|      | Add3  | No   |       |       |       |       |      |    |    |    |
| 8    | Mult1 | Yes  | MULTD | M(A2) | R(F4) |       |      |    |    |    |
|      | Mult2 | Yes  | DIVD  |       | M(A1) | Mult1 |      |    |    |    |



## Register result status:

| Clock | F0 | F2    | F4    | F6 | F8   | F10  | F12   | ... | F30 |
|-------|----|-------|-------|----|------|------|-------|-----|-----|
| 7     | FU | Mult1 | M(A2) |    | Add2 | Add1 | Mult2 |     |     |

- Add1 completing; what is waiting for it?

# Tomasulo Example Cycle 8

## Instruction status:

| Instruction | j   | k   | Exec  |      |        | Write |
|-------------|-----|-----|-------|------|--------|-------|
|             |     |     | Issue | Comp | Result |       |
| LD          | F6  | 34+ | R2    | 1    | 3      | 4     |
| LD          | F2  | 45+ | R3    | 2    | 4      | 5     |
| MULTD       | F0  | F2  | F4    | 3    |        |       |
| SUBD        | F8  | F6  | F2    | 4    | 7      | 8     |
| DIVD        | F10 | F0  | F6    | 5    |        |       |
| ADDD        | F6  | F8  | F2    | 6    |        |       |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

## Reservation Stations:

| Time | Name  | Busy | S1    |       | RS    |       |    |
|------|-------|------|-------|-------|-------|-------|----|
|      |       |      | Op    | Vj    | Vk    | Qj    | Qk |
|      | Add1  | No   |       |       |       |       |    |
| 2    | Add2  | Yes  | ADDD  | (M-M) | M(A2) |       |    |
|      | Add3  | No   |       |       |       |       |    |
| 7    | Mult1 | Yes  | MULTD | M(A2) | R(F4) |       |    |
|      | Mult2 | Yes  | DIVD  |       | M(A1) | Mult1 |    |



## Register result status:

| Clock | F0 | F2    | F4    | F6 | F8   | F10   | F12   | ... | F30 |
|-------|----|-------|-------|----|------|-------|-------|-----|-----|
| 8     | FU | Mult1 | M(A2) |    | Add2 | (M-M) | Mult2 |     |     |

# Tomasulo Example Cycle 9

## Instruction status:

| Instruction | j   | k   | Exec  |      |        | Write |
|-------------|-----|-----|-------|------|--------|-------|
|             |     |     | Issue | Comp | Result |       |
| LD          | F6  | 34+ | R2    | 1    | 3      | 4     |
| LD          | F2  | 45+ | R3    | 2    | 4      | 5     |
| MULTD       | F0  | F2  | F4    | 3    |        |       |
| SUBD        | F8  | F6  | F2    | 4    | 7      | 8     |
| DIVD        | F10 | F0  | F6    | 5    |        |       |
| ADDD        | F6  | F8  | F2    | 6    |        |       |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

## Reservation Stations:

| Time | Name  | Busy | S1    |       | S2    |       | RS |  | RS |  |
|------|-------|------|-------|-------|-------|-------|----|--|----|--|
|      |       |      | Op    | Vj    | Vk    | Qj    | Qk |  |    |  |
|      | Add1  | No   |       |       |       |       |    |  |    |  |
| 1    | Add2  | Yes  | ADDD  | (M-M) | M(A2) |       |    |  |    |  |
|      | Add3  | No   |       |       |       |       |    |  |    |  |
| 6    | Mult1 | Yes  | MULTD | M(A2) | R(F4) |       |    |  |    |  |
|      | Mult2 | Yes  | DIVD  |       | M(A1) | Mult1 |    |  |    |  |



## Register result status:

| Clock | F0 | F2    | F4    | F6 | F8   | F10   | F12   | ... | F30 |
|-------|----|-------|-------|----|------|-------|-------|-----|-----|
| 9     | FU | Mult1 | M(A2) |    | Add2 | (M-M) | Mult2 |     |     |

# Tomasulo Example Cycle 10

## Instruction status:

| Instruction | j   | k   | Exec Write |      |        |
|-------------|-----|-----|------------|------|--------|
|             |     |     | Issue      | Comp | Result |
| LD          | F6  | 34+ | R2         | 1    | 3      |
| LD          | F2  | 45+ | R3         | 2    | 4      |
| MULTD       | F0  | F2  | F4         | 3    |        |
| SUBD        | F8  | F6  | F2         | 4    | 7      |
| DIVD        | F10 | F0  | F6         | 5    |        |
| ADDD        | F6  | F8  | F2         | 6    | 10     |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 | No   |         |

## Reservation Stations:

| Time | Name  | Busy | S1    |       | RS    |       |    |
|------|-------|------|-------|-------|-------|-------|----|
|      |       |      | Op    | Vj    | Vk    | Qj    | Qk |
|      | Add1  | No   |       |       |       |       |    |
| 0    | Add2  | Yes  | ADDD  | (M-M) | M(A2) |       |    |
|      | Add3  | No   |       |       |       |       |    |
| 5    | Mult1 | Yes  | MULTD | M(A2) | R(F4) |       |    |
|      | Mult2 | Yes  | DIVD  |       | M(A1) | Mult1 |    |



## Register result status:

| Clock | F0 | F2    | F4    | F6 | F8   | F10   | F12   | ... | F30 |
|-------|----|-------|-------|----|------|-------|-------|-----|-----|
| 10    | FU | Mult1 | M(A2) |    | Add2 | (M-M) | Mult2 |     |     |

- Add2 completing; what is waiting for it?

# Tomasulo Example Cycle 11

## Instruction status:

| Instruction | j   | k   | Issue | Exec | Write  |
|-------------|-----|-----|-------|------|--------|
|             |     |     |       | Comp | Result |
| LD          | F6  | 34+ | R2    | 1    | 3      |
| LD          | F2  | 45+ | R3    | 2    | 4      |
| MULTD       | F0  | F2  | F4    | 3    |        |
| SUBD        | F8  | F6  | F2    | 4    | 7      |
| DIVD        | F10 | F0  | F6    | 5    |        |
| ADDD        | F6  | F8  | F2    | 6    | 10     |
|             |     |     |       |      | 11     |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

## Reservation Stations:

| Time | Name  | Busy | Op          | S1    | S2    | RS    | RS |
|------|-------|------|-------------|-------|-------|-------|----|
|      |       |      |             | Vj    | Vk    | Qj    | Qk |
|      | Add1  | No   |             |       |       |       |    |
|      | Add2  | No   |             |       |       |       |    |
|      | Add3  | No   |             |       |       |       |    |
| 4    | Mult1 | Yes  | MULTD M(A2) | R(F4) |       |       |    |
|      | Mult2 | Yes  | DIVD        |       | M(A1) | Mult1 |    |



## Register result status:

| Clock | F0 | F2    | F4    | F6 | F8      | F10   | F12   | ... | F30 |
|-------|----|-------|-------|----|---------|-------|-------|-----|-----|
| 11    | FU | Mult1 | M(A2) |    | (M-M+N) | (M-M) | Mult2 |     |     |

- Write result of ADDD here vs. scoreboard?
- All quick instructions complete in this cycle!

# Tomasulo Example Cycle 12

## Instruction status:

| Instruction | j   | k   | Exec  |      |        | Write |
|-------------|-----|-----|-------|------|--------|-------|
|             |     |     | Issue | Comp | Result |       |
| LD          | F6  | 34+ | R2    | 1    | 3      | 4     |
| LD          | F2  | 45+ | R3    | 2    | 4      | 5     |
| MULTD       | F0  | F2  | F4    | 3    |        |       |
| SUBD        | F8  | F6  | F2    | 4    | 7      | 8     |
| DIVD        | F10 | F0  | F6    | 5    |        |       |
| ADDD        | F6  | F8  | F2    | 6    | 10     | 11    |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

## Reservation Stations:

| Time | Name  | Busy | S1    |       | RS    |       |    |
|------|-------|------|-------|-------|-------|-------|----|
|      |       |      | Op    | Vj    | Vk    | Qj    | Qk |
|      | Add1  | No   |       |       |       |       |    |
|      | Add2  | No   |       |       |       |       |    |
|      | Add3  | No   |       |       |       |       |    |
| 3    | Mult1 | Yes  | MULTD | M(A2) | R(F4) |       |    |
|      | Mult2 | Yes  | DIVD  |       | M(A1) | Mult1 |    |



## Register result status:

| Clock | F0 | F2    | F4    | F6 | F8           | F10   | F12 | ... | F30 |
|-------|----|-------|-------|----|--------------|-------|-----|-----|-----|
| 12    | FU | Mult1 | M(A2) |    | (M-M+M)(M-M) | Mult2 |     |     |     |

# Tomasulo Example Cycle 13

*Instruction status:*

| Instruction | j   | k   | Exec  |      |        | Write |         |    |
|-------------|-----|-----|-------|------|--------|-------|---------|----|
|             |     |     | Issue | Comp | Result | Busy  | Address |    |
| LD          | F6  | 34+ | R2    | 1    | 3      | 4     | Load1   | No |
| LD          | F2  | 45+ | R3    | 2    | 4      | 5     | Load2   | No |
| MULTD       | F0  | F2  | F4    | 3    |        |       | Load3   | No |
| SUBD        | F8  | F6  | F2    | 4    | 7      | 8     |         |    |
| DIVD        | F10 | F0  | F6    | 5    |        |       |         |    |
| ADDD        | F6  | F8  | F2    | 6    | 10     | 11    |         |    |

*Reservation Stations:*

| Time | Name  | Busy | S1    |       | S2    |       | RS |       | RS |  |
|------|-------|------|-------|-------|-------|-------|----|-------|----|--|
|      |       |      | Op    | Vj    | Vk    | Qj    | Qk |       |    |  |
|      | Add1  | No   |       |       |       |       |    |       |    |  |
|      | Add2  | No   |       |       |       |       |    |       |    |  |
|      | Add3  | No   |       |       |       |       |    |       |    |  |
| 2    | Mult1 | Yes  | MULTD | M(A2) | R(F4) |       |    | Mult1 |    |  |
|      | Mult2 | Yes  | DIVD  |       | M(A1) | Mult1 |    |       |    |  |



*Register result status:*

| Clock | F0 | F2    | F4    | F6 | F8           | F10   | F12 | ... | F30 |
|-------|----|-------|-------|----|--------------|-------|-----|-----|-----|
| 13    | FU | Mult1 | M(A2) |    | (M-M+M)(M-M) | Mult2 |     |     |     |

# Tomasulo Example Cycle 14

*Instruction status:*

| Instruction | j   | k   | Exec  |      |        | Write |
|-------------|-----|-----|-------|------|--------|-------|
|             |     |     | Issue | Comp | Result |       |
| LD          | F6  | 34+ | R2    | 1    | 3      | 4     |
| LD          | F2  | 45+ | R3    | 2    | 4      | 5     |
| MULTD       | F0  | F2  | F4    | 3    |        |       |
| SUBD        | F8  | F6  | F2    | 4    | 7      | 8     |
| DIVD        | F10 | F0  | F6    | 5    |        |       |
| ADDD        | F6  | F8  | F2    | 6    | 10     | 11    |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

*Reservation Stations:*

| Time | Name  | Busy | S1    |       | RS    |       |    |
|------|-------|------|-------|-------|-------|-------|----|
|      |       |      | Op    | Vj    | Vk    | Qj    | Qk |
|      | Add1  | No   |       |       |       |       |    |
|      | Add2  | No   |       |       |       |       |    |
|      | Add3  | No   |       |       |       |       |    |
| 1    | Mult1 | Yes  | MULTD | M(A2) | R(F4) |       |    |
|      | Mult2 | Yes  | DIVD  |       | M(A1) | Mult1 |    |



*Register result status:*

| Clock | F0 | F2    | F4    | F6 | F8           | F10   | F12 | ... | F30 |
|-------|----|-------|-------|----|--------------|-------|-----|-----|-----|
| 14    | FU | Mult1 | M(A2) |    | (M-M+M)(M-M) | Mult2 |     |     |     |

# Tomasulo Example Cycle 15

*Instruction status:*

| Instruction | j   | k   | Exec  |      |        | Write |
|-------------|-----|-----|-------|------|--------|-------|
|             |     |     | Issue | Comp | Result |       |
| LD          | F6  | 34+ | R2    | 1    | 3      | 4     |
| LD          | F2  | 45+ | R3    | 2    | 4      | 5     |
| MULTD       | F0  | F2  | F4    | 3    | 15     |       |
| SUBD        | F8  | F6  | F2    | 4    | 7      | 8     |
| DIVD        | F10 | F0  | F6    | 5    |        |       |
| ADDD        | F6  | F8  | F2    | 6    | 10     | 11    |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

*Reservation Stations:*

| Time | Name  | Busy | S1    |       | RS    |       |    |
|------|-------|------|-------|-------|-------|-------|----|
|      |       |      | Op    | Vj    | Vk    | Qj    | Qk |
|      | Add1  | No   |       |       |       |       |    |
|      | Add2  | No   |       |       |       |       |    |
|      | Add3  | No   |       |       |       |       |    |
| 0    | Mult1 | Yes  | MULTD | M(A2) | R(F4) |       |    |
|      | Mult2 | Yes  | DIVD  |       | M(A1) | Mult1 |    |



*Register result status:*

| Clock | F0 | F2    | F4    | F6 | F8           | F10   | F12 | ... | F30 |
|-------|----|-------|-------|----|--------------|-------|-----|-----|-----|
| 15    | FU | Mult1 | M(A2) |    | (M-M+M)(M-M) | Mult2 |     |     |     |

# Tomasulo Example Cycle 16

*Instruction status:*

| Instruction | j   | k   | Exec  |      |        | Write |
|-------------|-----|-----|-------|------|--------|-------|
|             |     |     | Issue | Comp | Result |       |
| LD          | F6  | 34+ | R2    | 1    | 3      | 4     |
| LD          | F2  | 45+ | R3    | 2    | 4      | 5     |
| MULTD       | F0  | F2  | F4    | 3    | 15     | 16    |
| SUBD        | F8  | F6  | F2    | 4    | 7      | 8     |
| DIVD        | F10 | F0  | F6    | 5    |        |       |
| ADDD        | F6  | F8  | F2    | 6    | 10     | 11    |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

*Reservation Stations:*

| Time | Name  | Busy | S1   |      | RS    |    |    |
|------|-------|------|------|------|-------|----|----|
|      |       |      | Op   | Vj   | Vk    | Qj | Qk |
|      | Add1  | No   |      |      |       |    |    |
|      | Add2  | No   |      |      |       |    |    |
|      | Add3  | No   |      |      |       |    |    |
|      | Mult1 | No   |      |      |       |    |    |
| 40   | Mult2 | Yes  | DIVD | M*F4 | M(A1) |    |    |



*Register result status:*

| Clock | F0 | F2   | F4    | F6           | F8    | F10 | F12 | ... | F30 |
|-------|----|------|-------|--------------|-------|-----|-----|-----|-----|
| 16    | FU | M*F4 | M(A2) | (M-M+M)(M-M) | Mult2 |     |     |     |     |

**Faster than light computation  
(skip a couple of cycles)**

---

# Tomasulo Example Cycle 55

### *Instruction status:*

| Instruction | <i>j</i> | <i>k</i> | <i>Issue</i> | <i>Comp</i> | <i>Result</i> |
|-------------|----------|----------|--------------|-------------|---------------|
| LD          | F6       | 34+      | R2           | 1           | 3             |
| LD          | F2       | 45+      | R3           | 2           | 4             |
| MULTD       | F0       | F2       | F4           | 3           | 15            |
| SUBD        | F8       | F6       | F2           | 4           | 7             |
| DIVD        | F10      | F0       | F6           | 5           | 8             |
| ADDD        | F6       | F8       | F2           | 6           | 10            |
|             |          |          |              |             | 11            |

## *Reservation Stations:*

| on Stations: |             |             | <i>S1</i> | <i>S2</i> | <i>RS</i> | <i>RS</i> |           |
|--------------|-------------|-------------|-----------|-----------|-----------|-----------|-----------|
| <i>Time</i>  | <i>Name</i> | <i>Busy</i> | <i>Op</i> | <i>Vj</i> | <i>Vk</i> | <i>Qj</i> | <i>Qk</i> |
|              | Add1        | No          |           |           |           |           |           |
|              | Add2        | No          |           |           |           |           |           |
|              | Add3        | No          |           |           |           |           |           |
|              | Mult1       | No          |           |           |           |           |           |
| 1            | Mult2       | Yes         | DIVD      | M*F4      | M(A1)     |           |           |



### *Register result status:*

| Clock     |           | <i>F0</i>   | <i>F2</i>    | <i>F4</i> | <i>F6</i>      | <i>F8</i>    | <i>F10</i>   | <i>F12</i> | ... | <i>F30</i> |
|-----------|-----------|-------------|--------------|-----------|----------------|--------------|--------------|------------|-----|------------|
| <b>55</b> | <i>FU</i> | <b>M*F4</b> | <b>M(A2)</b> |           | <b>(M-M+N)</b> | <b>(M-M)</b> | <b>Mult2</b> |            |     |            |

# Tomasulo Example Cycle 56

## Instruction status:

| Instruction | j   | k   | Issue | Exec | Write  |
|-------------|-----|-----|-------|------|--------|
|             |     |     |       | Comp | Result |
| LD          | F6  | 34+ | R2    | 1    | 3      |
| LD          | F2  | 45+ | R3    | 2    | 4      |
| MULTD       | F0  | F2  | F4    | 3    | 15     |
| SUBD        | F8  | F6  | F2    | 4    | 7      |
| DIVD        | F10 | F0  | F6    | 5    | 56     |
| ADDD        | F6  | F8  | F2    | 6    | 10     |
|             |     |     |       |      | 11     |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

## Reservation Stations:

| Time | Name  | Busy | Op   | S1   | S2    | RS | RS |
|------|-------|------|------|------|-------|----|----|
|      |       |      |      | Vj   | Vk    | Qj | Qk |
|      | Add1  | No   |      |      |       |    |    |
|      | Add2  | No   |      |      |       |    |    |
|      | Add3  | No   |      |      |       |    |    |
|      | Mult1 | No   |      |      |       |    |    |
| 0    | Mult2 | Yes  | DIVD | M*F4 | M(A1) |    |    |



## Register result status:

| Clock | F0 | F2   | F4    | F6      | F8    | F10   | F12 | ... | F30 |
|-------|----|------|-------|---------|-------|-------|-----|-----|-----|
| 56    | FU | M*F4 | M(A2) | (M-M+M) | (M-M) | Mult2 |     |     |     |

- Mult2 is completing; what is waiting for it?

# Tomasulo Example Cycle 57

## Instruction status:

| Instruction | j   | k   | Exec  |      |        | Write |
|-------------|-----|-----|-------|------|--------|-------|
|             |     |     | Issue | Comp | Result |       |
| LD          | F6  | 34+ | R2    | 1    | 3      | 4     |
| LD          | F2  | 45+ | R3    | 2    | 4      | 5     |
| MULTD       | F0  | F2  | F4    | 3    | 15     | 16    |
| SUBD        | F8  | F6  | F2    | 4    | 7      | 8     |
| DIVD        | F10 | F0  | F6    | 5    | 56     | 57    |
| ADDD        | F6  | F8  | F2    | 6    | 10     | 11    |

| Busy  | Address |
|-------|---------|
| Load1 | No      |
| Load2 | No      |
| Load3 | No      |

## Reservation Stations:

| Time | Name  | Busy | Op   | $V_j$ | $V_k$ | $Q_j$ | $Q_k$ |
|------|-------|------|------|-------|-------|-------|-------|
|      | Add1  | No   |      |       |       |       |       |
|      | Add2  | No   |      |       |       |       |       |
|      | Add3  | No   |      |       |       |       |       |
|      | Mult1 | No   |      |       |       |       |       |
|      | Mult2 | Yes  | DIVD | M*F4  | M(A1) |       |       |



## Register result status:

| Clock | F0 | F2   | F4    | F6 | F8           | F10    | F12 | ... | F30 |
|-------|----|------|-------|----|--------------|--------|-----|-----|-----|
| 56    | FU | M*F4 | M(A2) |    | (M-M+M)(M-M) | Result |     |     |     |

- Once again: In-order issue, out-of-order execution and completion.

# Compare to Scoreboard Cycle 62

---

*Instruction status:*

| Instruction | j   | k   | Issue | Read | Exec | Write  | Issue | Comp | Result | Exec | Write  |
|-------------|-----|-----|-------|------|------|--------|-------|------|--------|------|--------|
|             |     |     |       | Oper | Comp | Result |       |      |        | Comp | Result |
| LD          | F6  | 34+ | R2    | 1    | 2    | 3      | 4     |      |        | 3    | 4      |
| LD          | F2  | 45+ | R3    | 5    | 6    | 7      | 8     |      |        | 4    | 5      |
| MULTD       | F0  | F2  | F4    | 6    | 9    | 19     | 20    | 3    | 15     | 16   |        |
| SUBD        | F8  | F6  | F2    | 7    | 9    | 11     | 12    | 4    | 7      | 8    |        |
| DIVD        | F10 | F0  | F6    | 8    | 21   | 61     | 62    | 5    | 56     | 57   |        |
| ADDD        | F6  | F8  | F2    | 13   | 14   | 16     | 22    | 6    | 10     | 11   |        |

- Why take longer on scoreboard/6600?
  - Structural Hazards
  - Lack of forwarding

# **Tomasulo v. Scoreboard (IBM 360/91 v. CDC 6600)**

---

| <b>Pipelined Functional Units</b>    | <b>Multiple Functional Units</b>     |
|--------------------------------------|--------------------------------------|
| <b>(6 load, 3 store, 3 +, 2 x/÷)</b> | <b>(1 load/store, 1 +, 2 x, 1 ÷)</b> |

|                                                       |                                         |
|-------------------------------------------------------|-----------------------------------------|
| <b>window size: <math>\leq 14</math> instructions</b> | <b><math>\leq 5</math> instructions</b> |
|-------------------------------------------------------|-----------------------------------------|

|                                      |                             |
|--------------------------------------|-----------------------------|
| <b>No issue on structural hazard</b> | <b>same</b>                 |
| <b>WAR: renaming avoids</b>          | <b>stall completion</b>     |
| <b>WAW: renaming avoids</b>          | <b>stall issue</b>          |
| <b>Broadcast results from FU</b>     | <b>Write/read registers</b> |
| <b>Control: reservation stations</b> | <b>central scoreboard</b>   |

---

# **SUMMARY**

# Hardware Solution

---

- **Dynamic Scheduling**
  - Out-of-order execution and completion
- **Data Hazard via Register Renaming**
  - Dynamic RAW hazard detection and scheduling in data-flow fashion
  - Register renaming for WRA and WRA hazard (name conflict)
- **Implementations**
  - Scoreboard (CDC 6600 1963)
    - » Centralized register renaming
  - Tomasulo's Approach (IBM 360/91, 1966)
    - » Distributed control and renaming via reservation station, load/store buffer and common data bus (data+source)

# Register Renaming Summary

---

- **Purpose of Renaming: removing “Anti-dependencies”**
  - Get rid of WAR and WAW hazards, since these are not “real” dependencies
- **Implicit Renaming: i.e. Tomasulo**
  - Registers changed into values or response tags
  - We call this “implicit” because space in register file may or may not be used by results!
- **Explicit Renaming: more physical registers than needed by ISA.**
  - Rename table: tracks current association between architectural registers and physical registers
  - Uses a translation table to perform compiler-like transformation on the fly
- **With Explicit Renaming:**
  - All registers concentrated in single register file
  - Can utilize bypass network that looks more like 5-stage pipeline
  - Introduces a register-allocation problem
    - » Need to handle branch misprediction and precise exceptions differently, but ultimately makes things simpler

# **Class Lecture Stops Here**

---

---

# **TOMASULA LOOP EXAMPLE**

# Tomasulo Loop Example

---

|       |       |    |      |    |
|-------|-------|----|------|----|
| Loop: | LD    | F0 | 0    | R1 |
|       | MULTD | F4 | F0   | F2 |
|       | SD    | F4 | 0    | R1 |
|       | SUBI  | R1 | R1   | #8 |
|       | BNEZ  | R1 | Loop |    |

- Assume Multiply takes 4 clocks
- Assume first load takes 8 clocks (cache miss), second load takes 1 clock (hit)
- To be clear, will show clocks for SUBI, BNEZ
- Reality: integer instructions ahead

# Loop Example

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |            | Busy   | Addr | Fu |
|------|-------------|----|----|------------|------------|--------|------|----|
|      |             |    |    | Issue      | CompResult |        |      |    |
| 1    | LD          | F0 | 0  | R1         |            | Load1  | No   |    |
| 1    | MULTD       | F4 | F0 | F2         |            | Load2  | No   |    |
| 1    | SD          | F4 | 0  | R1         |            | Load3  | No   |    |
| 2    | LD          | F0 | 0  | R1         |            | Store1 | No   |    |
| 2    | MULTD       | F4 | F0 | F2         |            | Store2 | No   |    |
| 2    | SD          | F4 | 0  | R1         |            | Store3 | No   |    |

*Reservation Stations:*

| Time | Name  | Busy | Op | S1 S2 RS |    |    | Code: |                |
|------|-------|------|----|----------|----|----|-------|----------------|
|      |       |      |    | Vj       | Vk | Qj | Qk    |                |
|      | Add1  | No   |    |          |    |    |       | LD F0 0 R1     |
|      | Add2  | No   |    |          |    |    |       | MULTD F4 F0 F2 |
|      | Add3  | No   |    |          |    |    |       | SD F4 0 R1     |
|      | Mult1 | No   |    |          |    |    |       | SUBI R1 R1 #8  |
|      | Mult2 | No   |    |          |    |    |       | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|----|----|----|----|----|-----|-----|-----|-----|
| 0     | 80 | Fu |    |    |    |    |     |     |     |     |

# Loop Example Cycle 1

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |            | Busy   | Addr | Fu |
|------|-------------|----|----|------------|------------|--------|------|----|
|      |             |    |    | Issue      | CompResult |        |      |    |
| 1    | LD          | F0 | 0  | R1         | 1          | Load1  | Yes  | 80 |
| 1    | MULTD       | F4 | F0 | F2         |            | Load2  | No   |    |
| 1    | SD          | F4 | 0  | R1         |            | Load3  | No   |    |
| 2    | LD          | F0 | 0  | R1         |            | Store1 | No   |    |
| 2    | MULTD       | F4 | F0 | F2         |            | Store2 | No   |    |
| 2    | SD          | F4 | 0  | R1         |            | Store3 | No   |    |

*Reservation Stations:*

| Time | Name  | Busy | Op | RS |    |    | Code:          |
|------|-------|------|----|----|----|----|----------------|
|      |       |      |    | S1 | S2 | RS |                |
|      | Add1  | No   |    |    |    |    | LD F0 0 R1     |
|      | Add2  | No   |    |    |    |    | MULTD F4 F0 F2 |
|      | Add3  | No   |    |    |    |    | SD F4 0 R1     |
|      | Mult1 | No   |    |    |    |    | SUBI R1 R1 #8  |
|      | Mult2 | No   |    |    |    |    | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|----|----|-----|-----|-----|-----|
| 1     | 80 | Fu | Load1 |    |    |    |     |     |     |     |

# Loop Example Cycle 2

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |            | Busy   | Addr | Fu |
|------|-------------|----|----|------------|------------|--------|------|----|
|      |             |    |    | Issue      | CompResult |        |      |    |
| 1    | LD          | F0 | 0  | R1         | 1          | Load1  | Yes  | 80 |
| 1    | MULTD       | F4 | F0 | F2         | 2          | Load2  | No   |    |
| 1    | SD          | F4 | 0  | R1         |            | Load3  | No   |    |
| 2    | LD          | F0 | 0  | R1         |            | Store1 | No   |    |
| 2    | MULTD       | F4 | F0 | F2         |            | Store2 | No   |    |
| 2    | SD          | F4 | 0  | R1         |            | Store3 | No   |    |

*Reservation Stations:*

| Time | Name  | Busy | Op    | S1 S2 RS |       |       | Code:          |
|------|-------|------|-------|----------|-------|-------|----------------|
|      |       |      |       | Vj       | Vk    | Qj    |                |
|      | Add1  | No   |       |          |       |       | LD F0 0 R1     |
|      | Add2  | No   |       |          |       |       | MULTD F4 F0 F2 |
|      | Add3  | No   |       |          |       |       | SD F4 0 R1     |
|      | Mult1 | Yes  | Multd |          | R(F4) | Load1 | SUBI R1 R1 #8  |
|      | Mult2 | No   |       |          |       |       | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 2     | 80 | Fu | Load1 |    | Mult1 |    |     |     |     |     |

# Loop Example Cycle 3

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |            | Busy   | Addr | Fu    |
|------|-------------|----|----|------------|------------|--------|------|-------|
|      |             |    |    | Issue      | CompResult |        |      |       |
| 1    | LD          | F0 | 0  | R1         | 1          | Load1  | Yes  | 80    |
| 1    | MULTD       | F4 | F0 | F2         | 2          | Load2  | No   |       |
| 1    | SD          | F4 | 0  | R1         | 3          | Load3  | No   |       |
| 2    | LD          | F0 | 0  | R1         |            | Store1 | Yes  | 80    |
| 2    | MULTD       | F4 | F0 | F2         |            | Store2 | No   | Mult1 |
| 2    | SD          | F4 | 0  | R1         |            | Store3 | No   |       |

*Reservation Stations:*

| Time | Name  | Busy | Op    | S1 |    | S2    |       | RS | Code:          |
|------|-------|------|-------|----|----|-------|-------|----|----------------|
|      |       |      |       | Vj | Vk | Qj    | Qk    |    |                |
|      | Add1  | No   |       |    |    |       |       |    | LD F0 0 R1     |
|      | Add2  | No   |       |    |    |       |       |    | MULTD F4 F0 F2 |
|      | Add3  | No   |       |    |    |       |       |    | SD F4 0 R1     |
|      | Mult1 | Yes  | Multd |    |    | R(F4) | Load1 |    | SUBI R1 R1 #8  |
|      | Mult2 | No   |       |    |    |       |       |    | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | Fu | F0    | F2    | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|-------|----|----|----|-----|-----|-----|-----|
|       |    |    | Load1 | Mult1 |    |    |    |     |     |     |     |
| 3     | 80 |    |       |       |    |    |    |     |     |     |     |

- Implicit renaming sets up “DataFlow” graph

# Loop Example Cycle 4

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy   | Addr | Fu    |
|------|-------------|----|----|------------|------|--------|--------|------|-------|
|      |             |    |    | Issue      | Comp | Result |        |      |       |
| 1    | LD          | F0 | 0  | R1         | 1    |        | Load1  | Yes  | 80    |
| 1    | MULTD       | F4 | F0 | F2         | 2    |        | Load2  | No   |       |
| 1    | SD          | F4 | 0  | R1         | 3    |        | Load3  | No   |       |
| 2    | LD          | F0 | 0  | R1         |      |        | Store1 | Yes  | 80    |
| 2    | MULTD       | F4 | F0 | F2         |      |        | Store2 | No   | Mult1 |
| 2    | SD          | F4 | 0  | R1         |      |        | Store3 | No   |       |

*Reservation Stations:*

| Time | Name  | Busy | Op    | S1 |    | S2    |       | RS | Code:          |
|------|-------|------|-------|----|----|-------|-------|----|----------------|
|      |       |      |       | Vj | Vk | Qj    | Qk    |    |                |
|      | Add1  | No   |       |    |    |       |       |    | LD F0 0 R1     |
|      | Add2  | No   |       |    |    |       |       |    | MULTD F4 F0 F2 |
|      | Add3  | No   |       |    |    |       |       |    | SD F4 0 R1     |
|      | Mult1 | Yes  | Multd |    |    | R(F4) | Load1 |    | SUBI R1 R1 #8  |
|      | Mult2 | No   |       |    |    |       |       |    | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 |     | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|-----|
|       |    |    |       |    |       |    | F10 | F12 |     |     |     |
| 4     | 80 | Fu | Load1 |    | Mult1 |    |     |     |     |     |     |

- **Dispatching SUBI Instruction**

# Loop Example Cycle 5

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |            | Busy   | Addr | Fu    |
|------|-------------|----|----|------------|------------|--------|------|-------|
|      |             |    |    | Issue      | CompResult |        |      |       |
| 1    | LD          | F0 | 0  | R1         | 1          | Load1  | Yes  | 80    |
| 1    | MULTD       | F4 | F0 | F2         | 2          | Load2  | No   |       |
| 1    | SD          | F4 | 0  | R1         | 3          | Load3  | No   |       |
| 2    | LD          | F0 | 0  | R1         |            | Store1 | Yes  | 80    |
| 2    | MULTD       | F4 | F0 | F2         |            | Store2 | No   | Mult1 |
| 2    | SD          | F4 | 0  | R1         |            | Store3 | No   |       |

*Reservation Stations:*

| Time | Name  | Busy | Op    | RS    |       | Code:          |
|------|-------|------|-------|-------|-------|----------------|
|      |       |      |       | S1    | S2    |                |
|      | Add1  | No   |       |       |       | LD F0 0 R1     |
|      | Add2  | No   |       |       |       | MULTD F4 F0 F2 |
|      | Add3  | No   |       |       |       | SD F4 0 R1     |
|      | Mult1 | Yes  | Multd | R(F4) | Load1 | SUBI R1 R1 #8  |
|      | Mult2 | No   |       |       |       | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 5     | 72 | Fu | Load1 |    | Mult1 |    |     |     |     |     |

- And, BNEZ instruction

# Loop Example Cycle 6

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |            | Busy   | Addr | Fu    |
|------|-------------|----|----|------------|------------|--------|------|-------|
|      |             |    |    | Issue      | CompResult |        |      |       |
| 1    | LD          | F0 | 0  | R1         | 1          | Load1  | Yes  | 80    |
| 1    | MULTD       | F4 | F0 | F2         | 2          | Load2  | Yes  | 72    |
| 1    | SD          | F4 | 0  | R1         | 3          | Load3  | No   |       |
| 2    | LD          | F0 | 0  | R1         | 6          | Store1 | Yes  | 80    |
| 2    | MULTD       | F4 | F0 | F2         |            | Store2 | No   | Mult1 |
| 2    | SD          | F4 | 0  | R1         |            | Store3 | No   |       |

*Reservation Stations:*

| Time | Name  | Busy | Op    | RS |       |       | Code:          |
|------|-------|------|-------|----|-------|-------|----------------|
|      |       |      |       | S1 | S2    | RS    |                |
|      | Add1  | No   |       |    |       |       | LD F0 0 R1     |
|      | Add2  | No   |       |    |       |       | MULTD F4 F0 F2 |
|      | Add3  | No   |       |    |       |       | SD F4 0 R1     |
|      | Mult1 | Yes  | Multd |    | R(F4) | Load1 | SUBI R1 R1 #8  |
|      | Mult2 | No   |       |    |       |       | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 6     | 72 | Fu | Load2 |    | Mult1 |    |     |     |     |     |

- Notice that F0 never sees Load from location 80

# Loop Example Cycle 7

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |            | Busy   | Addr | Fu    |
|------|-------------|----|----|------------|------------|--------|------|-------|
|      |             |    |    | Issue      | CompResult |        |      |       |
| 1    | LD          | F0 | 0  | R1         | 1          | Load1  | Yes  | 80    |
| 1    | MULTD       | F4 | F0 | F2         | 2          | Load2  | Yes  | 72    |
| 1    | SD          | F4 | 0  | R1         | 3          | Load3  | No   |       |
| 2    | LD          | F0 | 0  | R1         | 6          | Store1 | Yes  | 80    |
| 2    | MULTD       | F4 | F0 | F2         | 7          | Store2 | No   | Mult1 |
| 2    | SD          | F4 | 0  | R1         |            | Store3 | No   |       |

*Reservation Stations:*

| Time | Name  | Busy | Op    | RS |       |       | Code:          |
|------|-------|------|-------|----|-------|-------|----------------|
|      |       |      |       | S1 | S2    | RS    |                |
|      | Add1  | No   |       |    |       |       | LD F0 0 R1     |
|      | Add2  | No   |       |    |       |       | MULTD F4 F0 F2 |
|      | Add3  | No   |       |    |       |       | SD F4 0 R1     |
|      | Mult1 | Yes  | Multd |    | R(F2) | Load1 | SUBI R1 R1 #8  |
|      | Mult2 | Yes  | Multd |    | R(F2) | Load2 | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 7     | 72 | Fu | Load2 |    | Mult2 |    |     |     |     |     |

- **Register file completely detached from computation**
- **First and Second iteration completely overlapped**

# Loop Example Cycle 8

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |            |
|------|-------------|----|----|------------|------------|
|      |             |    |    | Issue      | CompResult |
| 1    | LD          | F0 | 0  | R1         | 1          |
| 1    | MULTD       | F4 | F0 | F2         | 2          |
| 1    | SD          | F4 | 0  | R1         | 3          |
| 2    | LD          | F0 | 0  | R1         | 6          |
| 2    | MULTD       | F4 | F0 | F2         | 7          |
| 2    | SD          | F4 | 0  | R1         | 8          |

|        | Busy | Addr | Fu    |
|--------|------|------|-------|
| Load1  | Yes  | 80   |       |
| Load2  | Yes  | 72   |       |
| Load3  | No   |      |       |
| Store1 | Yes  | 80   | Mult1 |
| Store2 | Yes  | 72   | Mult2 |
| Store3 | No   |      |       |

*Reservation Stations:*

| Time | Name  | Busy | Op    | RS |       |       |
|------|-------|------|-------|----|-------|-------|
|      |       |      |       | S1 | S2    | RS    |
|      | Add1  | No   |       |    |       |       |
|      | Add2  | No   |       |    |       |       |
|      | Add3  | No   |       |    |       |       |
|      | Mult1 | Yes  | Multd |    | R(F2) | Load1 |
|      | Mult2 | Yes  | Multd |    | R(F2) | Load2 |

*Code:*

|       |    |      |    |
|-------|----|------|----|
| LD    | F0 | 0    | R1 |
| MULTD | F4 | F0   | F2 |
| SD    | F4 | 0    | R1 |
| SUBI  | R1 | R1   | #8 |
| BNEZ  | R1 | Loop |    |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 8     | 72 | Fu | Load2 |    | Mult2 |    |     |     |     |     |

# Loop Example Cycle 9

*Instruction status:*

| ITER | Instruction | j  | k  | Exec  |            | Busy   | Addr   | Fu    |
|------|-------------|----|----|-------|------------|--------|--------|-------|
|      |             |    |    | Issue | CompResult |        |        |       |
| 1    | LD          | F0 | 0  | R1    | 1 9        | Load1  | Yes 80 |       |
| 1    | MULTD       | F4 | F0 | F2    | 2          | Load2  | Yes 72 |       |
| 1    | SD          | F4 | 0  | R1    | 3          | Load3  | No     |       |
| 2    | LD          | F0 | 0  | R1    | 6          | Store1 | Yes 80 | Mult1 |
| 2    | MULTD       | F4 | F0 | F2    | 7          | Store2 | Yes 72 | Mult2 |
| 2    | SD          | F4 | 0  | R1    | 8          | Store3 | No     |       |

*Reservation Stations:*

| Time | Name  | Busy | Op    | RS |       |       | Code:          |
|------|-------|------|-------|----|-------|-------|----------------|
|      |       |      |       | S1 | S2    | RS    |                |
|      | Add1  | No   |       |    |       |       | LD F0 0 R1     |
|      | Add2  | No   |       |    |       |       | MULTD F4 F0 F2 |
|      | Add3  | No   |       |    |       |       | SD F4 0 R1     |
|      | Mult1 | Yes  | Multd |    | R(F2) | Load1 | SUBI R1 R1 #8  |
|      | Mult2 | Yes  | Multd |    | R(F2) | Load2 | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 9     | 72 | Fu | Load2 |    | Mult2 |    |     |     |     |     |

- **Load1 completing: who is waiting?**
- **Note: Dispatching SUBI**

# Loop Example Cycle 10

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu           |
|------|-------------|----|----|------------|------|--------|------|--------|--------------|
|      |             |    |    | Issue      | Comp | Result |      |        |              |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No           |
| 1    | MULTD       | F4 | F0 | F2         | 2    |        |      | Load2  | Yes 72       |
| 1    | SD          | F4 | 0  | R1         | 3    |        |      | Load3  | No           |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     |      | Store1 | Yes 80 Mult1 |
| 2    | MULTD       | F4 | F0 | F2         | 7    |        |      | Store2 | Yes 72 Mult2 |
| 2    | SD          | F4 | 0  | R1         | 8    |        |      | Store3 | No           |

*Reservation Stations:*

| Time | Name  | Busy | Op                | RS    |       |    | Code:          |
|------|-------|------|-------------------|-------|-------|----|----------------|
|      |       |      |                   | S1    | S2    | RS |                |
|      | Add1  | No   |                   |       |       |    | LD F0 0 R1     |
|      | Add2  | No   |                   |       |       |    | MULTD F4 F0 F2 |
|      | Add3  | No   |                   |       |       |    | SD F4 0 R1     |
| 4    | Mult1 | Yes  | Multd M[80] R(F2) |       |       |    | SUBI R1 R1 #8  |
|      | Mult2 | Yes  | Multd             | R(F2) | Load2 |    | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 10    | 64 | Fu | Load2 |    | Mult2 |    |     |     |     |     |

- **Load2 completing: who is waiting?**
- **Note: Dispatching BNEZ**

# Loop Example Cycle 11

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu           |
|------|-------------|----|----|------------|------|--------|------|--------|--------------|
|      |             |    |    | Issue      | Comp | Result |      |        |              |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No           |
| 1    | MULTD       | F4 | F0 | F2         | 2    |        |      | Load2  | No           |
| 1    | SD          | F4 | 0  | R1         | 3    |        |      | Load3  | Yes 64       |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     | 11   | Store1 | Yes 80 Mult1 |
| 2    | MULTD       | F4 | F0 | F2         | 7    |        |      | Store2 | Yes 72 Mult2 |
| 2    | SD          | F4 | 0  | R1         | 8    |        |      | Store3 | No           |

*Reservation Stations:*

| Time | Name  | Busy | Op                | RS |    |    | Code:          |
|------|-------|------|-------------------|----|----|----|----------------|
|      |       |      |                   | S1 | S2 | RS |                |
|      | Add1  | No   |                   |    |    |    | LD F0 0 R1     |
|      | Add2  | No   |                   |    |    |    | MULTD F4 F0 F2 |
|      | Add3  | No   |                   |    |    |    | SD F4 0 R1     |
| 3    | Mult1 | Yes  | Multd M[80] R(F2) |    |    |    | SUBI R1 R1 #8  |
| 4    | Mult2 | Yes  | Multd M[72] R(F2) |    |    |    | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 11    | 64 | Fu | Load3 |    | Mult2 |    |     |     |     |     |

- **Next load in sequence**

# Loop Example Cycle 12

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu           |
|------|-------------|----|----|------------|------|--------|------|--------|--------------|
|      |             |    |    | Issue      | Comp | Result |      |        |              |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No           |
| 1    | MULTD       | F4 | F0 | F2         | 2    |        |      | Load2  | No           |
| 1    | SD          | F4 | 0  | R1         | 3    |        |      | Load3  | Yes 64       |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     | 11   | Store1 | Yes 80 Mult1 |
| 2    | MULTD       | F4 | F0 | F2         | 7    |        |      | Store2 | Yes 72 Mult2 |
| 2    | SD          | F4 | 0  | R1         | 8    |        |      | Store3 | No           |

*Reservation Stations:*

| Time | Name  | Busy | Op                | RS |    |    | Code:          |
|------|-------|------|-------------------|----|----|----|----------------|
|      |       |      |                   | S1 | S2 | RS |                |
|      | Add1  | No   |                   |    |    |    | LD F0 0 R1     |
|      | Add2  | No   |                   |    |    |    | MULTD F4 F0 F2 |
|      | Add3  | No   |                   |    |    |    | SD F4 0 R1     |
| 2    | Mult1 | Yes  | Multd M[80] R(F2) |    |    |    | SUBI R1 R1 #8  |
| 3    | Mult2 | Yes  | Multd M[72] R(F2) |    |    |    | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 12    | 64 | Fu | Load3 |    | Mult2 |    |     |     |     |     |

- Why not issue third multiply?

# Loop Example Cycle 13

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu           |
|------|-------------|----|----|------------|------|--------|------|--------|--------------|
|      |             |    |    | Issue      | Comp | Result |      |        |              |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No           |
| 1    | MULTD       | F4 | F0 | F2         | 2    |        |      | Load2  | No           |
| 1    | SD          | F4 | 0  | R1         | 3    |        |      | Load3  | Yes 64       |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     | 11   | Store1 | Yes 80 Mult1 |
| 2    | MULTD       | F4 | F0 | F2         | 7    |        |      | Store2 | Yes 72 Mult2 |
| 2    | SD          | F4 | 0  | R1         | 8    |        |      | Store3 | No           |

*Reservation Stations:*

| Time | Name  | Busy | RS    |       |       | Code:          |
|------|-------|------|-------|-------|-------|----------------|
|      |       |      | Op    | Vj    | Vk    |                |
|      | Add1  | No   |       |       |       | LD F0 0 R1     |
|      | Add2  | No   |       |       |       | MULTD F4 F0 F2 |
|      | Add3  | No   |       |       |       | SD F4 0 R1     |
| 1    | Mult1 | Yes  | Multd | M[80] | R(F2) | SUBI R1 R1 #8  |
| 2    | Mult2 | Yes  | Multd | M[72] | R(F2) | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 13    | 64 | Fu | Load3 |    | Mult2 |    |     |     |     |     |

# Loop Example Cycle 14

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu           |
|------|-------------|----|----|------------|------|--------|------|--------|--------------|
|      |             |    |    | Issue      | Comp | Result |      |        |              |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No           |
| 1    | MULTD       | F4 | F0 | F2         | 2    | 14     |      | Load2  | No           |
| 1    | SD          | F4 | 0  | R1         | 3    |        |      | Load3  | Yes 64       |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     | 11   | Store1 | Yes 80 Mult1 |
| 2    | MULTD       | F4 | F0 | F2         | 7    |        |      | Store2 | Yes 72 Mult2 |
| 2    | SD          | F4 | 0  | R1         | 8    |        |      | Store3 | No           |

*Reservation Stations:*

| Time | Name  | Busy | Op                | RS |    |    | Code:          |
|------|-------|------|-------------------|----|----|----|----------------|
|      |       |      |                   | S1 | S2 | RS |                |
|      | Add1  | No   |                   |    |    |    | LD F0 0 R1     |
|      | Add2  | No   |                   |    |    |    | MULTD F4 F0 F2 |
|      | Add3  | No   |                   |    |    |    | SD F4 0 R1     |
| 0    | Mult1 | Yes  | Multd M[80] R(F2) |    |    |    | SUBI R1 R1 #8  |
| 1    | Mult2 | Yes  | Multd M[72] R(F2) |    |    |    | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 14    | 64 | Fu | Load3 |    | Mult2 |    |     |     |     |     |

- **Mult1 completing. Who is waiting?**

# Loop Example Cycle 15

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu             |
|------|-------------|----|----|------------|------|--------|------|--------|----------------|
|      |             |    |    | Issue      | Comp | Result |      |        |                |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No             |
| 1    | MULTD       | F4 | F0 | F2         | 2    | 14     | 15   | Load2  | No             |
| 1    | SD          | F4 | 0  | R1         | 3    |        |      | Load3  | Yes 64         |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     | 11   | Store1 | Yes 80 [80]*R2 |
| 2    | MULTD       | F4 | F0 | F2         | 7    | 15     |      | Store2 | Yes 72 Mult2   |
| 2    | SD          | F4 | 0  | R1         | 8    |        |      | Store3 | No             |

*Reservation Stations:*

| Time | Name  | Busy | Op    | RS    |       |    | Code:          |
|------|-------|------|-------|-------|-------|----|----------------|
|      |       |      |       | S1    | S2    | RS |                |
|      | Add1  | No   |       |       |       |    | LD F0 0 R1     |
|      | Add2  | No   |       |       |       |    | MULTD F4 F0 F2 |
|      | Add3  | No   |       |       |       |    | SD F4 0 R1     |
|      | Mult1 | No   |       |       |       |    | SUBI R1 R1 #8  |
| 0    | Mult2 | Yes  | Multd | M[72] | R(F2) |    | BNEZ R1 Loop   |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 15    | 64 | Fu | Load3 |    | Mult2 |    |     |     |     |     |

- **Mult2 completing. Who is waiting?**

# Loop Example Cycle 16

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu             |
|------|-------------|----|----|------------|------|--------|------|--------|----------------|
|      |             |    |    | Issue      | Comp | Result |      |        |                |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No             |
| 1    | MULTD       | F4 | F0 | F2         | 2    | 14     | 15   | Load2  | No             |
| 1    | SD          | F4 | 0  | R1         | 3    |        |      | Load3  | Yes 64         |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     | 11   | Store1 | Yes 80 [80]*R2 |
| 2    | MULTD       | F4 | F0 | F2         | 7    | 15     | 16   | Store2 | Yes 72 [72]*R2 |
| 2    | SD          | F4 | 0  | R1         | 8    |        |      | Store3 | No             |

*Reservation Stations:*

| Time | Name  | Busy | SI    |    |       | RS    | Code: |          |
|------|-------|------|-------|----|-------|-------|-------|----------|
|      |       |      | Op    | Vj | Vk    | Qj    | Qk    |          |
|      | Add1  | No   |       |    |       |       | LD    | F0 0 R1  |
|      | Add2  | No   |       |    |       |       | MULTD | F4 F0 F2 |
|      | Add3  | No   |       |    |       |       | SD    | F4 0 R1  |
|      | Mult1 | Yes  | Multd |    | R(F2) | Load3 | SUBI  | R1 R1 #8 |
|      | Mult2 | No   |       |    |       |       | BNEZ  | R1 Loop  |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 16    | 64 | Fu | Load3 |    | Mult1 |    |     |     |     |     |

# Loop Example Cycle 17

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu             |
|------|-------------|----|----|------------|------|--------|------|--------|----------------|
|      |             |    |    | Issue      | Comp | Result |      |        |                |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No             |
| 1    | MULTD       | F4 | F0 | F2         | 2    | 14     | 15   | Load2  | No             |
| 1    | SD          | F4 | 0  | R1         | 3    |        |      | Load3  | Yes 64         |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     | 11   | Store1 | Yes 80 [80]*R2 |
| 2    | MULTD       | F4 | F0 | F2         | 7    | 15     | 16   | Store2 | Yes 72 [72]*R2 |
| 2    | SD          | F4 | 0  | R1         | 8    |        |      | Store3 | Yes 64 Mult1   |

*Reservation Stations:*

| Time | Name  | Busy | S1 S2 RS |    |       | Code:          |               |
|------|-------|------|----------|----|-------|----------------|---------------|
|      |       |      | Op       | Vj | Vk    |                |               |
|      | Add1  | No   |          |    |       | LD F0 0 R1     |               |
|      | Add2  | No   |          |    |       | MULTD F4 F0 F2 |               |
|      | Add3  | No   |          |    |       | SD F4 0 R1     |               |
|      | Mult1 | Yes  | Multd    |    | R(F2) | Load3          | SUBI R1 R1 #8 |
|      | Mult2 | No   |          |    |       |                | BNEZ R1 Loop  |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 17    | 64 | Fu | Load3 |    | Mult1 |    |     |     |     |     |

# Loop Example Cycle 18

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu             |
|------|-------------|----|----|------------|------|--------|------|--------|----------------|
|      |             |    |    | Issue      | Comp | Result |      |        |                |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No             |
| 1    | MULTD       | F4 | F0 | F2         | 2    | 14     | 15   | Load2  | No             |
| 1    | SD          | F4 | 0  | R1         | 3    | 18     |      | Load3  | Yes 64         |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     | 11   | Store1 | Yes 80 [80]*R2 |
| 2    | MULTD       | F4 | F0 | F2         | 7    | 15     | 16   | Store2 | Yes 72 [72]*R2 |
| 2    | SD          | F4 | 0  | R1         | 8    |        |      | Store3 | Yes 64 Mult1   |

*Reservation Stations:*

| Time | Name  | Busy | S1 S2 RS |    |       | Code:          |               |
|------|-------|------|----------|----|-------|----------------|---------------|
|      |       |      | Op       | Vj | Vk    |                |               |
|      | Add1  | No   |          |    |       | LD F0 0 R1     |               |
|      | Add2  | No   |          |    |       | MULTD F4 F0 F2 |               |
|      | Add3  | No   |          |    |       | SD F4 0 R1     |               |
|      | Mult1 | Yes  | Multd    |    | R(F2) | Load3          | SUBI R1 R1 #8 |
|      | Mult2 | No   |          |    |       |                | BNEZ R1 Loop  |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 18    | 64 | Fu | Load3 |    | Mult1 |    |     |     |     |     |

# Loop Example Cycle 19

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu             |
|------|-------------|----|----|------------|------|--------|------|--------|----------------|
|      |             |    |    | Issue      | Comp | Result |      |        |                |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No             |
| 1    | MULTD       | F4 | F0 | F2         | 2    | 14     | 15   | Load2  | No             |
| 1    | SD          | F4 | 0  | R1         | 3    | 18     | 19   | Load3  | Yes 64         |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     | 11   | Store1 | No             |
| 2    | MULTD       | F4 | F0 | F2         | 7    | 15     | 16   | Store2 | Yes 72 [72]*R2 |
| 2    | SD          | F4 | 0  | R1         | 8    | 19     |      | Store3 | Yes 64 Mult1   |

*Reservation Stations:*

| Time | Name  | Busy | RS    |    |       | Code:          |               |
|------|-------|------|-------|----|-------|----------------|---------------|
|      |       |      | Op    | Vj | Vk    |                |               |
|      | Add1  | No   |       |    |       | LD F0 0 R1     |               |
|      | Add2  | No   |       |    |       | MULTD F4 F0 F2 |               |
|      | Add3  | No   |       |    |       | SD F4 0 R1     |               |
|      | Mult1 | Yes  | Multd |    | R(F2) | Load3          | SUBI R1 R1 #8 |
|      | Mult2 | No   |       |    |       |                | BNEZ R1 Loop  |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 19    | 64 | Fu | Load3 |    | Mult1 |    |     |     |     |     |

# Loop Example Cycle 20

*Instruction status:*

| ITER | Instruction | j  | k  | Exec Write |      |        | Busy | Addr   | Fu           |
|------|-------------|----|----|------------|------|--------|------|--------|--------------|
|      |             |    |    | Issue      | Comp | Result |      |        |              |
| 1    | LD          | F0 | 0  | R1         | 1    | 9      | 10   | Load1  | No           |
| 1    | MULTD       | F4 | F0 | F2         | 2    | 14     | 15   | Load2  | No           |
| 1    | SD          | F4 | 0  | R1         | 3    | 18     | 19   | Load3  | Yes 64       |
| 2    | LD          | F0 | 0  | R1         | 6    | 10     | 11   | Store1 | No           |
| 2    | MULTD       | F4 | F0 | F2         | 7    | 15     | 16   | Store2 | No           |
| 2    | SD          | F4 | 0  | R1         | 8    | 19     | 20   | Store3 | Yes 64 Mult1 |

*Reservation Stations:*

| Time | Name  | Busy | RS    |    |       | Code:          |               |
|------|-------|------|-------|----|-------|----------------|---------------|
|      |       |      | Op    | Vj | Vk    |                |               |
|      | Add1  | No   |       |    |       | LD F0 0 R1     |               |
|      | Add2  | No   |       |    |       | MULTD F4 F0 F2 |               |
|      | Add3  | No   |       |    |       | SD F4 0 R1     |               |
|      | Mult1 | Yes  | Multd |    | R(F2) | Load3          | SUBI R1 R1 #8 |
|      | Mult2 | No   |       |    |       |                | BNEZ R1 Loop  |

*Register result status*

| Clock | R1 | F0 | F2    | F4 | F6    | F8 | F10 | F12 | ... | F30 |
|-------|----|----|-------|----|-------|----|-----|-----|-----|-----|
| 20    | 64 | Fu | Load3 |    | Mult1 |    |     |     |     |     |

# Why can Tomasulo overlap iterations of loops?

---

- **Register renaming**
  - Multiple iterations use different physical destinations for registers (dynamic loop unrolling).
- **Reservation stations**
  - Permit instruction issue to advance past integer control flow operations
- **Other idea: Tomasulo building dynamic “DataFlow” graph from instructions**
  - Fits in with readings for Wednesday

# Summary

---

- **Scoreboard: Track dependencies through reservations**
  - Simple scheme for out-of-order execution
  - WAW and WAR hazards force stalls – cannot handle multiple instructions with same destination register
- **Reservations stations: *renaming* to larger set of registers + buffering source operands**
  - Prevents registers as bottleneck
  - Avoids WAR, WAW hazards of Scoreboard
  - Allows loop unrolling in HW
- **Dynamic hardware schemes can unroll loops dynamically in hardware**
  - Form of limited dataflow
  - Register renaming is essential
- **Lasting Contributions of Tomasulo Algorithm**
  - Dynamic scheduling
  - Register renaming
  - Load/store disambiguation
- **360/91 descendants are Pentium II; PowerPC 604; MIPS R10000; HP-PA 8000; Alpha 21264**

---

# SCOREBOARD

# Scoreboard: a bookkeeping technique

---

- **Out-of-order execution divides ID stage:**
  1. **Issue**—decode instructions, check for structural hazards
  2. **Read operands**—wait until no data hazards, then read operands
- **Scoreboards date to CDC6600 in 1963**
  - Readings for Monday include one on CDC6600
- **Instructions execute whenever not dependent on previous instructions and no hazards.**
- **CDC 6600: In order issue, out-of-order execution, out-of-order commit (or completion)**
  - **No forwarding!**
  - **Imprecise interrupt/exception model for now**

# Scoreboard Architecture (CDC 6600)



# Scoreboard Implications

---

- **Out-of-order completion => WAR, WAW hazards?**
- **Solutions for WAR:**
  - Stall writeback until registers have been read
  - Read registers only during Read Operands stage
- **Solution for WAW:**
  - Detect hazard and stall issue of new instruction until other instruction completes
- **No register renaming**
- **Need to have multiple instructions in execution phase => multiple execution units or pipelined execution units**
- **Scoreboard keeps track of dependencies between instructions that have already issued**
- **Scoreboard replaces ID, EX, WB with 4 stages**

# Four Stages of Scoreboard Control

---

- **Issue**—decode instructions & check for structural hazards (ID1)
  - Instructions issued in program order (for hazard checking)
  - Don't issue if structural hazard
  - Don't issue if instruction is output dependent on any previously issued but uncompleted instruction (no WAW hazards)
- **Read operands**—wait until no data hazards, then read operands (ID2)
  - All real dependencies (RAW hazards) resolved in this stage, since we wait for instructions to write back data.
  - No forwarding of data in this model!

# Four Stages of Scoreboard Control

---

- **Execution**—operate on operands (EX)
  - The functional unit begins execution upon receiving operands. When the result is ready, it notifies the scoreboard that it has completed execution.
- **Write result**—finish execution (WB)
  - Stall until no WAR hazards with previous instructions:

Example:

|      |             |
|------|-------------|
| DIVD | F0, F2, F4  |
| ADDD | F10, F0, F8 |
| SUBD | F8, F8, F14 |

CDC 6600 scoreboard would stall SUBD until ADDD reads operands

# Three Parts of the Scoreboard

---

- **Instruction status:**  
Which of 4 steps the instruction is in
- **Functional unit status:**—Indicates the state of the functional unit (FU). 9 fields for each functional unit
  - Busy:** Indicates whether the unit is busy or not
  - Op:** Operation to perform in the unit (e.g., + or -)
  - Fi:** Destination register
  - Fj,Fk:** Source-register numbers
  - Qj,Qk:** Functional units producing source registers Fj, Fk
  - Rj,Rk:** Flags indicating when Fj, Fk are ready
- **Register result status**—Indicates which functional unit will write each register, if one exists. Blank when no pending instructions will write that register

# Scoreboard Example

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              |      |      |        |
| LD          | F2  | 45+ | R3              |      |      |        |
| MULTD       | F0  | F2  | F4              |      |      |        |
| SUBD        | F8  | F6  | F2              |      |      |        |
| DIVD        | F10 | F0  | F6              |      |      |        |
| ADDD        | F6  | F8  | F2              |      |      |        |

*Functional unit status:*

| Time | Name    | dest |    | S1 | S2 | FU | FU | Fj? | Fk? |    |
|------|---------|------|----|----|----|----|----|-----|-----|----|
|      |         | Busy | Op | Fi | Fj | Fk | Qj | Qk  | Rj  | Rk |
|      | Integer |      |    | No |    |    |    |     |     |    |
|      | Mult1   |      |    | No |    |    |    |     |     |    |
|      | Mult2   |      |    | No |    |    |    |     |     |    |
|      | Add     |      |    | No |    |    |    |     |     |    |
|      | Divide  |      |    | No |    |    |    |     |     |    |

*Register result status:*

|       |    |    |    |    |    |     |     |     |     |
|-------|----|----|----|----|----|-----|-----|-----|-----|
| Clock | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
| FU    |    |    |    |    |    |     |     |     |     |

# Detailed Scoreboard Pipeline Control

| Instruction status | Wait until                                                                                                                                                                             | Bookkeeping                                                                                                                                                                                                                                                                                                                                                                                       |
|--------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Issue              | Not busy (FU) and not result(D)                                                                                                                                                        | $\text{Busy(FU)} \leftarrow \text{yes}; \text{Op(FU)} \leftarrow \text{op};$<br>$\text{Fi(FU)} \leftarrow 'D'; \text{Fj(FU)} \leftarrow 'S1';$<br>$\text{Fk(FU)} \leftarrow 'S2'; \text{Qj} \leftarrow \text{Result('S1')};$<br>$\text{Qk} \leftarrow \text{Result('S2')}; \text{Rj} \leftarrow \text{not Qj};$<br>$\text{Rk} \leftarrow \text{not Qk}; \text{Result('D')} \leftarrow \text{FU};$ |
| Read operands      | Rj and Rk                                                                                                                                                                              | $\text{Rj} \leftarrow \text{No}; \text{Rk} \leftarrow \text{No}$                                                                                                                                                                                                                                                                                                                                  |
| Execution complete | Functional unit done                                                                                                                                                                   |                                                                                                                                                                                                                                                                                                                                                                                                   |
| Write result       | $\forall f ((\text{Fj}(f) \neq \text{Fi(FU)})$<br>$\text{or } \text{Rj}(f) = \text{No}) \text{ & }$<br>$(\text{Fk}(f) \neq \text{Fi(FU)}) \text{ or }$<br>$\text{Rk}(f) = \text{No}))$ | $\forall f (\text{if } \text{Qj}(f) = \text{FU} \text{ then } \text{Rj}(f) \leftarrow \text{Yes});$<br>$\forall f (\text{if } \text{Qk}(f) = \text{FU} \text{ then } \text{Rj}(f) \leftarrow \text{Yes});$<br>$\text{Result}(\text{Fi(FU)}) \leftarrow 0; \text{Busy(FU)} \leftarrow \text{No}$                                                                                                   |

# Scoreboard Example: Cycle 1

*Instruction status:*

| Instruction | j   | k   | Issue | Read | Exec | Write  |
|-------------|-----|-----|-------|------|------|--------|
|             |     |     |       | Op   | Comp | Result |
| LD          | F6  | 34+ | R2    | 1    |      |        |
| LD          | F2  | 45+ | R3    |      |      |        |
| MULTD       | F0  | F2  | F4    |      |      |        |
| SUBD        | F8  | F6  | F2    |      |      |        |
| DIVD        | F10 | F0  | F6    |      |      |        |
| ADDD        | F6  | F8  | F2    |      |      |        |

*Functional unit status:*

| Time | Name    | dest | S1   | S2 | FU | FU | Fj? | Fk? |     |
|------|---------|------|------|----|----|----|-----|-----|-----|
|      |         | Busy | Op   | Fi | Fj | Fk | Qj  | Qk  | Rj  |
|      | Integer | Yes  | Load | F6 |    | R2 |     |     | Yes |
|      | Mult1   | No   |      |    |    |    |     |     |     |
|      | Mult2   | No   |      |    |    |    |     |     |     |
|      | Add     | No   |      |    |    |    |     |     |     |
|      | Divide  | No   |      |    |    |    |     |     |     |

*Register result status:*

| Clock | F0 | F2 | F4 | F6      | F8 | F10 | F12 | ... | F30 |
|-------|----|----|----|---------|----|-----|-----|-----|-----|
| 1     | FU |    |    | FU      |    |     |     |     |     |
|       |    |    |    | Integer |    |     |     |     |     |

# Scoreboard Example: Cycle 2

*Instruction status:*

| Instruction | <i>j</i> | <i>k</i> | <i>Issue</i> | Read<br>Oper | Exec<br>Comp | Write<br>Result |
|-------------|----------|----------|--------------|--------------|--------------|-----------------|
| LD          | F6       | 34+      | R2           | 1            | 2            |                 |
| LD          | F2       | 45+      | R3           |              |              |                 |
| MULTD       | F0       | F2       | F4           |              |              |                 |
| SUBD        | F8       | F6       | F2           |              |              |                 |
| DIVD        | F10      | F0       | F6           |              |              |                 |
| ADDD        | F6       | F8       | F2           |              |              |                 |

*Functional unit status:*

| <i>Time</i> | <i>Name</i> | <i>dest</i> |           | <i>S1</i> | <i>S2</i> | <i>FU</i> | <i>FU</i> | <i>Fj?</i> | <i>Fk?</i> |           |
|-------------|-------------|-------------|-----------|-----------|-----------|-----------|-----------|------------|------------|-----------|
|             |             | <i>Busy</i> | <i>Op</i> | <i>Fi</i> | <i>Fj</i> | <i>Fk</i> | <i>Qj</i> | <i>Qk</i>  | <i>Rj</i>  | <i>Rk</i> |
|             | Integer     | Yes         | Load      | F6        |           | R2        |           |            |            | Yes       |
|             | Mult1       | No          |           |           |           |           |           |            |            |           |
|             | Mult2       | No          |           |           |           |           |           |            |            |           |
|             | Add         | No          |           |           |           |           |           |            |            |           |
|             | Divide      | No          |           |           |           |           |           |            |            |           |

*Register result status:*

| Clock | <i>F0</i> | <i>F2</i> | <i>F4</i> | <i>F6</i> | <i>F8</i> | <i>F10</i> | <i>F12</i> | ... | <i>F30</i> |
|-------|-----------|-----------|-----------|-----------|-----------|------------|------------|-----|------------|
| 2     | <i>FU</i> |           |           |           |           |            |            |     |            |

- **Issue 2nd LD?**

# Scoreboard Example: Cycle 3

*Instruction status:*

| Instruction | j   | k   | Issue | Read | Exec   | Write |  |
|-------------|-----|-----|-------|------|--------|-------|--|
|             |     |     | Open  | Comp | Result |       |  |
| LD          | F6  | 34+ | R2    | 1    | 2      | 3     |  |
| LD          | F2  | 45+ | R3    |      |        |       |  |
| MULTD       | F0  | F2  | F4    |      |        |       |  |
| SUBD        | F8  | F6  | F2    |      |        |       |  |
| DIVD        | F10 | F0  | F6    |      |        |       |  |
| ADDD        | F6  | F8  | F2    |      |        |       |  |

*Functional unit status:*

| Time | Name    | dest |      | S1 | S2 | FU | FU | Fj? | Fk? |
|------|---------|------|------|----|----|----|----|-----|-----|
|      |         | Busy | Op   | Fi | Fj | Fk | Qj | Qk  | Rj  |
|      | Integer | Yes  | Load | F6 |    | R2 |    |     | No  |
|      | Mult1   | No   |      |    |    |    |    |     |     |
|      | Mult2   | No   |      |    |    |    |    |     |     |
|      | Add     | No   |      |    |    |    |    |     |     |
|      | Divide  | No   |      |    |    |    |    |     |     |

*Register result status:*

| Clock | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|----|----|----|----|-----|-----|-----|-----|
| 3     | FU |    |    |    |    |     |     |     |     |

- **Issue MULT?**

# Scoreboard Example: Cycle 4

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3      |
| LD          | F2  | 45+ | R3              |      |      | 4      |
| MULTD       | F0  | F2  | F4              |      |      |        |
| SUBD        | F8  | F6  | F2              |      |      |        |
| DIVD        | F10 | F0  | F6              |      |      |        |
| ADDD        | F6  | F8  | F2              |      |      |        |

*Functional unit status:*

| Time | Name    | dest |    | S1             | S2             | FU             | FU             | Fj?            | Fk?            |                |
|------|---------|------|----|----------------|----------------|----------------|----------------|----------------|----------------|----------------|
|      |         | Busy | Op | F <sub>i</sub> | F <sub>j</sub> | F <sub>k</sub> | Q <sub>j</sub> | Q <sub>k</sub> | R <sub>j</sub> | R <sub>k</sub> |
|      | Integer | No   |    |                |                |                |                |                |                |                |
|      | Mult1   | No   |    |                |                |                |                |                |                |                |
|      | Mult2   | No   |    |                |                |                |                |                |                |                |
|      | Add     | No   |    |                |                |                |                |                |                |                |
|      | Divide  | No   |    |                |                |                |                |                |                |                |

*Register result status:*

| Clock | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|----|----|----|----|-----|-----|-----|-----|
| 4     | FU |    |    |    |    |     |     |     |     |

# Scoreboard Example: Cycle 5

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3      |
| LD          | F2  | 45+ | R3              | 5    |      |        |
| MULTD       | F0  | F2  | F4              |      |      |        |
| SUBD        | F8  | F6  | F2              |      |      |        |
| DIVD        | F10 | F0  | F6              |      |      |        |
| ADDD        | F6  | F8  | F2              |      |      |        |

*Functional unit status:*

| Time | Name    | dest |      |                |                | S1             | S2             | FU             | FU             | Fj?            | Fk? |
|------|---------|------|------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----|
|      |         | Busy | Op   | F <sub>i</sub> | F <sub>j</sub> | F <sub>k</sub> | Q <sub>j</sub> | Q <sub>k</sub> | R <sub>j</sub> | R <sub>k</sub> |     |
|      | Integer | Yes  | Load | F2             |                | R3             |                |                |                | Yes            |     |
|      | Mult1   | No   |      |                |                |                |                |                |                |                |     |
|      | Mult2   | No   |      |                |                |                |                |                |                |                |     |
|      | Add     | No   |      |                |                |                |                |                |                |                |     |
|      | Divide  | No   |      |                |                |                |                |                |                |                |     |

*Register result status:*

| Clock | F0 | F2      | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|---------|----|----|----|-----|-----|-----|-----|
| 5     | FU | Integer |    |    |    |     |     |     |     |

# Scoreboard Example: Cycle 6

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3      |
| LD          | F2  | 45+ | R3              | 5    | 6    |        |
| MULTD       | F0  | F2  | F4              | 6    |      |        |
| SUBD        | F8  | F6  | F2              |      |      |        |
| DIVD        | F10 | F0  | F6              |      |      |        |
| ADDD        | F6  | F8  | F2              |      |      |        |

*Functional unit status:*

| Time | Name    | dest |      | S1 | S2 | FU | FU      | Fj? | Fk? |
|------|---------|------|------|----|----|----|---------|-----|-----|
|      |         | Busy | Op   | Fi | Fj | Fk | Qj      | Qk  | Rj  |
|      | Integer | Yes  | Load | F2 |    | R3 |         |     | Yes |
|      | Mult1   | Yes  | Mult | F0 | F2 | F4 | Integer |     | No  |
|      | Mult2   | No   |      |    |    |    |         |     | Yes |
|      | Add     | No   |      |    |    |    |         |     |     |
|      | Divide  | No   |      |    |    |    |         |     |     |

*Register result status:*

| Clock | F0 | F2    | F4      | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|-------|---------|----|----|-----|-----|-----|-----|
| 6     | FU | Mult1 | Integer |    |    |     |     |     |     |

# Scoreboard Example: Cycle 7

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3      |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7      |
| MULTD       | F0  | F2  | F4              | 6    |      |        |
| SUBD        | F8  | F6  | F2              | 7    |      |        |
| DIVD        | F10 | F0  | F6              |      |      |        |
| ADDD        | F6  | F8  | F2              |      |      |        |

*Functional unit status:*

| Time | Name    | dest |      | S1 | S2 | FU | FU      | Fj? | Fk? |
|------|---------|------|------|----|----|----|---------|-----|-----|
|      |         | Busy | Op   | Fi | Fj | Fk | Qj      | Qk  | Rj  |
|      | Integer | Yes  | Load | F2 |    | R3 |         |     | No  |
|      | Mult1   | Yes  | Mult | F0 | F2 | F4 | Integer |     | No  |
|      | Mult2   | No   |      |    |    |    |         |     | Yes |
|      | Add     | Yes  | Sub  | F8 | F6 | F2 | Integer | Yes | No  |
|      | Divide  | No   |      |    |    |    |         |     |     |

*Register result status:*

| Clock | F0 | F2    | F4      | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|-------|---------|----|----|-----|-----|-----|-----|
| 7     | FU | Mult1 | Integer |    |    | Add |     |     |     |

- **Read multiply operands?**

# Scoreboard Example: Cycle 8a

## (First half of clock cycle)

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |   |
|-------------|-----|-----|-----------------|------|------|--------|---|
|             |     |     | Issue           | Oper | Comp | Result |   |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3      | 4 |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7      |   |
| MULTD       | F0  | F2  | F4              | 6    |      |        |   |
| SUBD        | F8  | F6  | F2              | 7    |      |        |   |
| DIVD        | F10 | F0  | F6              | 8    |      |        |   |
| ADDD        | F6  | F8  | F2              |      |      |        |   |

*Functional unit status:*

| Time | Name    | dest |      | S1  | S2 | FU | FU      | Fj? | Fk? |     |
|------|---------|------|------|-----|----|----|---------|-----|-----|-----|
|      |         | Busy | Op   | Fi  | Fj | Fk | Qj      | Qk  | Rj  | Rk  |
|      | Integer | Yes  | Load | F2  |    | R3 |         |     | No  |     |
|      | Mult1   | Yes  | Mult | F0  | F2 | F4 | Integer |     | No  | Yes |
|      | Mult2   | No   |      |     |    |    |         |     |     |     |
|      | Add     | Yes  | Sub  | F8  | F6 | F2 | Integer | Yes | No  |     |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | Mult1   | No  | Yes |     |

*Register result status:*

| Clock | F0 | F2    | F4      | F6 | F8  | F10    | F12 | ... | F30 |
|-------|----|-------|---------|----|-----|--------|-----|-----|-----|
| 8     | FU | Mult1 | Integer |    | Add | Divide |     |     |     |

# Scoreboard Example: Cycle 8b (Second half of clock cycle)

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |   |
|-------------|-----|-----|-----------------|------|------|--------|---|
|             |     |     | Issue           | Oper | Comp | Result |   |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3      | 4 |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7      | 8 |
| MULTD       | F0  | F2  | F4              | 6    |      |        |   |
| SUBD        | F8  | F6  | F2              | 7    |      |        |   |
| DIVD        | F10 | F0  | F6              | 8    |      |        |   |
| ADDD        | F6  | F8  | F2              |      |      |        |   |

*Functional unit status:*

| Time | Name    | dest |      | S1  | S2 | FU | FU    | Fj? | Fk? |     |
|------|---------|------|------|-----|----|----|-------|-----|-----|-----|
|      |         | Busy | Op   | Fi  | Fj | Fk | Qj    | Qk  | Rj  | Rk  |
|      | Integer | No   |      |     |    |    |       |     |     |     |
|      | Mult1   | Yes  | Mult | F0  | F2 | F4 |       |     | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |       |     |     |     |
|      | Add     | Yes  | Sub  | F8  | F6 | F2 |       |     | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | Mult1 |     | No  | Yes |

*Register result status:*

| Clock | F0 | F2    | F4 | F6 | F8  | F10    | F12 | ... | F30 |
|-------|----|-------|----|----|-----|--------|-----|-----|-----|
| 8     | FU | Mult1 |    |    | Add | Divide |     |     |     |

# Scoreboard Example: Cycle 9

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    |        |
| SUBD        | F8  | F6  | F2              | 7    | 9    |        |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              |      |      |        |

*Functional unit status:*

| Time | Name    | dest |      | S1  | S2 | FU | FU    | Fj? | Fk? |
|------|---------|------|------|-----|----|----|-------|-----|-----|
|      |         | Busy | Op   | Fi  | Fj | Fk | Qj    | Qk  | Rj  |
|      | Integer | No   |      |     |    |    |       |     |     |
| 10   | Mult1   | Yes  | Mult | F0  | F2 | F4 |       | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |       |     |     |
| 2    | Add     | Yes  | Sub  | F8  | F6 | F2 |       | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | Mult1 | No  | Yes |

*Register result status:*

| Clock | F0 | F2    | F4 | F6 | F8  | F10    | F12 | ... | F30 |
|-------|----|-------|----|----|-----|--------|-----|-----|-----|
| 9     | FU | Mult1 |    |    | Add | Divide |     |     |     |

- **Read operands for MULT & SUB? Issue ADDD?**

# Scoreboard Example: Cycle 10

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    |        |
| SUBD        | F8  | F6  | F2              | 7    | 9    |        |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              |      |      |        |

*Functional unit status:*

| Time | Name    | Busy | Op   | dest | S1 | S2 | FU    | FU | Fj? | Fk? |
|------|---------|------|------|------|----|----|-------|----|-----|-----|
|      |         |      |      | Fi   | Fj | Fk | Qj    | Qk | Rj  | Rk  |
|      | Integer | No   |      |      |    |    |       |    |     |     |
| 9    | Mult1   | Yes  | Mult | F0   | F2 | F4 |       |    | No  | No  |
|      | Mult2   | No   |      |      |    |    |       |    |     |     |
| 1    | Add     | Yes  | Sub  | F8   | F6 | F2 |       |    | No  | No  |
|      | Divide  | Yes  | Div  | F10  | F0 | F6 | Mult1 |    | No  | Yes |

*Register result status:*

| Clock | F0 | F2    | F4 | F6 | F8  | F10    | F12 | ... | F30 |
|-------|----|-------|----|----|-----|--------|-----|-----|-----|
| 10    | FU | Mult1 |    |    | Add | Divide |     |     |     |

# Scoreboard Example: Cycle 11

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    |        |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11     |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              |      |      |        |

*Functional unit status:*

| Time | Name    | Busy | Op   | dest | S1 | S2 | FU    | FU | Fj? | Fk? |
|------|---------|------|------|------|----|----|-------|----|-----|-----|
|      |         |      |      | Fi   | Fj | Fk | Qj    | Qk | Rj  | Rk  |
|      | Integer | No   |      |      |    |    |       |    |     |     |
| 8    | Mult1   | Yes  | Mult | F0   | F2 | F4 |       |    | No  | No  |
|      | Mult2   | No   |      |      |    |    |       |    |     |     |
| 0    | Add     | Yes  | Sub  | F8   | F6 | F2 |       |    | No  | No  |
|      | Divide  | Yes  | Div  | F10  | F0 | F6 | Mult1 |    | No  | Yes |

*Register result status:*

| Clock | F0 | F2    | F4 | F6 | F8  | F10    | F12 | ... | F30 |
|-------|----|-------|----|----|-----|--------|-----|-----|-----|
| 11    | FU | Mult1 |    |    | Add | Divide |     |     |     |

# Scoreboard Example: Cycle 12

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    |        |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11 12  |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              |      |      |        |

*Functional unit status:*

| Time | Name    | dest |      | S1  | S2 | FU | FU    | Fj? | Fk? |     |
|------|---------|------|------|-----|----|----|-------|-----|-----|-----|
|      |         | Busy | Op   | Fi  | Fj | Fk | Qj    | Qk  | Rj  | Rk  |
|      | Integer | No   |      |     |    |    |       |     |     |     |
| 7    | Mult1   | Yes  | Mult | F0  | F2 | F4 |       |     | No  | No  |
|      | Mult2   | No   |      |     |    |    |       |     |     |     |
|      | Add     | No   |      |     |    |    |       |     |     |     |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | Mult1 |     | No  | Yes |

*Register result status:*

|       |    |       |    |    |    |     |     |        |     |
|-------|----|-------|----|----|----|-----|-----|--------|-----|
| Clock | F0 | F2    | F4 | F6 | F8 | F10 | F12 | ...    | F30 |
| 12    | FU | Mult1 |    |    |    |     |     | Divide |     |

- **Read operands for DIVD?**

# Scoreboard Example: Cycle 13

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    |        |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11 12  |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              | 13   |      |        |

*Functional unit status:*

| Time | Name    | Busy | Op   | dest | S1 | S2 | FU    | FU | Fj? | Fk? |
|------|---------|------|------|------|----|----|-------|----|-----|-----|
|      |         |      |      | Fi   | Fj | Fk | Qj    | Qk | Rj  | Rk  |
|      | Integer | No   |      |      |    |    |       |    |     |     |
| 6    | Mult1   | Yes  | Mult | F0   | F2 | F4 |       |    | No  | No  |
|      | Mult2   | No   |      |      |    |    |       |    |     |     |
|      | Add     | Yes  | Add  | F6   | F8 | F2 |       |    | Yes | Yes |
|      | Divide  | Yes  | Div  | F10  | F0 | F6 | Mult1 |    | No  | Yes |

*Register result status:*

| Clock | F0 | F2    | F4 | F6 | F8  | F10 | F12    | ... | F30 |
|-------|----|-------|----|----|-----|-----|--------|-----|-----|
| 13    | FU | Mult1 |    |    | Add |     | Divide |     |     |

# Scoreboard Example: Cycle 14

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    |        |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11 12  |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              | 13   | 14   |        |

*Functional unit status:*

| Time | Name    | dest |      | S1  | S2 | FU | FU    | Fj? | Fk? |    |
|------|---------|------|------|-----|----|----|-------|-----|-----|----|
|      |         | Busy | Op   | Fi  | Fj | Fk | Qj    | Qk  | Rj  | Rk |
|      | Integer | No   |      |     |    |    |       |     |     |    |
| 5    | Mult1   | Yes  | Mult | F0  | F2 | F4 |       | No  | No  |    |
|      | Mult2   | No   |      |     |    |    |       |     |     |    |
| 2    | Add     | Yes  | Add  | F6  | F8 | F2 |       | Yes | Yes |    |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | Mult1 | No  | Yes |    |

*Register result status:*

| Clock | F0 | F2    | F4 | F6 | F8  | F10 | F12    | ... | F30 |
|-------|----|-------|----|----|-----|-----|--------|-----|-----|
| 14    | FU | Mult1 |    |    | Add |     | Divide |     |     |

# Scoreboard Example: Cycle 15

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    |        |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11 12  |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              | 13   | 14   |        |

*Functional unit status:*

| Time | Name    | dest |      | S1  | S2 | FU | FU    | Fj? | Fk? |    |
|------|---------|------|------|-----|----|----|-------|-----|-----|----|
|      |         | Busy | Op   | Fi  | Fj | Fk | Qj    | Qk  | Rj  | Rk |
|      | Integer | No   |      |     |    |    |       |     |     |    |
| 4    | Mult1   | Yes  | Mult | F0  | F2 | F4 |       | No  | No  |    |
|      | Mult2   | No   |      |     |    |    |       |     |     |    |
| 1    | Add     | Yes  | Add  | F6  | F8 | F2 |       | No  | No  |    |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | Mult1 | No  | Yes |    |

*Register result status:*

| Clock | F0 | F2    | F4 | F6 | F8  | F10 | F12    | ... | F30 |
|-------|----|-------|----|----|-----|-----|--------|-----|-----|
| 15    | FU | Mult1 |    |    | Add |     | Divide |     |     |

# Scoreboard Example: Cycle 16

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    |        |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11 12  |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              | 13   | 14   | 16     |

*Functional unit status:*

| Time | Name    | dest |      | S1  | S2 | FU | FU    | Fj? | Fk? |    |
|------|---------|------|------|-----|----|----|-------|-----|-----|----|
|      |         | Busy | Op   | Fi  | Fj | Fk | Qj    | Qk  | Rj  | Rk |
|      | Integer | No   |      |     |    |    |       |     |     |    |
| 3    | Mult1   | Yes  | Mult | F0  | F2 | F4 |       | No  | No  |    |
|      | Mult2   | No   |      |     |    |    |       |     |     |    |
| 0    | Add     | Yes  | Add  | F6  | F8 | F2 |       | No  | No  |    |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | Mult1 | No  | Yes |    |

*Register result status:*

| Clock | F0 | F2    | F4 | F6 | F8  | F10 | F12    | ... | F30 |
|-------|----|-------|----|----|-----|-----|--------|-----|-----|
| 16    | FU | Mult1 |    |    | Add |     | Divide |     |     |

# Scoreboard Example: Cycle 17

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    |        |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11 12  |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              | 13   | 14   | 16     |

**WAR Hazard!**

*Functional unit status:*

| Time | Name    | dest |      | S1  | S2 | FU | FU    | Fj? | Fk? |    |
|------|---------|------|------|-----|----|----|-------|-----|-----|----|
|      |         | Busy | Op   | Fi  | Fj | Fk | Qj    | Qk  | Rj  | Rk |
|      | Integer | No   |      |     |    |    |       |     |     |    |
| 2    | Mult1   | Yes  | Mult | F0  | F2 | F4 |       | No  | No  |    |
|      | Mult2   | No   |      |     |    |    |       |     |     |    |
|      | Add     | Yes  | Add  | F6  | F8 | F2 |       | No  | No  |    |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | mult1 | No  | Yes |    |

*Register result status:*

| Clock | F0 | F2    | F4 | F6  | F8 | F10    | F12 | ... | F30 |
|-------|----|-------|----|-----|----|--------|-----|-----|-----|
| 17    | FU | Mult1 |    | Add |    | Divide |     |     |     |

- Why not write result of ADD???

# Scoreboard Example: Cycle 18

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    |        |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11 12  |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              | 13   | 14   | 16     |

*Functional unit status:*

| Time | Name    | dest |      | S1  | S2 | FU | FU    | Fj? | Fk? |     |
|------|---------|------|------|-----|----|----|-------|-----|-----|-----|
|      |         | Busy | Op   | Fi  | Fj | Fk | Qj    | Qk  | Rj  | Rk  |
|      | Integer | No   |      |     |    |    |       |     |     |     |
| 1    | Mult1   | Yes  | Mult | F0  | F2 | F4 |       |     | No  | No  |
|      | Mult2   | No   |      |     |    |    |       |     |     |     |
|      | Add     | Yes  | Add  | F6  | F8 | F2 |       |     | No  | No  |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | Mult1 |     | No  | Yes |

*Register result status:*

| Clock | F0 | F2    | F4 | F6  | F8 | F10    | F12 | ... | F30 |
|-------|----|-------|----|-----|----|--------|-----|-----|-----|
| 18    | FU | Mult1 |    | Add |    | Divide |     |     |     |

# Scoreboard Example: Cycle 19

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    | 19     |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11 12  |
| DIVD        | F10 | F0  | F6              | 8    |      |        |
| ADDD        | F6  | F8  | F2              | 13   | 14   | 16     |

*Functional unit status:*

| Time | Name    | dest |      | S1  | S2 | FU | FU    | Fj? | Fk? |    |
|------|---------|------|------|-----|----|----|-------|-----|-----|----|
|      |         | Busy | Op   | Fi  | Fj | Fk | Qj    | Qk  | Rj  | Rk |
|      | Integer | No   |      |     |    |    |       |     |     |    |
| 0    | Mult1   | Yes  | Mult | F0  | F2 | F4 |       | No  | No  |    |
|      | Mult2   | No   |      |     |    |    |       |     |     |    |
|      | Add     | Yes  | Add  | F6  | F8 | F2 |       | No  | No  |    |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | Mult1 | No  | Yes |    |

*Register result status:*

| Clock | F0 | F2    | F4 | F6  | F8 | F10    | F12 | ... | F30 |
|-------|----|-------|----|-----|----|--------|-----|-----|-----|
| 19    | FU | Mult1 |    | Add |    | Divide |     |     |     |

# Scoreboard Example: Cycle 20

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |    |
|-------------|-----|-----|-----------------|------|------|--------|----|
|             |     |     | Issue           | Oper | Comp | Result |    |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3      | 4  |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7      | 8  |
| MULTD       | F0  | F2  | F4              | 6    | 9    | 19     | 20 |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11     | 12 |
| DIVD        | F10 | F0  | F6              | 8    |      |        |    |
| ADDD        | F6  | F8  | F2              | 13   | 14   | 16     |    |

*Functional unit status:*

| Time | Name    | dest |     | S1  | S2 | FU | FU | Fj? | Fk? |
|------|---------|------|-----|-----|----|----|----|-----|-----|
|      |         | Busy | Op  | Fi  | Fj | Fk | Qj | Qk  | Rj  |
|      | Integer | No   |     |     |    |    |    |     |     |
|      | Mult1   | No   |     |     |    |    |    |     |     |
|      | Mult2   | No   |     |     |    |    |    |     |     |
|      | Add     | Yes  | Add | F6  | F8 | F2 |    | No  | No  |
|      | Divide  | Yes  | Div | F10 | F0 | F6 |    | Yes | Yes |

*Register result status:*

| Clock | F0 | F2 | F4 | F6 | F8  | F10 | F12    | ... | F30 |
|-------|----|----|----|----|-----|-----|--------|-----|-----|
| 20    | FU |    |    |    | Add |     | Divide |     |     |

# Scoreboard Example: Cycle 21

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |
|-------------|-----|-----|-----------------|------|------|--------|
|             |     |     | Issue           | Oper | Comp | Result |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4              | 6    | 9    | 19 20  |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11 12  |
| DIVD        | F10 | F0  | F6              | 8    | 21   |        |
| ADDD        | F6  | F8  | F2              | 13   | 14   | 16     |

*Functional unit status:*

| Time | Name    | dest |     | S1  | S2 | FU | FU | Fj? | Fk? |
|------|---------|------|-----|-----|----|----|----|-----|-----|
|      |         | Busy | Op  | Fi  | Fj | Fk | Qj | Qk  | Rj  |
|      | Integer | No   |     |     |    |    |    |     |     |
|      | Mult1   | No   |     |     |    |    |    |     |     |
|      | Mult2   | No   |     |     |    |    |    |     |     |
|      | Add     | Yes  | Add | F6  | F8 | F2 |    | No  | No  |
|      | Divide  | Yes  | Div | F10 | F0 | F6 |    | Yes | Yes |

*Register result status:*

| Clock | F0 | F2 | F4 | F6 | F8  | F10 | F12    | ... | F30 |
|-------|----|----|----|----|-----|-----|--------|-----|-----|
| 21    | FU |    |    |    | Add |     | Divide |     |     |

- WAR Hazard is now gone...

# Scoreboard Example: Cycle 22

*Instruction status:*

| Instruction | j   | k   | Issue | Read | Exec | Write  |
|-------------|-----|-----|-------|------|------|--------|
|             |     |     |       | Oper | Comp | Result |
| LD          | F6  | 34+ | R2    | 1    | 2    | 3 4    |
| LD          | F2  | 45+ | R3    | 5    | 6    | 7 8    |
| MULTD       | F0  | F2  | F4    | 6    | 9    | 19 20  |
| SUBD        | F8  | F6  | F2    | 7    | 9    | 11 12  |
| DIVD        | F10 | F0  | F6    | 8    | 21   |        |
| ADDD        | F6  | F8  | F2    | 13   | 14   | 16 22  |

*Functional unit status:*

| Time | Name    | Busy | Op  | dest | S1 | S2 | FU | FU | Fj? | Fk? |
|------|---------|------|-----|------|----|----|----|----|-----|-----|
|      |         |      |     | Fi   | Fj | Fk | Qj | Qk | Rj  | Rk  |
|      | Integer | No   |     |      |    |    |    |    |     |     |
|      | Mult1   | No   |     |      |    |    |    |    |     |     |
|      | Mult2   | No   |     |      |    |    |    |    |     |     |
|      | Add     | No   |     |      |    |    |    |    |     |     |
| 39   | Divide  | Yes  | Div | F10  | F0 | F6 |    |    | No  | No  |

*Register result status:*

| Clock | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30    |
|-------|----|----|----|----|----|-----|-----|-----|--------|
| 22    | FU |    |    |    |    |     |     |     | Divide |

**Faster than light computation  
(skip a couple of cycles)**

---

# Scoreboard Example: Cycle 61

*Instruction status:*

| Instruction | j   | k   | Read Exec Write |      |      |        |    |
|-------------|-----|-----|-----------------|------|------|--------|----|
|             |     |     | Issue           | Oper | Comp | Result |    |
| LD          | F6  | 34+ | R2              | 1    | 2    | 3      | 4  |
| LD          | F2  | 45+ | R3              | 5    | 6    | 7      | 8  |
| MULTD       | F0  | F2  | F4              | 6    | 9    | 19     | 20 |
| SUBD        | F8  | F6  | F2              | 7    | 9    | 11     | 12 |
| DIVD        | F10 | F0  | F6              | 8    | 21   | 61     |    |
| ADDD        | F6  | F8  | F2              | 13   | 14   | 16     | 22 |

*Functional unit status:*

| Time | Name    | dest |     | S1  | S2 | FU | FU | Fj? | Fk? |
|------|---------|------|-----|-----|----|----|----|-----|-----|
|      |         | Busy | Op  | Fi  | Fj | Fk | Qj | Qk  | Rj  |
|      | Integer | No   |     |     |    |    |    |     |     |
|      | Mult1   | No   |     |     |    |    |    |     |     |
|      | Mult2   | No   |     |     |    |    |    |     |     |
|      | Add     | No   |     |     |    |    |    |     |     |
| 0    | Divide  | Yes  | Div | F10 | F0 | F6 |    | No  | No  |

*Register result status:*

| Clock | F0 | F2 | F4 | F6 | F8 | F10 | F12    | ... | F30 |
|-------|----|----|----|----|----|-----|--------|-----|-----|
| 61    | FU |    |    |    |    |     | Divide |     |     |

# Scoreboard Example: Cycle 62

*Instruction status:*

| Instruction | j   | k   | Issue | Read | Exec | Write  |
|-------------|-----|-----|-------|------|------|--------|
|             |     |     |       | Oper | Comp | Result |
| LD          | F6  | 34+ | R2    | 1    | 2    | 3      |
| LD          | F2  | 45+ | R3    | 5    | 6    | 7      |
| MULTD       | F0  | F2  | F4    | 6    | 9    | 19     |
| SUBD        | F8  | F6  | F2    | 7    | 9    | 11     |
| DIVD        | F10 | F0  | F6    | 8    | 21   | 61     |
| ADDD        | F6  | F8  | F2    | 13   | 14   | 16     |

*Functional unit status:*

| Time | Name    | Busy | dest | S1 | S2 | FU | FU | Fj? | Fk? |
|------|---------|------|------|----|----|----|----|-----|-----|
|      |         |      | Op   | Fi | Fj | Fk | Qj | Qk  | Rj  |
|      | Integer | No   |      |    |    |    |    |     |     |
|      | Mult1   | No   |      |    |    |    |    |     |     |
|      | Mult2   | No   |      |    |    |    |    |     |     |
|      | Add     | No   |      |    |    |    |    |     |     |
|      | Divide  | No   |      |    |    |    |    |     |     |

*Register result status:*

| Clock | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|----|----|----|----|-----|-----|-----|-----|
| 62    | FU |    |    |    |    |     |     |     |     |

# Review: Scoreboard Example: Cycle 62

*Instruction status:*

| Instruction | j   | k   | Issue | Read | Exec | Write  |
|-------------|-----|-----|-------|------|------|--------|
|             |     |     |       | Oper | Comp | Result |
| LD          | F6  | 34+ | R2    | 1    | 2    | 3      |
| LD          | F2  | 45+ | R3    | 5    | 6    | 7      |
| MULTD       | F0  | F2  | F4    | 6    | 9    | 19     |
| SUBD        | F8  | F6  | F2    | 7    | 9    | 11     |
| DIVD        | F10 | F0  | F6    | 8    | 21   | 61     |
| ADDD        | F6  | F8  | F2    | 13   | 14   | 16     |

*Functional unit status:*

| Time | Name    | dest | S1 | S2 | FU | FU | Fj? | Fk? |    |    |
|------|---------|------|----|----|----|----|-----|-----|----|----|
|      |         | Busy | Op | Fi | Fj | Fk | Qj  | Qk  | Rj | Rk |
|      | Integer | No   |    |    |    |    |     |     |    |    |
|      | Mult1   | No   |    |    |    |    |     |     |    |    |
|      | Mult2   | No   |    |    |    |    |     |     |    |    |
|      | Add     | No   |    |    |    |    |     |     |    |    |
|      | Divide  | No   |    |    |    |    |     |     |    |    |

*Register result status:*

| Clock | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|-------|----|----|----|----|----|-----|-----|-----|-----|
| 62    | FU |    |    |    |    |     |     |     |     |

- In-order issue; out-of-order execute & commit

# CDC 6600 Scoreboard

---

- Speedup 1.7 from compiler; 2.5 by hand  
BUT slow memory (no cache) limits benefit
- Limitations of 6600 scoreboard:
  - No forwarding hardware
  - Limited to instructions in basic block (small *window*)
  - Small number of functional units (structural hazards), especially integer/load store units
  - Do not issue on structural hazards
  - Wait for WAR hazards
  - Prevent WAW hazards

# A Useful Slide

---

- **Interesting Resource: <http://bitsavers.org>**
  - Has digital versions of users manuals for old machines
  - Quite interesting!
  - I'll link in some of them to your reading pages when it is appropriate
  - Very limited bandwidth: use mirrors such as:  
<http://bitsavers.vt100.net>