

**Lecture 6:**

# **Arithmetic 2/3**

Assignment Project Exam Help

<https://tutorcs.com>

WeChat: cstutorcs

---

**Introduction to Computer Architecture**  
**UC Davis EEC 170, Fall 2019**

# Multiply (unsigned)

## ■ Paper and pencil example (unsigned):

Multiplicand    1000

Multiplier     $\times \underline{1001}$   
                    1000

0000

0000 Assignment Project Exam Help

1000

Product     $\underline{01001000}$  https://tutorcs.com

## ■ $m$ bits $\times$ $n$ bits = $m+n$ bit product WeChat: cstutorcs

## ■ Binary makes it easy:

- 0 => place 0 ( 0 x multiplicand )
- 1 => place a copy ( 1 x multiplicand )

## ■ 4 versions of multiply hardware & algorithm:

- successive refinement

# Unsigned Combinational Multiplier



# How does it work?



- At each stage shift A left (multiply it by 2)
- Use next bit of B to determine whether to add in shifted multiplicand
- Accumulate 2n bit partial product at each stage

# Unsigned Combinational Multiplier



- Stage  $i$  accumulates  $A * 2^i$  if  $B_i == 1$
- Q: How much hardware for 32 bit multiplier? Critical path?

# Unsigned shift-add multiplier (version 1)

- 64-bit Multiplicand reg, 64-bit ALU, 64-bit Product reg, 32-bit multiplier reg



**Multiplier = datapath + control**

# Multiply Algorithm V1

Product    Multiplier    Multiplicand

0000 0000    0011    0000 0010

1:0000 0010    0011    0000 0010

2:0000 0010    0011    0000 0100

3:0000 0010    0001    0000 0100

1:0000 0110    0001    0000 0100

2:0000 0110    0001    0000 1000

3:0000 0110    0000    0000 1000

0000 0110    0000    0000 1000



# Observations on Multiply Version 1

- 1 clock per cycle =>  $\approx 100$  clocks per multiply
  - Ratio of multiply to add 5:1 to 100:1
- 1/2 bits in multiplicand always 0  
=> 64-bit adder is wasted
  - Assignment Project Exam Help  
<https://tutorcs.com>
- 0's inserted in right of multiplicand as shifted  
=> least significant bits of product never changed once formed
  - WeChat: cstutorcs
- *Instead of shifting multiplicand to left, shift product to right?*

# Multiply Hardware Version 2

- 32-bit Multiplicand reg, 32-bit ALU, 64-bit Product reg, 32-bit Multiplier reg

Assignment Project Exam Help



# How to think of this?

- Remember original combinational multiplier:



# Simply warp to let product move right...



- Multiplicand stays still and product moves right

# Multiply Algorithm V2

|    | <b>Product</b> | <b>Multiplier</b> | <b>Multiplicand</b> |
|----|----------------|-------------------|---------------------|
|    | 0000 0000      | 0011              | 0010                |
| 1: | 0010 0000      | 0011              | 0010                |
| 2: | 0001 0000      | 0011              | 0010                |
| 3: | 0001 0000      | 0001              | 0010                |
| 1: | 0011 0000      | 0001              | 0010                |
| 2: | 0001 1000      | 0001              | 0010                |
| 3: | 0001 1000      | 0000              | 0010                |
| 1: | 0001 1000      | 0000              | 0010                |
| 2: | 0000 1100      | 0000              | 0010                |
| 3: | 0000 1100      | 0000              | 0010                |
| 1: | 0000 1100      | 0000              | 0010                |
| 2: | 0000 0110      | 0000              | 0010                |
| 3: | 0000 0110      | 0000              | 0010                |
|    | 0000 0110      | 0000              | 0010                |

1a. Add multiplicand to the left half of product & place the result in the left half of Product register

<https://tutorcs.com>

WeChat: cstusofts

2. Shift the Product register right 1 bit.

3. Shift the Multiplier register right 1 bit.



No: < 32 repetitions

Yes: 32 repetitions

Done

# Still more wasted space!



# Observations on Multiply Version 2

- Product register wastes space that exactly matches size of multiplier  
=> combine Multiplier register and Product register

Assignment Project Exam Help

<https://tutorcs.com>

WeChat: cstutorcs

# Multiply Hardware Version 3

- 32-bit Multiplicand reg, 32-bit ALU, 64-bit Product reg, (0-bit Multiplier reg)



# Multiply Algorithm V3



# Observations on Multiply Version 3

- 2 steps per bit because Multiplier & Product combined
- What about signed multiplication?
  - easiest solution is to make both positive & remember whether to complement product when done (leave out the sign bit, run for 31 steps)  
Assignment Project Exam Help  
https://tutorcs.com  
WeChat: cstutorcs
  - apply definition of 2's complement
    - need to sign-extend partial products and subtract at the end
  - Booth's Algorithm is elegant way to multiply signed numbers using same hardware as before and save cycles
    - can handle multiple bits at a time

# Faster Multiplier

- Uses multiple adders
  - Cost/performance tradeoff



- Can be pipelined
  - Several multiplications performed in parallel

# Motivation for Booth's Algorithm

- Example  $2 \times 6 = 0010 \times 0110$ :

|       |                |                         |
|-------|----------------|-------------------------|
| ■     | $0010$         |                         |
|       | $x \quad 0110$ |                         |
| <hr/> |                |                         |
| +     | $0000$         | shift (0 in multiplier) |
| +     | $0010$         | add (1 in multiplier)   |
| +     | $0010$         | add (1 in multiplier)   |
| +     | $0000$         | shift (0 in multiplier) |
| <hr/> |                |                         |
|       | $0001100$      |                         |

# Motivation for Booth's Algorithm

- ALU with add or subtract can get same result in more than one way:

- $6 = 4 + 2 = -2 + 8$

$$0110 = -00010 + 01000 = 11110 + 01000$$

- For example: Assignment Project Exam Help

|                                                                                                                         |                                                                                                                                                                      |
|-------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| $\begin{array}{r} 0010 \\ \times 0110 \\ \hline 0000 \\ - 0010 \\ \hline 0000 \\ + 0010 \\ \hline 00001100 \end{array}$ | <p>(2 [multiplier])<br/>(6 [multiplicand])<br/>shift (0 in multiplier)<br/>sub (first 1 in multpl.)<br/>shift (mid string of 1s)<br/>add (prior step had last 1)</p> |
|-------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|

# Booth's Algorithm



- | ■ Current Bit | Bit to the Right | Explanation         | Example    | Op   |
|---------------|------------------|---------------------|------------|------|
| 1             | 0                | Begins run of 1s    | 0001111000 | sub  |
| 1             | 1                | Middle of run of 1s | 0001111000 | none |
| 0             | 1                | End of run of 1s    | 0001111000 | add  |
| 0             | 0                | Middle of run of 0s | 0001111000 | none |
- Assignment Project Exam Help**  
<https://cstutorcs.com>  
WeChat: cstutorcs
- Originally for speed (when shift was faster than add)
  - Replace a string of 1s in multiplier with an initial subtract when we first see a one and then later add for the bit after the last one
  - Handles two's complement!

# Booth's Example (2 x 7)

| Operation        | Multiplicand  | Product    | next?              |
|------------------|---------------|------------|--------------------|
| 0. initial value | 0010          | 0000 01110 | 10 -> sub          |
| 1a. $P = P - m$  | 0010 + 1110-> | 1110 01110 | shift P (sign ext) |
| 1b.              | 0010          | 1111 00111 | 11 -> nop, shift   |
| 2.               | 0010          | 1111 10011 | 11 -> nop, shift   |
| 3.               | 0010          | 1111 11001 | 01 -> add          |
| 4a.              | 0010 + 0010-> | 0001 11001 | shift              |
| 4b.              | 0010          | 0000 11100 | done               |

Blue + red = multiplier (red is 2 bits to compare); Green = arithmetic ops; Black = product

# Booth's Example (2 x -3)

| Operation        | Multiplicand   | Product                                                                               | next?              |
|------------------|----------------|---------------------------------------------------------------------------------------|--------------------|
| 0. initial value | 0010           | 0000 1101 0                                                                           | 10 -> sub          |
| 1a. $P = P - m$  | 1110 + 1110    | 1110 1101 0                                                                           | shift P (sign ext) |
| 1b.              | 0010<br>+ 0010 | 1111 0110 1                                                                           | 01 -> add          |
| 2a.              |                | Assignment Project Exam Help<br><a href="https://tutorcs.com">https://tutorcs.com</a> | shift P            |
| 2b.              | 0010<br>+ 1110 | WeChat: cstutorcs<br>0000 1011 0                                                      | 10 -> sub          |
| 3a.              | 0010           | 1110 1011 0                                                                           | shift              |
| 3b.              | 0010           | 1111 0101 1                                                                           | 11 -> nop          |
| 4a.              |                | 1111 0101 1                                                                           | shift              |
| 4b.              | 0010           | 1111 1010 1                                                                           | done               |

Blue + red = multiplier (red is 2 bits to compare); Green = arithmetic ops; Black = product

# Radix-4 Modified Booth's Algorithm

| Current Bits                                          | Bit to the Right | Explanation                                            | Example               | Recode |
|-------------------------------------------------------|------------------|--------------------------------------------------------|-----------------------|--------|
| 0 0                                                   | 0                | Middle of zeros                                        | 00 00 00 <u>00</u> 00 | 0      |
| 0 1                                                   | 0                | Single one                                             | 00 00 00 <u>01</u> 00 | 1      |
| 1 0                                                   | 0                | Begins run of 1s                                       | 00 01 11 <u>10</u> 00 | -2     |
| 1 1                                                   | 0                | Assignment Project Exam Help<br>Begins run of 1s       |                       |        |
|                                                       |                  |                                                        | 00 01 11 <u>11</u> 00 | -1     |
| <a href="https://tutorcs.com">https://tutorcs.com</a> |                  |                                                        |                       |        |
| 0 0                                                   | 1                | We <del>Ends run of 1s</del> <del>ends run of 1s</del> | 00 <u>00</u> 11 11 00 | 1      |
| 0 1                                                   | 1                | Ends run of 1s                                         | 00 <u>01</u> 11 11 00 | 2      |
| 1 0                                                   | 1                | Isolated 0                                             | 00 11 <u>10</u> 11 00 | -1     |
| 1 1                                                   | 1                | Middle of run                                          | 00 11 <u>11</u> 11 00 | 0      |

Same insight as one-bit Booth's, simply adjust for alignment of 2 bits.

Allows multiplication 2 bits at a time.

# RISC-V Multiplication Support

- Four multiply instructions:
  - **mul: multiply**
    - Gives the lower 64 bits of the product
  - **mulh: multiply high**
    - Gives the upper 64 bits of the product, assuming the operands are signed
  - **mulhu: multiply high unsigned**
    - Gives the upper 64 bits of the product, assuming the operands are unsigned
  - **mulhsu: multiply high signed/unsigned**
    - Gives the upper 64 bits of the product, assuming one operand is signed and the other unsigned
  - Use mulh result to check for 64-bit overflow

<https://tutorcs.com>

WeChat: cstutorcs

# RISC-V Support for multiply

- “If both the high and low bits of the same product are required, then the recommended code sequence is: MULH[[S]U] rdh , rs1 , rs2; MUL rdl , rs1 , rs2 (source register specifiers must be in same order and rdh cannot be the same as rs1 or rs2). Microarchitectures can then fuse these into a single multiply operation instead of performing two separate multiplies.”

WeChat: cstutorcs

# Multiplication Summary

- Iterative algorithm
- Design techniques:
  - Analyze hardware—what's not in use?
  - Spend more hardware to get higher performance  
Assignment Project Exam Help
  - Booth's Algorithm—more general (2's complement)  
<https://tutorcs.com>
  - Booth's Algorithm—~~WeChat: cstu~~ coding is powerful technique to think about problem in a different way
  - Booth's Algorithm—more bits at once gives higher performance

# RISC-V Support for divide

- 4 instructions:
  - {div, divu, rem, remu} rd, rs1, rs2
  - div: rs1 / rs2, treat as signed
  - divu: rs1 / rs2, treat as unsigned
  - rem: rs1 mod rs2, treat as signed  
Assignment Project Exam Help
  - remu: rs1 mod rs2, treat as unsigned  
<https://tutorcs.com>
- “If both the quotient and remainder are required from the same division, the recommended code sequence is: DIV[U] rdq, rs1, rs2; REM[U] rdr, rs1, rs2 (rdq cannot be the same as rs1 or rs2). Microarchitectures can then fuse these into a single divide operation instead of performing two separate divides.”  
WeChat: cstutorcs
- Overflow and division-by-zero don’t produce errors
  - Just return defined results
  - Faster for the common case of no error

# MIPS Support for multiply/divide

- Rather than target the general-purpose registers:
  - mul placed its output into two special hi and lo registers
  - div placed its divide output into lo and its rem output into hi
  - MIPS provided mflo and mfhi instructions (destination: general-purpose register)  
<https://tutorcs.com>  
WeChat: cstutorcs

Assignment Project Exam Help

# Divide: Paper & Pencil

Divisor      1000      
$$\begin{array}{r} 1001 \text{ Quotient} \\ \hline 1001010 \text{ Dividend} \\ -1000 \\ \hline 10 \\ 101 \\ 1010 \\ \hline \end{array}$$

Assignment Project Exam Help

10      Remainder  
<https://tutorcs.com> (or Modulo result)

WeChat: cstutorcs

- See how big a number can be subtracted, creating quotient bit on each step
  - Binary  $\Rightarrow 1 * \text{divisor or } 0 * \text{divisor}$
- $\text{Dividend} = \text{Quotient} \times \text{Divisor} + \text{Remainder}$
- 3 versions of divide, successive refinement

# Divide Hardware Version 1

- 64-bit Divisor reg, 64-bit ALU, 64-bit Remainder reg, 32-bit Quotient reg



# Divide Algorithm V1

- Takes  $n+1$  steps for  $n$ -bit Quotient & Rem.
- Remainder    Quotient              Divisor

0000 0111    0000

0010 0000

Remainder  $\geq 0$

2a. Shift the Quotient register to the left setting the new rightmost bit to 1.

2b. Restore the original value by adding the Divisor register to the Remainder register, & place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0.  
"restoring" division

3. Shift the Divisor register right1 bit.

n+1 repetition?

No:  $< n+1$  repetitions

Yes:  $n+1$  repetitions ( $n = 4$  here)

Done

Start: Place Dividend in Remainder



1. Subtract the Divisor register from the Remainder register, and place the result in the Remainder register.

Test  
Remainder

Remainder  $< 0$



# Divide Algorithm I example (7 / 2)

|         | Remainder | Quotient | Divisor                      |
|---------|-----------|----------|------------------------------|
| 0000    | 0111      | 00000    | 0010 0000                    |
| 1: 1110 | 0111      | 00000    | 0010 0000                    |
| 2: 0000 | 0111      | 00000    | 0010 0000                    |
| 3: 0000 | 0111      | 00000    | 0001 0000                    |
| 1: 1111 | 0111      | 00000    | 0001 0000                    |
| 2: 0000 | 0111      | 00000    | Assignment Project Exam Help |
| 3: 0000 | 0111      | 00000    | 0000 1000                    |
| 1: 1111 | 1111      | 00000    | https://tutorcs.com          |
| 2: 0000 | 0111      | 00000    | WeChat csutorcs              |
| 3: 0000 | 0111      | 00000    | 0000 0100                    |
| 1: 0000 | 0011      | 00000    | 0000 0100                    |
| 2: 0000 | 0011      | 00001    | 0000 0100                    |
| 3: 0000 | 0011      | 00001    | 0000 0010                    |
| 1: 0000 | 0001      | 00001    | 0000 0010                    |
| 2: 0000 | 0001      | 00011    | 0000 0010                    |
| 3: 0000 | 0001      | 00011    | 0000 0001                    |

Answer:  
Quotient = 3  
Remainder = 1

# Divide: Paper & Pencil

Divisor 0001    
$$\begin{array}{r} 01010 \\ \hline 00001010 \\ 00001 \\ \hline -0001 \\ 0000 \\ 0001 \\ \hline -0001 \\ 0 \end{array}$$

Quotient  
Dividend  
Assignment Project Exam Help  
<https://tutorcs.com>  
00    Remainder  
(or ModWeChat:tutorcs)

- No way to get a 1 in leading digit!
  - (this is an overflow, i.e quotient would have n+1 bits)
  - $\Rightarrow$  switch order to shift first and then subtract, can save 1 iteration

# Observations on Divide Version 1

- **1/2 bits in divisor always 0**  
=> **1/2 of 64-bit adder is wasted**  
=> **1/2 of divisor is wasted**
- **Instead of shifting divisor to right, shift remainder to left?**

Assignment Project Exam Help

<https://tutorcs.com>

WeChat: cstutorcs

# Divide Algorithm I example: wasted space

| Remainder   | Quotient | Divisor                                                            |
|-------------|----------|--------------------------------------------------------------------|
| 0000 0111   | 00000    | 0010 0000                                                          |
| 1:1110 0111 | 00000    | 0010 0000                                                          |
| 2:0000 0111 | 00000    | 0010 0000                                                          |
| 3:0000 0111 | 00000    | 0001 0000                                                          |
| 1:1111 0111 | 00000    | 0001 0000                                                          |
| 2:0000 0111 | 00000    | Assignment Project Exam Help                                       |
| 3:0000 0111 | 00000    | 0000 1000<br><a href="https://tutorcs.com">https://tutorcs.com</a> |
| 1:1111 1111 | 00000    | 0000 1000                                                          |
| 2:0000 0111 | 00000    | WeChat: cstutorcs                                                  |
| 3:0000 0111 | 00000    | 0000 0100                                                          |
| 1:0000 0011 | 00000    | 0000 0100                                                          |
| 2:0000 0011 | 00001    | 0000 0100                                                          |
| 3:0000 0011 | 00001    | 0000 0010                                                          |
| 1:0000 0001 | 00001    | 0000 0010                                                          |
| 2:0000 0001 | 00011    | 0000 0010                                                          |
| 3:0000 0001 | 00011    | 0000 0010                                                          |

# Divide Hardware Version 2

- 32-bit Divisor reg, 32-bit ALU, 64-bit Remainder reg, 32-bit Quotient reg



# Divide Algorithm V2

| Remainder | Quotient | Divisor   |
|-----------|----------|-----------|
| 0000      | 0111     | 0000 0010 |

**Start: Place Dividend in Remainder**

**1. Shift the Remainder register left 1 bit.**

**2. Subtract the Divisor register from the left half of the Remainder register, & place the result in the left half of the Remainder register.**

Remainder  $\geq 0$

Remainder  $< 0$

Assignment Project Exam Help

<https://tutorecs.com>

WeChat: cstutorecs  
**3b. Restore the original value by adding the Divisor register to the left half of the Remainder register, & place the sum in the left half of the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 1.**

**3a. Shift the Quotient register to the left setting the new rightmost bit to 1.**

No: n repetitions

Yes: n repetitions (n = 4 here)

Done

# Observations on Divide Version 2

- Eliminate Quotient register by combining with Remainder as shifted left
  - Start by shifting the Remainder left as before.
  - Thereafter loop contains only two steps because the shifting of the Remainder register shifts both the remainder in the left half and the quotient in the right half  
Assignment Project Exam Help  
<https://tutorcs.com>  
WeChat: cstutorcs
  - The consequence of combining the two registers together and the new order of the operations in the loop is that the remainder will shifted left one time too many.
  - Thus the final correction step must shift back only the remainder in the left half of the register

# Divide Hardware Version 3

- 32-bit Divisor reg, 32-bit ALU, 64-bit Remainder reg, (0-bit Quotient reg)



# Divide Algorithm V3

Remainder      Divisor  
0000 0111      0010

Start: Place Dividend in Remainder

1. Shift the Remainder register left 1 bit.

2. Subtract the Divisor register from the left half of the Remainder register, & place the result in the left half of the Remainder register.

Remainder  $\geq 0$

Remainder  $< 0$

Assignment Project Exam Help

<https://tutores.com>  
WeChat: cstutorcs

3a. Shift the Remainder register to the left setting the new rightmost bit to 1.

3b. Restore the original value by adding the Divisor register to the left half of the Remainder register, & place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new least significant bit to 0.

nth repetition?

No:  $< n$  repetitions

Yes:  $n$  repetitions ( $n = 4$  here)

Done. Shift left half of Remainder right 1 bit.

# Final Multiply / Divide Hardware



# Observations on Divide Version 3

- Same Hardware as Multiply: just need ALU to add or subtract, and 64-bit register to shift left or shift right
- Hi and Lo registers in MIPS combine to act as 64-bit register for multiply and divide
- Signed Divides: Simplest is to remember signs, make positive, and complement quotient and remainder if necessary  
Assignment Project Exam Help  
<https://tutorcs.com>
  - Note: Dividend and Remainder must have same sign  
WeChat: cstutorcs
  - Note: Quotient negated if Divisor sign & Dividend sign disagree
    - e.g.,  $-7 \div 2 = -3$ , remainder =  $-1$
    - What about?  $-7 \div 2 = -4$ , remainder =  $+1$
    - See <http://mathforum.org/library/drmath/view/52343.html>
- Possible for quotient to be too large: if divide 64-bit integer by 1, quotient is 64 bits (called “saturation”)

# SRT Division

D. Sweeney of IBM, J.E. Robertson of the University of Illinois,  
and T.D. Tocher of Imperial College, London

Current  
Remainder

P-D (Partial  
Divisor) Plot



# SRT Division

- Intel Pentium divide implementation: SRT division with 2 bits/iteration (radix 4)
- Allows negative entries
- 1066 entries in lookup table

Assignment Project Exam Help

<https://tutorcs.com>

WeChat: cstutorcs

[<http://members.cox.net/srice1/pentbug/introduction.html>]



# Faster Division

- Can't use parallel hardware as in multiplier
  - Subtraction is conditional on sign of remainder
- Faster dividers (e.g. SRT devision) generate multiple quotient bits per step

Assignment Project Exam Help

- Still require multiple steps

<https://tutorcs.com>

WeChat: cstutorcs

# Division Lessons

- In practice, slower than multiplication
  - Also less frequent
  - But, in the simple case, can use same hardware!
- Generates quotient and remainder together  
Assignment Project Exam Help
- Floating-point division faster than integer division (why?)  
<https://tutores.com>
- Similar hardware lessons  
We can multiply:
  - Look for unused hardware
  - Can process multiple bits at once at cost of extra hardware

# RISC-V logical instructions

| Instruction     | Meaning                          | Pseudocode                             |
|-----------------|----------------------------------|----------------------------------------|
| XORI rd,rs1,imm | Exclusive Or Immediate           | $rd \leftarrow ux(rs1) \oplus ux(imm)$ |
| ORI rd,rs1,imm  | Or Immediate                     | $rd \leftarrow ux(rs1) \vee ux(imm)$   |
| ANDI rd,rs1,imm | And Immediate                    | $rd \leftarrow ux(rs1) \wedge ux(imm)$ |
| SLLI rd,rs1,imm | Shift Left Logical Immediate     | $rd \leftarrow ux(rs1) \ll ux(imm)$    |
| SRLI rd,rs1,imm | Shift Right Logical Immediate    | $rd \leftarrow ux(rs1) \gg ux(imm)$    |
| SRAI rd,rs1,imm | Shift Right Arithmetic Immediate | $rd \leftarrow sx(rs1) \gg ux(imm)$    |
| SLL rd,rs1,rs2  | Shift Left Logical               | $rd \leftarrow ux(rs1) \ll rs2$        |
| XOR rd,rs1,rs2  | Exclusive Or                     | $rd \leftarrow ux(rs1) \oplus ux(rs2)$ |
| SRL rd,rs1,rs2  | Shift Right Logical              | $rd \leftarrow ux(rs1) \gg rs2$        |
| SRA rd,rs1,rs2  | Shift Right Arithmetic           | $rd \leftarrow sx(rs1) \gg rs2$        |
| OR rd,rs1,rs2   | Or                               | $rd \leftarrow ux(rs1) \vee ux(rs2)$   |
| AND rd,rs1,rs2  | And                              | $rd \leftarrow ux(rs1) \wedge ux(rs2)$ |

Assignment Project Exam Help

<https://cstutorcs.com>

WeChat: cstutorcs

# Shift Operations

## ■ Bit manipulation:

- S EEEEEEEE MMMMMMMMMMMMMMMMMMMMM  
& 0 11111111 0000000000000000000000000000

-----

0 EEEEEEEE 0000000000000000000000000000

- Right shift 23 bits to get

0000000000000000000000000000 EEEEEEEE

- Do arithmetic manipulation

0000000000000000000000000000 ENEWNEW

- Left shift 23 bits to get

0 ENEWNEW 0000000000000000000000000000

# Shift Operations

## ■ Arithmetic operation:

- Example:  $00011 \ll 2$  [3 left shift 2]
  - $00011 \ll 2 = 01100 = 12 = 2 * 4$
- Each bit shifted left == multiply by two  
Assignment Project Exam Help  
<https://tutorcs.com>
- Example:  $01010 \gg 1$  [10 right shift 1]
  - $01010 \gg 1 = 00101 = 5 = 10/2$
- Each bit shifted right == divide by two
- Why?
- Compilers do this—“strength reduction”

# Shift Operations

## ■ With left shift, what do we shift in?

- $00011 << 2 = 01100$  (arithmetic)
- $0000XXXX << 4 = XXXX0000$  (logical)
- We shifted in zeroes

Assignment Project Exam Help

## ■ How about right shift?

<https://tutorcs.com>

WeChat: cstutorcs

- $XXXX0000 >> 4 = 0000XXXX$  (logical)
  - Shifted in zero
- $00110 (= 6) >> 1 = 00011 (3)$  (arithmetic)
  - Shifted in zero
- $11110 (= -2) >> 1 = 11111 (-1)$  (arithmetic)
  - Shifted in one

# Shift Operations

## ■ How about right shift?

- **XXXX0000 >> 4 = 0000XXXX: Logical shift**
  - Shifted in zero
- **00110 (= 6) >> 1 = 00011 (3)**  
**11110 (= -2) >> 1 = 11111 (-1): Arithmetic shift**

- Shifted in sign bit

## ■ RISC-V supports both logical and arithmetic:

<https://tutorcs.com>

- **slli, srai, srli: Shift amount taken from within instruction ("imm")**

| funct6 | shamt  | rs1    | funct3 | rd     | opcode |
|--------|--------|--------|--------|--------|--------|
| 6 bits | 6 bits | 5 bits | 3 bits | 5 bits | 7 bits |

| funct7 | rs2    | rs1    | funct3 | rd     | opcode |
|--------|--------|--------|--------|--------|--------|
| 7 bits | 5 bits | 5 bits | 3 bits | 5 bits | 7 bits |

- **sll, sra, srli: shift amount taken from register ("variable")**
- **How far can we shift with slli/srai/slli? With sll/sra/srli?**

# Combinational Shifter from MUXes

## Basic Building Block



- What comes in the MSBs?
- How many levels for 64-bit shifter?
- What if we use 4-1 Muxes ?

# General Shift Right Scheme using 16 bit example



- If we added right-to-left connections, we could support ROTATE (not in RISC-V but found in other ISAs)

# Funnel Shifter

- Shift A by i bits



- Problem: Set Y, X, sa

- Logical:

- Arithmetic:

- Rotate:

- Left shifts:



# Barrel Shifter

- Technology-dependent solutions: transistor per switch



# Shifter Summary

- Shifts common in logical ops, also in arithmetic
- MIPS has:
  - 2 flavors of shift: logical and arithmetic
  - 2 directions of shift: right and left  
Assignment Project Exam Help
  - 2 sources for shift amount: immediate, variable  
<https://tutorecs.com>
- Lots of cool shift algorithms, but  
WeChat: cstutorcs
  - Barrel shifter prevalent in today's hardware