

# x86 Programming I

CSE 351 Winter 2017



<http://xkcd.com/409/>

# Administrivia

- ❖ Lab 2 released!
  - Da bomb!



# Roadmap

C:

```
car *c = malloc(sizeof(car));  
c->miles = 100;  
c->gals = 17;  
float mpg = get_mpg(c);  
free(c);
```

Java:

```
Car c = new Car();  
c.setMiles(100);  
c.setGals(17);  
float mpg =  
    c.getMPG();
```

Assembly language:

```
get_mpg:  
    pushq %rbp  
    movq %rsp, %rbp  
    ...  
    popq %rbp  
    ret
```

Machine code:

```
0111010000011000  
1000110100000100000000010  
1000100111000010  
1100000111110100001111
```

Computer system:



Memory & data  
Integers & floats  
Machine code & C  
**x86 assembly**  
Procedures & stacks  
Arrays & structs  
Memory & caches  
Processes  
Virtual memory  
Memory allocation  
Java vs. C

OS:



# x86 Topics for Today

- ❖ Registers
- ❖ Move instructions and operands
- ❖ Arithmetic operations
- ❖ Memory addressing modes
- ❖ swap example

# What is a Register?

- ❖ A location in the CPU that stores a small amount of data, which can be accessed very quickly (once every clock cycle)
- ❖ Registers have *names*, not *addresses*
  - In assembly, they start with % (e.g., %rsi)
- ❖ Registers are at the heart of assembly programming
  - They are a precious commodity in all architectures, but *especially* x86

# x86-64 Integer Registers – 64 bits wide

|      |      |
|------|------|
| %rax | %eax |
| %rbx | %ebx |
| %rcx | %ecx |
| %rdx | %edx |
| %rsi | %esi |
| %rdi | %edi |
| %rsp | %esp |
| %rbp | %ebp |

|      |       |
|------|-------|
| %r8  | %r8d  |
| %r9  | %r9d  |
| %r10 | %r10d |
| %r11 | %r11d |
| %r12 | %r12d |
| %r13 | %r13d |
| %r14 | %r14d |
| %r15 | %r15d |

- Can reference low-order 4 bytes (also low-order 2 & 1 bytes)

# Some History: IA32 Registers – 32 bits wide



# x86-64 Assembly Data Types

- ❖ “Integer” data of 1, 2, 4, or 8 bytes
  - Data values
  - Addresses (untyped pointers)
- ❖ Floating point data of 4, 8, 10 or 2x8 or 4x4 or 8x2
  - Different registers for those (e.g. %xmm1, %ymm2)
  - Come from *extensions to x86* (SSE, AVX, ...)
  - Probably won’t have time to get into these ☹
- ❖ No aggregate types such as arrays or structures
  - Just contiguously allocated bytes in memory
- ❖ Two common syntaxes
  - “AT&T”: used by our course, slides, textbook, gnu tools, ...
  - “Intel”: used by Intel documentation, Intel tools, ...
  - Must know which you’re reading

# Three Basic Kinds of Instructions

## 1) Transfer data between memory and register

- *Load* data from memory into register
  - $\%reg = \text{Mem}[address]$
- *Store* register data into memory
  - $\text{Mem}[address] = \%reg$

**Remember:** Memory is indexed just like an array of bytes!

## 2) Perform arithmetic operation on register or memory data

- $c = a + b;$        $z = x \ll y;$        $i = h \& g;$

## 3) Control flow: what instruction to execute next

- Unconditional jumps to/from procedures
- Conditional branches

# Operand types

- ❖ **Immediate:** Constant integer data

- Examples: `$0x400`, `$-533`
- Like C literal, but prefixed with `'$'`
- Encoded with 1, 2, 4, or 8 bytes  
*depending on the instruction*

- ❖ **Register:** 1 of 16 integer registers

- Examples: `%rax`, `%r13`
- But `%rsp` reserved for special use
- Others have special uses for particular instructions

- ❖ **Memory:** Consecutive bytes of memory at a computed address

- Simplest example: (`%rax`)
- Various other “address modes”

`%rax`

`%rcx`

`%rdx`

`%rbx`

`%rsi`

`%rdi`

`%rsp`

`%rbp`

`%rN`

# Moving Data

- ❖ General form: `mov_ source, destination`
  - Missing letter (\_) specifies size of operands
  - Note that due to backwards-compatible support for 8086 programs (16-bit machines!), “word” means 16 bits = 2 bytes in x86 instruction names
  - Lots of these in typical code
  
- ❖ `movb src, dst`
  - Move 1-byte “byte”
- ❖ `movw src, dst`
  - Move 2-byte “word”
- ❖ `movl src, dst`
  - Move 4-byte “long word”
- ❖ `movq src, dst`
  - Move 8-byte “quad word”

# movq Operand Combinations

|      | Source | Dest | Src, Dest           | C Analog       |
|------|--------|------|---------------------|----------------|
| movq | Imm    | Reg  | movq \$0x4, %rax    | var_a = 0x4;   |
|      |        | Mem  | movq \$-147, (%rax) | *p_a = -147;   |
|      | Reg    | Reg  | movq %rax, %rdx     | var_d = var_a; |
|      | Mem    | Mem  | movq %rax, (%rdx)   | *p_d = var_a;  |
|      |        | Reg  | movq (%rax), %rdx   | var_d = *p_a;  |

- ❖ *Cannot do memory-memory transfer with a single instruction*
  - How would you do it?

# Memory vs. Registers

## ❖ Addresses

- 0x7FFFD024C3DC

## ❖ Big

- ~ 8 GiB

## ❖ Slow

- ~50-100 ns

## ❖ Dynamic

- Can “grow” as needed while program runs

## vs. Names

%rdi

## vs. Small

(16 x 8 B) = 128 B

## vs. Fast

sub-nanosecond timescale

## vs. Static

fixed number in hardware

# Some Arithmetic Operations

## ❖ Binary (two-operand) Instructions:

- Maximum of one memory operand
- Beware argument order!
- No distinction between signed and unsigned
  - Only arithmetic vs. logical shifts
- How do you implement

“r3 = r1 + r2”?

| Format                       | Computation        |                        |
|------------------------------|--------------------|------------------------|
| <b>addq</b> <i>src, dst</i>  | $dst = dst + src$  | ( $dst += src$ )       |
| <b>subq</b> <i>src, dst</i>  | $dst = dst - src$  |                        |
| <b>imulq</b> <i>src, dst</i> | $dst = dst * src$  | signed mult            |
| <b>sarq</b> <i>src, dst</i>  | $dst = dst >> src$ | Arithmetic             |
| <b>shrq</b> <i>src, dst</i>  | $dst = dst >> src$ | Logical                |
| <b>shlq</b> <i>src, dst</i>  | $dst = dst << src$ | (same as <b>salq</b> ) |
| <b>xorq</b> <i>src, dst</i>  | $dst = dst ^ src$  |                        |
| <b>andq</b> <i>src, dst</i>  | $dst = dst \& src$ |                        |
| <b>orq</b> <i>src, dst</i>   | $dst = dst   src$  |                        |

↑ operand size specifier

# Some Arithmetic Operations

- ❖ Unary (one-operand) Instructions:

| Format                 | Computation      |                    |
|------------------------|------------------|--------------------|
| <b>incq</b> <i>dst</i> | $dst = dst + 1$  | increment          |
| <b>decq</b> <i>dst</i> | $dst = dst - 1$  | decrement          |
| <b>negq</b> <i>dst</i> | $dst = -dst$     | negate             |
| <b>notq</b> <i>dst</i> | $dst = \sim dst$ | bitwise complement |

- ❖ See CSPP Section 3.5.5 for more instructions:  
`mulq`, `cqto`, `idivq`, `divq`

# Arithmetic Example

```
long simple_arith(long x, long y)
{
    long t1 = x + y;
    long t2 = t1 * 3;
    return t2;
}
```

| Register | Use(s)                       |
|----------|------------------------------|
| %rdi     | 1 <sup>st</sup> argument (x) |
| %rsi     | 2 <sup>nd</sup> argument (y) |
| %rax     | return value                 |

```
y += x;
y *= 3;
long r = y;
return r;
```

```
simple_arith:
    addq    %rdi, %rsi
    imulq   $3, %rsi
    movq    %rsi, %rax
    ret
```

# Example of Basic Addressing Modes

```
void swap(long *xp, long *yp)
{
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}
```

```
swap:
    movq (%rdi), %rax
    movq (%rsi), %rdx
    movq %rdx, (%rdi)
    movq %rax, (%rsi)
    ret
```

# Understanding swap()

```
void swap(long *xp, long *yp)
{
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}
```

## Registers



## Memory



swap:

```
    movq (%rdi), %rax
    movq (%rsi), %rdx
    movq %rdx, (%rdi)
    movq %rax, (%rsi)
    ret
```

## Register

## Variable

|      |   |    |
|------|---|----|
| %rdi | ↔ | xp |
| %rsi | ↔ | yp |
| %rax | ↔ | t0 |
| %rdx | ↔ | t1 |

# Understanding swap()

**Registers**

|      |       |
|------|-------|
| %rdi | 0x120 |
| %rsi | 0x100 |
| %rax |       |
| %rdx |       |

**Memory****Word Address**

|       |     |
|-------|-----|
| 0x120 | 123 |
| 0x118 |     |
| 0x110 |     |
| 0x108 |     |
| 0x100 | 456 |

**swap:**

```
    movq (%rdi), %rax    # t0 = *xp
    movq (%rsi), %rdx    # t1 = *yp
    movq %rdx, (%rdi)    # *xp = t1
    movq %rax, (%rsi)    # *yp = t0
    ret
```

# Understanding swap()



swap:

```
    movq    (%rdi), %rax    # t0 = *xp
    movq    (%rsi), %rdx    # t1 = *yp
    movq    %rdx, (%rdi)    # *xp = t1
    movq    %rax, (%rsi)    # *yp = t0
    ret
```

# Understanding swap()



```
swap:
```

```
    movq (%rdi), %rax    # t0 = *xp
    movq (%rsi), %rdx    # t1 = *yp
    movq %rdx, (%rdi)    # *xp = t1
    movq %rax, (%rsi)    # *yp = t0
    ret
```

# Understanding swap()



```
swap:  
    movq (%rdi), %rax    # t0 = *xp  
    movq (%rsi), %rdx    # t1 = *yp  
    movq %rdx, (%rdi)    # *xp = t1  
    movq %rax, (%rsi)    # *yp = t0  
    ret
```

# Understanding swap()



```
swap:
```

```
    movq (%rdi), %rax    # t0 = *xp
    movq (%rsi), %rdx    # t1 = *yp
    movq %rdx, (%rdi)    # *xp = t1
    movq %rax, (%rsi)    # *yp = t0
    ret
```

# Memory Addressing Modes: Basic

❖ **Indirect:** (R) Mem[Reg[R]]

- Data in register R specifies the memory address
- Like pointer dereference in C
- Example: movq (%rcx), %rax

❖ **Displacement:** D (R) Mem[Reg[R]+D]

- Data in register R specifies the *start* of some memory region
- Constant displacement D specifies the offset from that address
- Example: movq 8(%rbp), %rdx

# Complete Memory Addressing Modes

## ❖ General:

- $D(Rb, Ri, S) \quad \text{Mem}[\text{Reg}[Rb] + \text{Reg}[Ri] * S + D]$ 
  - $Rb$ : Base register (any register)
  - $Ri$ : Index register (any register except  $\%rsp$ )
  - $S$ : Scale factor (1, 2, 4, 8) – *why these numbers?*
  - $D$ : Constant displacement value (a.k.a. immediate)

## ❖ Special cases (see CSPP Figure 3.3 on p.181)

- $D(Rb, Ri) \quad \text{Mem}[\text{Reg}[Rb] + \text{Reg}[Ri] + D] \quad (S=1)$
- $(Rb, Ri, S) \quad \text{Mem}[\text{Reg}[Rb] + \text{Reg}[Ri] * S] \quad (D=0)$
- $(Rb, Ri) \quad \text{Mem}[\text{Reg}[Rb] + \text{Reg}[Ri]] \quad (S=1, D=0)$
- $(, Ri, S) \quad \text{Mem}[\text{Reg}[Ri] * S] \quad (Rb=0, D=0)$

# Address Computation Examples

|      |        |
|------|--------|
| %rdx | 0xf000 |
| %rcx | 0x0100 |

$D(Rb, Ri, S) \rightarrow$   
 $\text{Mem}[\text{Reg}[Rb] + \text{Reg}[Ri] * S + D]$

| Expression      | Address Computation | Address |
|-----------------|---------------------|---------|
| 0x8(%rdx)       |                     |         |
| (%rdx, %rcx)    |                     |         |
| (%rdx, %rcx, 4) |                     |         |
| 0x80(, %rdx, 2) |                     |         |

# Address Computation Examples

|      |        |
|------|--------|
| %rdx | 0xf000 |
| %rcx | 0x0100 |

$D(Rb, Ri, S) \rightarrow$   
 $\text{Mem}[\text{Reg}[Rb] + \text{Reg}[Ri] * S + D]$

| Expression      | Address Computation  | Address |
|-----------------|----------------------|---------|
| 0x8(%rdx)       | $0xf000 + 0x8$       | 0xf008  |
| (%rdx, %rcx)    | $0xf000 + 0x100$     | 0xf100  |
| (%rdx, %rcx, 4) | $0xf000 + 0x100 * 4$ | 0xf400  |
| 0x80(,%rdx,2)   | $0xf000 * 2 + 0x80$  | 0x1e080 |

# Peer Instruction Question

- ❖ Which of the following statements is TRUE?
  - (A) The program counter (%rip) is a register that we manually manipulate
  - (B) There is only one way to compile a C program into assembly
  - (C) Mem to Mem (src to dst) is the only disallowed operand combination
  - (D) We can compute an address without using any registers

# Summary

- ❖ **Registers** are named locations in the CPU for holding and manipulating data
  - x86-64 uses 16 64-bit wide registers
- ❖ Assembly instructions have rigid form
  - Operands include immediates, registers, and data at specified memory locations
  - Many instruction variants based on size of data
- ❖ **Memory Addressing Modes:** The addresses used for accessing memory in `mov` (and other) instructions can be computed in several different ways
  - *Base register, index register, scale factor, and displacement* map well to pointer arithmetic operations