

---

---

---

---

---



# assembling language.

## (I) History of Intel processors and architectures

(II) C, assembly, machine code

(III) Assembly Basics: Registers, operands, move

(IV) : Arithmetic & logical operations,

### Intel x86 processors

(I) Dominate laptop/desktop/server mkt

(II) Evolutionary design:

(i) Backwards compatible up until 8086, introduced in 1978

(ii) Added more features as time goes on.

(III) Complex instruction set computer (CISC)

(i) Many different instructions with many different formats

But, only small subset encountered with Linux programs.

(ii) Hard to match performance of Reduced Instruction Set  
computers (RISC) fight 1990s.

(iii) But, Intel has just done that.

In terms of speed. Less so for low power.

1985 IA32 First 32-bit Intel processor.

{ frequency  
multi-processor on a single chip  
↳ multicore  
haven't machine (2018)

cache: a temporary made used to hold the most recently needed data, so you can access it more quickly.

## Definitions:

(I) **Architecture** (also ISA: instruction set architecture) The parts of a processor that one needs to understand or write assembly/machine code.

e.g. Instruction set specification, registers.

(II) **Microarchitecture**: implementation of the architecture.

e.g. cache sizes and core frequency.

(III) **Code forms**:

(i) **Machine code**: the byte-level programs that a processor executes

(ii) **Assembly code**: A text representation of machine code.

## Example ISAs:

(i) Intel: x86, IA32, Titanium, x86-64.

(ii) ARM: Used in almost all cell phones.

# Assembly / Machine Code View



Programmer-visible state:

(I) PC: Program counter

(i) Address of next instruction

(ii) called "RIP" (x86-64)

logically, an array of bytes

(II) Memory:

(i) byte addressable array

(ii) code and user data

(III) Register file:

(i) Heavily used program data.  
(give name to)

(iii) stack to support procedures

virtual memory :-

seems that each project  
independent memory, ...  
but physically, share the  
same memory ...

(IV) Condition codes:

Some status information about most  
recent arithmetic or logical operation.

Data for conditional branching.

## Turning C into Object Code.

(I) Code in files p1.c p2.c

(II) (Compile with command):  $gcc -Og \text{ p1.c p2.c -o p}$

(i) Use basic optimization ( $-Og$ ) [New to recent versions of GCC]

(ii) Put resulting binary in file p.



merge together all the different files for both your individual file, and lib. codes.

## Assembly Characteristics: Data Types

(I) "Integer" data of 1, 2, 4 or 8 bytes.

(i) Data values stored as numbers in computer.

(ii) Addresses (untyped pointers)

(II) Floating point data of 4, 8 or 10 bytes

(III) Code: Byte sequences encoding series of instructions

(IV): No aggregate types such as arrays or structures

(i) Just contiguously allocated bytes in memory.

jeq assembly-code. fxf

and sum.s.

## Assembly Characteristics: Operations.

(I) Perform arithmetic function on register or memory data

(II) Transfer data between memory into register.

(i) Load data from memory into register

(ii) Store register data into memory

(III) Transfer control

(i) Unconditional jumps to/from procedures

(ii) Conditional branches. (e.g. do which if switch

## Object Code:

Code for sumstore:

0x0400595:

0x53

0x48

0x89

0xd3

0xe3

0xf2

0xff *(\*) Total of 14 bytes.*

0xff *(\*) Each instruction:*

0x48 *1, 3 or 5 bytes.*

0x89 *(\*) starts at address*

0x03 *0x0400595*

0x56

0xc3

## (I) Assembler

(i) Translates .s into .o

(ii) Binary encoding of each instruction.

(iii) Nearly-complete image of executable code.

(iv). Missing linkages between code in different files

## (II) Linker:

(i) Resolves references between files

(ii) Combines with static run-time libraries

E.g. code for malloc, printf.

(iii) Some libraries are dynamically linked

(i) Linking occurs when program being execution

## Machine Instruction Example.

\* dest = t;

← (I) C Code:

(i) store value t where designated by dest.

move %rax, (%rbx)

← (II) Assembly:

(i) Move 8-byte value to memory

(ii) Quad words in x86-64 parlance.

0x40059e: 48 89 03

(iii) Operands

t: register %rax

dest: register %rbx

\* dest: memory m[%rbx]

(IV) Object code

3-byte instruction

stored at address 0x40059e

# Disassembling Object code

Disassembled:

(I) Disassembler:

objdump -d sum

- (i) Useful tool for examining object code
- (ii) Analyzes bit pattern of series of instructions.
- (iii) produces approximate rendition of assembly code
- (iv). can be run on either a.out (complete executable) or .o file.

See disassembler.txt



& sum.d.

What can be Disassembled?

- (I) Anything that can be interpreted as executable code.
- (II) Disassembler examines bytes and reconstructs assembly source

# X86-64 Integer Registers

| 64 bits | 32 bits | 64 bits |
|---------|---------|---------|
| %rax    | %eax    | %r8     |
| %rbx    | %ebx    | %r9     |
| %rcx    | %ecx    | %r10    |
| %rdx    | %edx    | %r11    |
| %rsi    | %edi    | %r12    |
| %rdi    | %edi    | %r13    |
| %rsp    | %esp    | %r14    |
| %rdp    | %edp    | %r15    |

Stack pointers

(an reference low-order 4 bytes (also low-order 1&2 bytes)

## (I) Moving Data:

`movq source, dest`

### (II) Operand Types:

(i) Immediate: constant integer data.

`%rax`

(ii) Example : `$0x400`      `$-533`.

`%rcx`

(iii) Like C constants, but prefixed with '\$'.

`%rdx`

(iv) Register: One of 16 integer registers.

`%rsp`

(v) Example: `%rax`, `%rb3`

`%rbp`

(vi) But `%rsp` reserved for special use.

`%rN`

(vii) Others have special uses for particular instructions.

(viii) Memory: 8 consecutive bytes of memory at address given by register.

(i) Simplest example (`%rax`)

(ii) Various other "address modes".

`movq`    Operand Combinations:



Cannot do memory transfer with a single instruction.

## Simple Memory Addressing Modes.

(I) Normal (R)  $\text{Mem}[\text{Reg}(R)]$

- Register R specifies memory address
- Aha! Pointer dereferencing in C.

$\text{movq}(\%r(x), \%rax)$

dereferencing a ptr. and put it into a temporary.

(II) Displacement D(R)  $\text{Mem}[\text{Reg}(R) + D]$

- Register R specifies start of memory region
- Constant displacement D specifies offset

$\text{movq} 8(\%rbp), \%rdx$

displacement D

add 8 to get an address slightly offset by some fixed amount.

e.g.

void swap

(long \*xp, long \*yp)

$\Rightarrow$

long t0 = \*xp // read memory into register

long t1 = \*yp

\*xp = t1 // ..

\*yp = t0 // copy back  
but reverse ..

swap:

$\text{movq}(\%rdi), \%rax \#t0=xp$

$\text{movq}(\%rsi), \%rdx \#t1=yp$

$\text{movq}(\%rdx), (\%rdi) \#xp=t1$

$\text{movq}(\%rax), (\%rsi) \#yp=t0$

ret



## Complete Memory Addressing Modes

(I) Most General Form:

$D(R_b, R_i, S)$

$\text{Mem}[\text{Reg}[R_b] + S \times \text{Reg}[R_i] + D]$

- (i)  $D$ : Constant "displacement" 1, 2, or 4 bytes
- (ii)  $R_b$ : Base register: Any of 16 integer registers
- (iii)  $R_i$ : Index register: Any, except for %rsp
- (iv)  $S$ : Scale: 1, 2, 4 or 8 (Why these numbers?)

A natural way to implement array.

int type: 4

long type: 8

(II) Special cases:

$(R_b, R_i)$

$\text{Mem}[\text{Reg}[R_b] + \text{Reg}[R_i]]$

$D(R_b, R_i)$

$\text{Mem}[\text{Reg}[R_b] + \text{Reg}[R_i] + D]$

$(R_b, R_i, S)$

$\text{Mem}[\text{Reg}[R_b] + S \times \text{Reg}[R_i]]$

# Address Computation Examples

$\%rdx \quad \text{0x}f000$

$\%rcx \quad \text{0x}1000$

| Expression          | Address Computation  | Address                        |
|---------------------|----------------------|--------------------------------|
| $0x8(\%rdx)$        | $0xf000 + 0x8$       | <del><math>0xf008</math></del> |
| $(\%rdx, \%rcx)$    | $0xf000 + 0x100$     | <del><math>0xf100</math></del> |
| $(\%rdx, \%rcx, 4)$ | $0xf000 + 4 * 0x100$ | <del><math>0xf400</math></del> |
| $0x80(\%rdx, 2)$    | $2 * 0xf000 + 0x80$  | $0xe080$                       |
|                     |                      |                                |

## Address Computation Instruction

(i) leaq Src, Dst

Src is address mode expression.

Set Dst to address denoted by expression.

(ii) Use

(i) Computing address without a memory reference.

E.g., translation of  $p = \&x[i]$

(ii) Computing arithmetic expressions of the form  $x + kx$

$k = 1, 2, 4$  or  $8$ .

$$\%rax := \%rdi + 2 * \%rdi$$

Example

long m12 (long x)

{ return x \* 12;



leaq (%rdi,%rdi,2), %rax #

salq \$2,%rax

Shift left by 2

`Mov:` } could either (1) refer to a register...  
} (2) or a memory location.

see

some-arithmetic-operations.txt

The code you generated will correctly receive your implementations  
but it might not exactly replicate at a low level/ the exact  
sequence of the operations you specified at a high level-

## Glossary:

quadword: A numerical value of four times the magnitude of a word, thus typically 64 bits.