

# Topic V34

Average Memory Access Time

Reading: (Section 5.4)

# Average Access Time

Hit time is also important for performance

$$\text{Average memory access time (AMAT)} = \frac{\text{Total time to access memory}}{\text{Number of memory accesses}}$$

Units:

$$\frac{ns}{\text{access}}$$

$$\frac{\text{cycles}}{\text{access}}$$
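A closed form follows directly from this definition. It assumes (this is not stated on the slide) that each of the N accesses pays the hit time and that a fraction of them, equal to the miss rate, also pays the miss penalty:

$$\text{AMAT} = \frac{N \times \text{hit time} + N \times \text{miss rate} \times \text{miss penalty}}{N} = \text{hit time} + \text{miss rate} \times \text{miss penalty}$$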

# Average Memory Access Time (example)

Find the AMAT for a processor with a 1 ns clock cycle time,  
a miss penalty of 20 clock cycles,  
a miss rate of 0.05 misses per access,  
and a cache access time (including hit detection) of 1 clock cycle.  
Assume that the read and write miss penalties are the same and ignore  
other write stalls.




$$\text{AMAT} = \frac{1000 \times 1 \text{ cycle} + 50 \times 20 \text{ cycles}}{1000 \text{ accesses}} = 2 \frac{\text{cycles}}{\text{access}} = 2 \text{ ns}$$
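The same calculation can be reproduced with a short script. This is a minimal sketch: the function name `amat_from_counts` and the sample size of 1000 accesses are illustrative choices, not something the slides prescribe.

```python
def amat_from_counts(accesses: int, hit_time: float,
                     miss_rate: float, miss_penalty: float) -> float:
    """Return AMAT in cycles per access by counting total cycles."""
    misses = accesses * miss_rate                      # 0.05 * 1000 = 50 misses
    total_cycles = accesses * hit_time + misses * miss_penalty
    return total_cycles / accesses


CYCLE_TIME_NS = 1.0  # 1 ns clock cycle, taken from the example

amat_cycles = amat_from_counts(accesses=1000, hit_time=1,
                               miss_rate=0.05, miss_penalty=20)
amat_ns = amat_cycles * CYCLE_TIME_NS
print(f"AMAT = {amat_cycles:.1f} cycles/access = {amat_ns:.1f} ns")
```

The result does not depend on the number of accesses chosen, since it cancels between numerator and denominator.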

# Average Memory Access Time (example)

Find the AMAT for a program where 20% of the instructions are load/store running on a processor with a 1 ns clock cycle time, a miss penalty of 100 clock cycles, an I-cache miss rate of 0.03 misses per instruction, and a D-cache miss rate of 0.1 misses per access.

The access time (including hit detection) for both the I-cache and the D-cache is 1 clock cycle.


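For every 1000 instructions there are 1000 instruction fetches and 200 data accesses (1200 accesses in total), with 0.03 × 1000 + 0.1 × 200 = 50 misses. The numerator counts 1 cycle for each of the 1000 instructions plus 100 cycles for each miss:

$$\text{AMAT} = \frac{\# \text{ of clock cycles}}{\# \text{ of accesses}} = \frac{1000 \times 1 \text{ cycle} + 50 \times 100 \text{ cycles}}{1200 \text{ accesses}} = 5 \frac{\text{cycles}}{\text{access}} = 5 \text{ ns}$$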
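The counts in this expression can be derived step by step. This is a sketch under stated assumptions: the variable names and the sample of 1000 instructions are illustrative. The script follows the slide's accounting, which charges one hit cycle per instruction (1000 cycles) rather than one per access (1200 cycles).

```python
instructions = 1000            # illustrative sample size
load_store_fraction = 0.20     # 20% of instructions access data memory
icache_miss_rate = 0.03        # misses per instruction fetch
dcache_miss_rate = 0.10        # misses per data access
miss_penalty = 100             # cycles
cycle_time_ns = 1.0

data_accesses = round(instructions * load_store_fraction)    # 200
accesses = instructions + data_accesses                      # 1200
misses = round(instructions * icache_miss_rate
               + data_accesses * dcache_miss_rate)           # 30 + 20 = 50

# Slide's accounting: 1 cycle per instruction plus the miss penalty per miss.
total_cycles = instructions * 1 + misses * miss_penalty      # 6000
amat_cycles = total_cycles / accesses
amat_ns = amat_cycles * cycle_time_ns
print(f"{accesses} accesses, {misses} misses, "
      f"AMAT = {amat_cycles:.1f} cycles/access = {amat_ns:.1f} ns")
```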



# Average Memory Access Time (example)

Find the AMAT for a program where 20% of the instructions are load/store running on a processor with a 1 ns clock cycle time, a miss penalty of 100 clock cycles, an I-cache miss rate of 0.03 misses per instruction, and a D-cache miss rate of 0.1 misses per access. The access time (including hit detection) for both the I-cache and the D-cache is 1 clock cycle.

What is the new AMAT if an L2 cache is added to this memory hierarchy?

This L2 has an access time of 20 cycles and a local miss rate of 20%.

An access to the memory is only started after the access to the L2 (miss or hit) is completed.


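All 50 L1 misses access the L2 (20 cycles each). Of those, 20% (10 accesses) also miss in the L2 and go on to memory (100 cycles each):

$$\text{AMAT} = \frac{\# \text{ of clock cycles}}{\# \text{ of accesses}} = \frac{1000 \times 1 \text{ cycle} + 50 \times 20 \text{ cycles} + 10 \times 100 \text{ cycles}}{1200 \text{ accesses}} = 2.5 \frac{\text{cycles}}{\text{access}} = 2.5 \text{ ns}$$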
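A minimal sketch of the two-level calculation follows. The function name `amat_with_l2` and its parameter names are hypothetical. It assumes, as the problem states, that memory is accessed only after the L2 access completes, so an L2 miss pays both the L2 access time and the memory penalty.

```python
def amat_with_l2(base_cycles: float, accesses: int, l1_misses: float,
                 l2_access_time: float, l2_local_miss_rate: float,
                 memory_penalty: float) -> float:
    """Return AMAT in cycles per access for an L1 + L2 hierarchy."""
    l2_misses = l1_misses * l2_local_miss_rate        # 50 * 0.20 = 10
    total_cycles = (base_cycles
                    + l1_misses * l2_access_time      # every L1 miss goes to L2
                    + l2_misses * memory_penalty)     # L2 misses go to memory
    return total_cycles / accesses


amat_l2 = amat_with_l2(base_cycles=1000, accesses=1200, l1_misses=50,
                       l2_access_time=20, l2_local_miss_rate=0.20,
                       memory_penalty=100)
print(f"AMAT with L2 = {amat_l2:.1f} cycles/access")
```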



How much faster is the machine with L2 in comparison with the machine without L2?

$$AMAT_{no\ L2} = \frac{1000 \times 1\ cycle + 50 \times 100\ cycles}{1200\ accesses} = 5 \frac{\text{cycles}}{\text{access}} = 5\ ns$$

$$AMAT_{with\ L2} = \frac{1000 \times 1\ cycle + 50 \times 20\ cycles + 10 \times 100\ cycles}{1200\ accesses} = 2.5 \frac{\text{cycles}}{\text{access}} = 2.5\ ns$$

Given that the basic CPI (with ideal caches) is the same, and the number of accesses is the same, we can simply divide the AMATs:

$$\text{Speedup} = \frac{AMAT_{no\ L2}}{AMAT_{with\ L2}} = \frac{5\ ns}{2.5\ ns} = 2$$

The design with L2 is two times faster.
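One way to see why dividing the AMATs is enough: both designs serve the same 1200 accesses, so the access count cancels and the ratio reduces to a ratio of total cycles. Using the numerators of the two AMAT expressions (6000 and 3000 cycles):

$$\text{Speedup} = \frac{6000 \text{ cycles} / 1200 \text{ accesses}}{3000 \text{ cycles} / 1200 \text{ accesses}} = \frac{6000 \text{ cycles}}{3000 \text{ cycles}} = 2$$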

# Local vs. Global Hit Rate

What is the *local hit rate* of the L2 cache?

$$\text{Local hit rate}_{L2} = \frac{\# \text{ accesses that hit L2}}{\# \text{ accesses that reach L2}} = \frac{40}{50} = 80\%$$

What is the *global hit rate* of the L2 cache?

$$\text{Global hit rate}_{L2} = \frac{\# \text{ accesses that hit L2}}{\text{Total } \# \text{ of accesses}} = \frac{40}{1200} \approx 3.3\%$$
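The two rates differ only in their denominator, which a few lines of arithmetic make explicit. This is an illustrative sketch using the counts from the L2 example (1200 accesses, 50 reaching L2, 40 hitting there). The variable names are assumptions.

```python
total_accesses = 1200   # all accesses issued by the processor
l2_accesses = 50        # accesses that missed in L1 and reach L2
l2_hits = 40            # 80% of the accesses that reach L2
l2_misses = l2_accesses - l2_hits

local_hit_rate = l2_hits / l2_accesses        # relative to L2 traffic only
global_hit_rate = l2_hits / total_accesses    # relative to all accesses
global_miss_rate = l2_misses / total_accesses

print(f"local hit rate   = {local_hit_rate:.1%}")
print(f"global hit rate  = {global_hit_rate:.1%}")
print(f"global miss rate = {global_miss_rate:.2%}")
```

The global miss rate of the L2 equals the fraction of accesses that miss in L1 times the local L2 miss rate: (50 / 1200) × 0.20 = 10 / 1200.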



# Performance Summary

When CPU performance increases

- Miss penalty becomes more significant

Decreasing base CPI

- Greater proportion of time spent on memory stalls

Increasing clock rate

- Memory stalls account for more CPU cycles
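Both effects can be illustrated numerically. The sketch below uses hypothetical figures: the base CPI values (2.0 and 1.0), the 100 ns memory latency, and the 1 GHz and 2 GHz clock rates are assumptions chosen for illustration. The 5 stall cycles per instruction correspond to 50 misses × 100 cycles over 1000 instructions, as in the earlier example without L2.

```python
def stall_fraction(base_cpi: float, stall_cycles_per_instr: float) -> float:
    """Fraction of execution time spent waiting on memory."""
    return stall_cycles_per_instr / (base_cpi + stall_cycles_per_instr)


def miss_penalty_cycles(memory_latency_ns: float, clock_rate_ghz: float) -> float:
    """Miss penalty in cycles for a fixed memory latency (1 GHz = 1 cycle/ns)."""
    return memory_latency_ns * clock_rate_ghz


STALLS_PER_INSTR = 5.0  # 50 misses * 100 cycles / 1000 instructions

# Decreasing base CPI: stalls take a larger share of the total time.
for base_cpi in (2.0, 1.0):          # hypothetical base CPI values
    share = stall_fraction(base_cpi, STALLS_PER_INSTR)
    print(f"base CPI {base_cpi}: {share:.0%} of time in memory stalls")

# Increasing clock rate: the same latency in ns costs more cycles.
for ghz in (1.0, 2.0):               # hypothetical clock rates
    print(f"{ghz} GHz: miss penalty = {miss_penalty_cycles(100, ghz):.0f} cycles")
```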