

# ECE 466 Project 1

Name: Venkata Ramkiran Chevendra

UIN: 673672306

1. The command used to run the simulation is

**./sim-outorder -fastfwd 300000000 -max:inst 200000000 quake.ss<quake.in**

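`sim-outorder` is SimpleScalar's detailed out-of-order issue simulator with speculative execution support. `-fastfwd 300000000` fast-forwards through the first 300 million instructions without timing simulation, and `-max:inst 200000000` limits the detailed simulation to the next 200 million instructions. `quake.ss` is the benchmark binary, and `quake.in` is redirected to its standard input.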

The command executed is as shown in the screenshot below



```
ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0$ gcc `./sysprobe -flags` -DDEBUG -O0 -g -Wall -c resource.c
gcc `./sysprobe -flags` -DDEBUG -O0 -g -Wall -c ptrace.c
gcc -o sim-outorder `./sysprobe -flags` -DDEBUG -O0 -g -Wall sim-outorder.o cache.o bpred.o resource.o ptrace.o main.o syscall.o memory.o regs.o loader.o endian.o dlite.o symbol.o eval.o options.o stats.o eio.o range.o misc.o machine.o libexo/libexo.a `./sysprobe -libs` -lm
my work is done here...
ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ ./sim -outorder -fastfwd 300000000 -max:inst 200000000 quake.ss<quake.in
bash: ./sim-outorder: No such file or directory
ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ ./sim-outorder -fastfwd 300000000 -max:inst 200000000 quake.ss<quake.in
sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of August, 2003.
Copyright (c) 1994-2003 by Todd M. Austin, Ph.D. and SimpleScalar, LLC.
All Rights Reserved. This version of SimpleScalar is licensed for academic
non-commercial use. No portion of this work may be used by any commercial
entity, or for any commercial purpose, without the prior written permission
of SimpleScalar, LLC (info@simplescalar.com).

sim: command line: ./sim-outorder -fastfwd 300000000 -max:inst 200000000 quake.ss
sim: simulation started @ Sun Nov  6 23:23:27 2016, options follow:
sim-outorder: This simulator implements a very detailed out-of-order issue
superscalar processor with a two-level memory system and speculative
execution support. This simulator is a performance simulator, tracking the
latency of all pipeline operations.

# -config           # load configuration from a file
# -dumpconfig       # dump configuration to a file
# -h                false # print help message
# -v                false # verbose operation
```

The simulation statistics are as below:

Total number of instructions committed (`sim_num_insn`): 200000001

Total number of loads and stores executed (`sim_total_refs`): 68852990

Total number of branches executed (`sim_total_branches`): 55093698

Total simulation time in seconds (`sim_elapsed_time`): 128

Simulation speed in insts/sec (`sim_inst_rate`): 1562500.0078
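As a quick sanity check, the reported simulation speed appears to be the committed instruction count divided by the elapsed wall-clock time. The short Python sketch below redoes that division with the numbers from this run; the variable names are illustrative and are not part of SimpleScalar.

```python
# Values copied from the statistics reported for this run.
committed_insns = 200_000_001   # total instructions reported above
elapsed_seconds = 128           # total simulation time in seconds

# Simulation speed in instructions per second of wall-clock time.
inst_rate = committed_insns / elapsed_seconds
print(f"{inst_rate:.4f}")
```

The result can be compared against the `sim_inst_rate` value reported by the simulator.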



```

ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0 11:36 PM
  -cache:dl1 dl1:256:32:1:l -cache:dl2 ul2:1024:64:2:l
Or, a fully unified cache hierarchy (il1 pointed at dl1):
  -cache:il1 dl1
  -cache:dl1 ul1:256:32:1:l -cache:dl2 ul2:1024:64:2:l

sim: ** fast forwarding 300000000 insts **
equake00: Reading nodes.
equake00: Reading elements.
sim: ** starting performance simulation **

sim: ** simulation statistics **
sim_num_insn          200000001 # total number of instructions committed
sim_num_refs           65174179 # total number of loads and stores committed
sim_num_loads          45457319 # total number of loads committed
sim_num_stores          19716860.0000 # total number of stores committed
sim_num_branches        52662105 # total number of branches committed
sim_elapsed_time        128 # total simulation time in seconds
sim_inst_rate           1562500.0078 # simulation speed (in insts/sec)
sim_total_insn          211150190 # total number of instructions executed
sim_total_refs          68852990 # total number of loads and stores executed
sim_total_loads         48245602 # total number of loads executed
sim_total_stores         20607388.0000 # total number of stores executed
sim_total_branches       55093698 # total number of branches executed
sim_cycle               119080810 # total simulation time in cycles
sim_IPC                 1.6795 # instructions per cycle
sim_CPI                 0.5954 # cycles per instruction
sim_exec_BW              1.7732 # total instructions (mis-spec + committed)
per_cycle
sim_IPB                 3.7978 # instruction per branch
IFQ_count                303118298 # cumulative IFQ occupancy

```
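Each statistics line in the output above has the shape `name value # description`. A minimal Python sketch for pulling those values into a dictionary is shown here; the function name `parse_stats` and the regular expression are assumptions made for illustration, and lines that do not fit the pattern (wrapped text, values such as `4700k`) are simply skipped.

```python
import re

# name, numeric value, then a '#' comment (pattern is an illustrative assumption)
STAT_LINE = re.compile(r"^(\S+)\s+(-?\d+(?:\.\d+)?)\s+#\s*(.*)$")

def parse_stats(text):
    """Return {stat_name: float_value} for every line that matches STAT_LINE."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            name, value, _description = match.groups()
            stats[name] = float(value)
    return stats

# A few lines copied from the simulator output above.
sample = """\
sim_num_insn          200000001 # total number of instructions committed
sim_elapsed_time        128 # total simulation time in seconds
sim_IPC                 1.6795 # instructions per cycle
sim_CPI                 0.5954 # cycles per instruction
"""

stats = parse_stats(sample)
print(stats["sim_IPC"], stats["sim_CPI"])
```

Since CPI is the reciprocal of IPC, `1 / stats["sim_IPC"]` should agree with `stats["sim_CPI"]` up to rounding.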

### Performance Analysis:

The performance of the program when simulated with out-of-order issue is:

Instructions per cycle (IPC): 1.6795

Cycles per instruction (CPI): 0.5954

Total simulation time in cycles: 119080810

2. `sim-outorder` is now run with in-order issue to analyze the performance of in-order execution.

The command used to perform the task is

**./sim-outorder -issue:inorder -fastfwd 300000000 -max:inst 200000000 equake.ss<quake.in**

As before, `sim-outorder` is the detailed simulator, the first 300 million instructions are fast-forwarded, and the next 200 million instructions are simulated in detail. `equake.ss` is the benchmark binary, and `quake.in` is redirected to its standard input. The added `-issue:inorder` option runs the pipeline with in-order issue.

### Simulation statistics:

| Statistic                                  | Value        |
|--------------------------------------------|--------------|
| Total number of instructions committed:    | 200000000    |
| Total number of loads and stores executed: | 65531062     |
| Total number of branches executed:         | 52662188     |
| Total simulation time in seconds:          | 118          |
| Simulation speed (in insts/sec):           | 1694915.2542 |

```
ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0 11:44 PM
mem.page_mem          4700k # total size of memory pages allocated
mem.ptab_misses      6279 # total first level page table misses
mem.ptab_accesses    3277288964 # total page table accesses
mem.ptab_miss_rate   0.0000 # first level page table miss rate

ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ ./sim-outdrder -issue:inorder -fastfwd 300000000 -max:inst 200000000 equake.ss<equake.in
bash: ./sim-outdrder: No such file or directory
ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ ./sim-outorder -issue:inorder -fastfwd 300000000 -max:inst 200000000 equake.ss<equake.in
sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of August, 2003.
Copyright (c) 1994-2003 by Todd M. Austin, Ph.D. and SimpleScalar, LLC.
All Rights Reserved. This version of SimpleScalar is licensed for academic
non-commercial use. No portion of this work may be used by any commercial
entity, or for any commercial purpose, without the prior written permission
of SimpleScalar, LLC (info@simplescalar.com).

sim: command line: ./sim-outorder -issue:inorder -fastfwd 300000000 -max:inst 20
0000000 equake.ss

sim: simulation started @ Sun Nov  6 23:40:54 2016, options follow:

sim-outorder: This simulator implements a very detailed out-of-order issue
superscalar processor with a two-level memory system and speculative
execution support. This simulator is a performance simulator, tracking the
latency of all pipeline operations.

# -config                  # load configuration from a file
# -dumpconfig              # dump configuration to a file
# -h                        false # print help message
# -v                        false # verbose operation
```

```
ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0 11:48 PM
equake00: Reading nodes.
equake00: Reading elements.
sim: ** starting performance simulation **

sim: ** simulation statistics **
sim_num_insn           200000000 # total number of instructions committed
sim_num_refs            65174178 # total number of loads and stores committed
sim_num_loads           45457318 # total number of loads committed
sim_num_stores          19716860.0000 # total number of stores committed
sim_num_branches        52662105 # total number of branches committed
sim_elapsed_time         118 # total simulation time in seconds
sim_inst_rate           1694915.2542 # simulation speed (in insts/sec)
sim_total_insn          201542361 # total number of instructions executed
sim_total_refs          65531062 # total number of loads and stores executed
sim_total_loads          45813723 # total number of loads executed
sim_total_stores         19717339.0000 # total number of stores executed
sim_total_branches       52662188 # total number of branches executed
sim_cycle                257538695 # total simulation time in cycles
sim_IPC                 0.7766 # instructions per cycle
sim_CPI                 1.2877 # cycles per instruction
sim_exec_BW               0.7826 # total instructions (mis-spec + committed)

per cycle
sim_IPB                 3.7978 # instruction per branch
IFQ_count                894125557 # cumulative IFQ occupancy
IFQ_fcount               211243056 # cumulative IFQ full count
ifq_occupancy             3.4718 # avg IFQ occupancy (insn's)
ifq_rate                  0.7826 # avg IFQ dispatch rate (insn/cycle)
ifq_latency                4.4364 # avg IFQ occupant latency (cycle's)
ifq_full                   0.8202 # fraction of time (cycle's) IFQ was full
RUU_count                696661267 # cumulative RUU occupancy
RUU_fcount                  0 # cumulative RUU full count
ruu_occupancy              2.7051 # avg RUU occupancy (insn's)
ruu_rate                   0.7826 # avg RUU dispatch rate (insn/cycle)
```

### **Performance Analysis:**

The performance of the program when simulated with in-order issue is:

| Metric                           | Value     |
|----------------------------------|-----------|
| Instructions Per Cycle (IPC):    | 0.7766    |
| Cycles Per Instruction (CPI):    | 1.2877    |
| Total simulation time in cycles: | 257538695 |
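The IPC and CPI figures follow directly from the committed instruction count and the cycle count of the in-order run. The sketch below recomputes them in Python from the `sim_num_insn` and `sim_cycle` values in the output above; the variable names are illustrative.

```python
# Counters copied from the in-order run's statistics.
committed_insns = 200_000_000   # sim_num_insn
cycles = 257_538_695            # sim_cycle

ipc = committed_insns / cycles  # instructions per cycle
cpi = cycles / committed_insns  # cycles per instruction (reciprocal of IPC)

print(f"IPC = {ipc:.4f}, CPI = {cpi:.4f}")
```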

### **Comparison between In-order and Out-of-order issue**

| Issue mode   | Instructions Per Cycle (IPC) | Cycles Per Instruction (CPI) |
|--------------|------------------------------|------------------------------|
| In-order     | 0.7766                       | 1.2877                       |
| Out-of-order | 1.6795                       | 0.5954                       |

The in-order run has the lower IPC and the out-of-order run has the lower CPI, so the program takes more cycles with in-order issue and out-of-order issue is faster. Taking IPC as the performance metric, the relative performance loss from switching to in-order issue is:

**Performance of in-order issue = IPC of in-order issue = 0.7766**

**Performance of out-of-order issue = IPC of out-of-order issue = 1.6795**

$$\begin{aligned} \text{Performance loss} &= \frac{\text{IPC}_{\text{out-of-order}} - \text{IPC}_{\text{in-order}}}{\text{IPC}_{\text{out-of-order}}} \\ &= \frac{1.6795 - 0.7766}{1.6795} = 0.5376 = 53.76\% \end{aligned}$$

Thus, using in-order issue reduces performance by 53.76% relative to out-of-order issue, so **out-of-order issue** is the better choice for this program.
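The loss calculation can be reproduced with a few lines of Python. This is only a sketch using the two IPC values reported above; the speedup figure at the end is an additional way of stating the same comparison and is not part of the original analysis.

```python
# IPC values taken from the two runs reported above.
ipc_out_of_order = 1.6795
ipc_in_order = 0.7766

# Relative performance loss when moving from out-of-order to in-order issue.
loss = (ipc_out_of_order - ipc_in_order) / ipc_out_of_order

# Equivalent view: how many times faster out-of-order issue is.
speedup = ipc_out_of_order / ipc_in_order

print(f"loss = {loss:.2%}, speedup = {speedup:.2f}x")
```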

3. The default L1 data cache is 16 KB with a block size of 32 B and an associativity of 4. The L1 data cache size is now varied from 1 KB to 128 KB with all other parameters unchanged, in order to find the optimal cache size for the program. The optimum is judged by the **cache miss rate** and the **overall performance**.

Because the block size and associativity are kept constant, only the number of sets needs to be varied to change the cache size.

For a 1 KB cache, the number of sets is  $1024/(32 \times 4) = 8$.

The cache size can only be increased in powers of 2, so the next size is 2 KB, which gives  $2048/(32 \times 4) = 16$  sets.

For 4 KB,  $4096/(32 \times 4) = 32$  sets, and so on.

So the number of sets for cache sizes from 1 KB to 128 KB is 8, 16, 32, 64, 128, 256, 512 and 1024 respectively.
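The set counts listed above can be generated programmatically. The following Python sketch applies the same formula, sets = cache size / (block size × associativity), to every size from 1 KB to 128 KB; the helper name `num_sets` is hypothetical.

```python
BLOCK_BYTES = 32   # L1 data cache block size in bytes
ASSOC = 4          # L1 data cache associativity

def num_sets(size_kb):
    """Number of sets for a cache of size_kb kilobytes."""
    return size_kb * 1024 // (BLOCK_BYTES * ASSOC)

sizes_kb = [1, 2, 4, 8, 16, 32, 64, 128]
sets = [num_sets(kb) for kb in sizes_kb]

for kb, n in zip(sizes_kb, sets):
    print(f"{kb:>3} KB -> {n} sets")
```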

The following command is used to collect the statistics for each cache size:

```
./sim-outorder -cache:dl1 dl1:<number of sets>:32:4:l -fastfwd 300000000 -max:inst 200000000 quake.ss<quake.in
```

For the 1 KB cache size, the command is as follows, and its output is shown in the screenshots below:

```
./sim-outorder -cache:dl1 dl1:8:32:4:l -fastfwd 300000000 -max:inst 200000000 quake.ss<quake.in
```
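Since only the set count changes between runs, the full list of command lines can be produced from the cache sizes. The Python sketch below builds them as strings; the function name `dl1_command` is hypothetical, and the binary and input file names are copied from the command above and may need adjusting to the local setup.

```python
BLOCK_BYTES = 32   # dl1 block size used in this report
ASSOC = 4          # dl1 associativity used in this report

def dl1_command(size_kb, program="quake.ss", stdin_file="quake.in"):
    """Build the sim-outorder command line for a dl1 cache of size_kb KB."""
    nsets = size_kb * 1024 // (BLOCK_BYTES * ASSOC)
    return (
        f"./sim-outorder -cache:dl1 dl1:{nsets}:{BLOCK_BYTES}:{ASSOC}:l "
        f"-fastfwd 300000000 -max:inst 200000000 {program}<{stdin_file}"
    )

commands = [dl1_command(kb) for kb in (1, 2, 4, 8, 16, 32, 64, 128)]
for cmd in commands:
    print(cmd)
```

The strings are only printed here; they would be run from the `simplesim-3.0` directory in a shell, because the `<` redirection is a shell feature.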



```
ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0 5:39 PM  
mem.ptab_misses      6279 # total first level page table misses  
mem.ptab_accesses    3384066156 # total page table accesses  
mem.ptab_miss_rate   0.0000 # first level page table miss rate  
  
ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ ./sim  
-outorder -cache:dl1 dl1:8:32:4:l -fastfwd 300000000 -max:inst 200000000 quake.  
ss<quake.in  
sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of August, 2003.  
Copyright (c) 1994-2003 by Todd M. Austin, Ph.D. and SimpleScalar, LLC.  
All Rights Reserved. This version of SimpleScalar is licensed for academic  
non-commercial use. No portion of this work may be used by any commercial  
entity, or for any commercial purpose, without the prior written permission  
of SimpleScalar, LLC (info@simplescalar.com).  
  
sim: command line: ./sim-outorder -cache:dl1 dl1:8:32:4:l -fastfwd 300000000 -ma  
x:inst 200000000 quake.ss
  
sim: simulation started @ Mon Nov  7 17:35:09 2016, options follow:  
  
sim-outorder: This simulator implements a very detailed out-of-order issue  
superscalar processor with a two-level memory system and speculative  
execution support. This simulator is a performance simulator, tracking the  
latency of all pipeline operations.  
  
# -config          # load configuration from a file  
# -dumpconfig      # dump configuration to a file  
# -h               false # print help message  
# -v               false # verbose operation  
# -d               false # enable debug message  
# -i               false # start in Dlite debugger  
-seed              1 # random number generator seed (0 for timer seed)  
# -q               false # initialize and terminate immediately  
# -chkpt           <null> # restore EIO trace execution from <fname>
```

```
ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0 5:40 PM
sim_num_refs          65174179 # total number of loads and stores committed
sim_num_loads         45457319 # total number of loads committed
sim_num_stores        19716860.0000 # total number of stores committed
sim_num_branches      52662105 # total number of branches committed
sim_elapsed_time       135 # total simulation time in seconds
sim_inst_rate         1481481.4889 # simulation speed (in insts/sec)
sim_total_insn        211078857 # total number of instructions executed
sim_total_refs        68853761 # total number of loads and stores executed
sim_total_loads       48245602 # total number of loads executed
sim_total_stores      20608159.0000 # total number of stores executed
sim_total_branches    55093697 # total number of branches executed
sim_cycle             122551750 # total simulation time in cycles
sim_IPC               1.6320 # instructions per cycle
sim_CPI               0.6128 # cycles per instruction
sim_exec_BW           1.7224 # total instructions (mis-spec + committed)
per cycle
sim_IPB               3.7978 # instruction per branch
IFQ_count             317746931 # cumulative IFQ occupancy
IFQ_fcount            67508608 # cumulative IFQ full count
ifq_occupancy         2.5928 # avg IFQ occupancy (insn's)
ifq_rate              1.7224 # avg IFQ dispatch rate (insn/cycle)
ifq_latency           1.5053 # avg IFQ occupant latency (cycle's)
ifq_full              0.5509 # fraction of time (cycle's) IFQ was full
RUU_count             1269546226 # cumulative RUU occupancy
RUU_fcount            36707854 # cumulative RUU full count
ruu_occupancy         10.3593 # avg RUU occupancy (insn's)
ruu_rate              1.7224 # avg RUU dispatch rate (insn/cycle)
ruu_latency           6.0146 # avg RUU occupant latency (cycle's)
ruu_full              0.2995 # fraction of time (cycle's) RUU was full
LSQ_count             413432924 # cumulative LSQ occupancy
LSQ_fcount            13267035 # cumulative LSQ full count
lsq_occupancy         3.3735 # avg LSQ occupancy (insn's)
lsq_rate              1.7224 # avg LSQ dispatch rate (insn/cycle)
```

```
ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0 5:42 PM
hits/JRs seen)
bpred_bimod.bpred_jr_non_ras_rate.PP   1.0000 # non-RAS JR addr-pred rate (ie,
non-RAS JR hits/JRs seen)
bpred_bimod.retstack_pushes      2136231 # total number of address pushed onto r
et-addr stack
bpred_bimod.retstack_pops       2075848 # total number of address popped off of r
et-addr stack
bpred_bimod.used_ras.PP        2075812 # total number of RAS predictions used
bpred_bimod.ras_hits.PP        2075807 # total number of RAS hits
bpred_bimod.ras_rate.PP        1.0000 # RAS prediction rate (i.e., RAS hits/used RAS
)
il1.accesses                 219679988 # total number of accesses
il1.hits                      214638596 # total number of hits
il1.misses                    5041392 # total number of misses
il1.replacements              5041177 # total number of replacements
il1.writebacks                0 # total number of writebacks
il1.invalidations             0 # total number of invalidations
il1.miss_rate                 0.0229 # miss rate (i.e., misses/ref)
il1.repl_rate                  0.0229 # replacement rate (i.e., repls/ref)
il1.wb_rate                    0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate                   0.0000 # invalidation rate (i.e., invs/ref)
dl1.accesses                 65921502 # total number of accesses
dl1.hits                      64697138 # total number of hits
dl1.misses                    1224364 # total number of misses
dl1.replacements              1224332 # total number of replacements
dl1.writebacks                320605 # total number of writebacks
dl1.invalidations             0 # total number of invalidations
dl1.miss_rate                 0.0186 # miss rate (i.e., misses/ref)
dl1.repl_rate                  0.0186 # replacement rate (i.e., repls/ref)
dl1.wb_rate                    0.0049 # writeback rate (i.e., wrbks/ref)
dl1.inv_rate                   0.0000 # invalidation rate (i.e., invs/ref)
ul2.accesses                 6586361 # total number of accesses
ul2.hits                      6563909 # total number of hits
```
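The `dl1.miss_rate` and `dl1.wb_rate` values in the output above, which appears to belong to the 1 KB run (`dl1:8:32:4:l`), are described by the simulator as misses per reference and writebacks per reference. The Python sketch below recomputes both from the raw counters as a cross-check; the variable names are illustrative.

```python
# dl1 counters copied from the simulator output above.
dl1_accesses = 65_921_502
dl1_misses = 1_224_364
dl1_writebacks = 320_605

miss_rate = dl1_misses / dl1_accesses      # misses per reference
wb_rate = dl1_writebacks / dl1_accesses    # writebacks per reference

print(f"dl1 miss rate = {miss_rate:.4f}, writeback rate = {wb_rate:.4f}")
```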

```
ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0 5:46 PM
mem.ptab_accesses      3277006788 # total page table accesses
mem.ptab_miss_rate     0.0000 # first level page table miss rate

ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ ./sim
-outorder -cache:dl1 dl1:16:32:4:l -fastfwd 300000000 -max:inst 200000000 equake
.ss<equake.in
sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of August, 2003.
Copyright (c) 1994-2003 by Todd M. Austin, Ph.D. and SimpleScalar, LLC.
All Rights Reserved. This version of SimpleScalar is licensed for academic
non-commercial use. No portion of this work may be used by any commercial
entity, or for any commercial purpose, without the prior written permission
of SimpleScalar, LLC (info@simplescalar.com).

sim: command line: ./sim-outorder -cache:dl1 dl1:16:32:4:l -fastfwd 300000000 -m
ax:inst 200000000 equake.ss

sim: simulation started @ Mon Nov  7 17:43:47 2016, options follow:

sim-outorder: This simulator implements a very detailed out-of-order issue
superscalar processor with a two-level memory system and speculative
execution support. This simulator is a performance simulator, tracking the
latency of all pipeline operations.

# -config           # load configuration from a file
# -dumpconfig       # dump configuration to a file
# -h                false # print help message
# -v                false # verbose operation
# -d                false # enable debug message
# -i                false # start in Dlite debugger
-seed              1 # random number generator seed (0 for timer seed)
# -q                false # initialize and terminate immediately
# -chkpt            <null> # restore EIO trace execution from <fname>
# -redir:sim        <null> # redirect simulator output to file (non-interacti
```

```
ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0 5:47 PM
sim_total_refs          68853203 # total number of loads and stores executed
sim_total_loads          48245602 # total number of loads executed
sim_total_stores          20607601.0000 # total number of stores executed
sim_total_branches         55093698 # total number of branches executed
sim_cycle                 119351953 # total simulation time in cycles
sim_IPC                   1.6757 # instructions per cycle
sim_CPI                   0.5968 # cycles per instruction
sim_exec_BW                 1.7691 # total instructions (mis-spec + committed)
per cycle
sim_IPB                    3.7978 # instruction per branch
IFQ_count                  304220550 # cumulative IFQ occupancy
IFQ_fcount                  64099963 # cumulative IFQ full count
ifq_occupancy               2.5489 # avg IFQ occupancy (insn's)
ifq_rate                     1.7691 # avg IFQ dispatch rate (insn/cycle)
ifq_latency                  1.4408 # avg IFQ occupant latency (cycle's)
ifq_full                      0.5371 # fraction of time (cycle's) IFQ was full
RUU_count                  1218205546 # cumulative RUU occupancy
RUU_fcount                  33164282 # cumulative RUU full count
ruu_occupancy                10.2068 # avg RUU occupancy (insn's)
ruu_rate                       1.7691 # avg RUU dispatch rate (insn/cycle)
ruu_latency                  5.7697 # avg RUU occupant latency (cycle's)
ruu_full                      0.2779 # fraction of time (cycle's) RUU was full
LSQ_count                  392946759 # cumulative LSQ occupancy
LSQ_fcount                  12198745 # cumulative LSQ full count
lsq_occupancy                3.2923 # avg LSQ occupancy (insn's)
lsq_rate                        1.7691 # avg LSQ dispatch rate (insn/cycle)
lsq_latency                  1.8611 # avg LSQ occupant latency (cycle's)
lsq_full                      0.1022 # fraction of time (cycle's) LSQ was full
sim_slip                     1816325076 # total number of slip cycles
avg_sim_slip                  9.0816 # the average slip between issue and retirem
ent
bpred_bimod.lookups          55978024 # total number of bpred lookups
bpred_bimod.updates          52662103 # total number of updates
```

```
ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0  En 5:50 PM 
et-addr stack
bpred_bimod.used_ras.PP      2075812 # total number of RAS predictions used
bpred_bimod.ras_hits.PP       2075807 # total number of RAS hits
bpred_bimod.ras_rate.PP       1.0000 # RAS prediction rate (i.e., RAS hits/used RAS
)
il1.accesses                  219740359 # total number of accesses
il1.hits                      214698967 # total number of hits
il1.misses                    5041392 # total number of misses
il1.replacements               5041177 # total number of replacements
il1.writebacks                 0 # total number of writebacks
il1.invalidations              0 # total number of invalidations
il1.miss_rate                 0.0229 # miss rate (i.e., misses/ref)
il1.repl_rate                  0.0229 # replacement rate (i.e., repls/ref)
il1.wb_rate                    0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate                   0.0000 # invalidation rate (i.e., invs/ref)
dl1.accesses                  65838160 # total number of accesses
dl1.hits                      65665139 # total number of hits
dl1.misses                    173021 # total number of misses
dl1.replacements               172957 # total number of replacements
dl1.writebacks                 59525 # total number of writebacks
dl1.invalidations              0 # total number of invalidations
dl1.miss_rate                 0.0026 # miss rate (i.e., misses/ref)
dl1.repl_rate                  0.0026 # replacement rate (i.e., repls/ref)
dl1.wb_rate                    0.0009 # writeback rate (i.e., wrbks/ref)
dl1.inv_rate                   0.0000 # invalidation rate (i.e., invs/ref)
ul2.accesses                  5273938 # total number of accesses
ul2.hits                      5251485 # total number of hits
ul2.misses                    22453 # total number of misses
ul2.replacements               18357 # total number of replacements
ul2.writebacks                 15045 # total number of writebacks
ul2.invalidations              0 # total number of invalidations
ul2.miss_rate                 0.0043 # miss rate (i.e., misses/ref)
ul2.repl_rate                  0.0035 # replacement rate (i.e., repls/ref)
```




```
sim: ** simulation statistics **
sim_num_insn          200000001 # total number of instructions committed
sim_num_refs           65174179 # total number of loads and stores committed
sim_num_loads          45457319 # total number of loads committed
sim_num_stores          19716860.0000 # total number of stores committed
sim_num_branches        52662105 # total number of branches committed
sim_elapsed_time        135 # total simulation time in seconds
sim_inst_rate           1481481.4889 # simulation speed (in insts/sec)
sim_total_insn          211142642 # total number of instructions executed
sim_total_refs          68852990 # total number of loads and stores executed
sim_total_loads         48245602 # total number of loads executed
sim_total_stores         20607388.0000 # total number of stores executed
sim_total_branches       55093698 # total number of branches executed
sim_cycle               119193635 # total simulation time in cycles
sim_IPC                 1.6779 # instructions per cycle
sim_CPI                 0.5960 # cycles per instruction
sim_exec_BW              1.7714 # total instructions (mis-spec + committed)
per_cycle
sim_IPB                 3.7978 # instruction per branch
IFQ_count               303582955 # cumulative IFQ occupancy
IFQ_fcount              63941351 # cumulative IFQ full count
ifq_occupancy            2.5470 # avg IFQ occupancy (insn's)
ifq_rate                 1.7714 # avg IFQ dispatch rate (insn/cycle)
ifq_latency              1.4378 # avg IFQ occupant latency (cycle's)
ifq_full                  0.5364 # fraction of time (cycle's) IFQ was full
RUU_count                1215597637 # cumulative RUU occupancy
RUU_fcount               33012744 # cumulative RUU full count
ruu_occupancy             10.1985 # avg RUU occupancy (insn's)
ruu_rate                  1.7714 # avg RUU dispatch rate (insn/cycle)
ruu_latency                5.7572 # avg RUU occupant latency (cycle's)
ruu_full                   0.2770 # fraction of time (cycle's) RUU was full
LSQ_count                391753063 # cumulative LSQ occupancy
```

```

ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0 5:56 PM
  il1.wb_rate           0.0000 # writeback rate (i.e., wrbks/ref)
  il1.inv_rate          0.0000 # invalidation rate (i.e., invs/ref)
  dl1.accesses          65831895 # total number of accesses
  dl1.hits              65726273 # total number of hits
  dl1.misses            105622 # total number of misses
  dl1.replacements      105494 # total number of replacements
  dl1.writebacks        40232 # total number of writebacks
  dl1.invalidations    0 # total number of invalidations
  dl1.miss_rate         0.0016 # miss rate (i.e., misses/ref)
  dl1.repl_rate         0.0016 # replacement rate (i.e., repls/ref)
  dl1.wb_rate           0.0006 # writeback rate (i.e., wrbks/ref)
  dl1.inv_rate          0.0000 # invalidation rate (i.e., invs/ref)
  ul2.accesses          5187246 # total number of accesses
  ul2.hits              5164791 # total number of hits
  ul2.misses            22455 # total number of misses
  ul2.replacements      18359 # total number of replacements
  ul2.writebacks        15037 # total number of writebacks
  ul2.invalidations    0 # total number of invalidations
  ul2.miss_rate         0.0043 # miss rate (i.e., misses/ref)
  ul2.repl_rate         0.0035 # replacement rate (i.e., repls/ref)
  ul2.wb_rate           0.0029 # writeback rate (i.e., wrbks/ref)
  ul2.inv_rate          0.0000 # invalidation rate (i.e., invs/ref)
  itlb.accesses          219743002 # total number of accesses
  itlb.hits              219742990 # total number of hits
  itlb.misses            12 # total number of misses
  itlb.replacements      0 # total number of replacements
  itlb.writebacks        0 # total number of writebacks
  itlb.invalidations    0 # total number of invalidations
  itlb.miss_rate         0.0000 # miss rate (i.e., misses/ref)
  itlb.repl_rate         0.0000 # replacement rate (i.e., repls/ref)
  itlb.wb_rate           0.0000 # writeback rate (i.e., wrbks/ref)
  itlb.inv_rate          0.0000 # invalidation rate (i.e., invs/ref)
  dtlb.accesses          67017250 # total number of accesses

```

```

ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0 6:02 PM
  mem.ptab_accesses     3277258844 # total page table accesses
  mem.ptab_miss_rate    0.0000 # first level page table miss rate
  ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ ./sim
  -outorder -cache:dl1 dl1:64:32:4:l -fastfwd 300000000 -max:inst 200000000 quake
  .ss<quake.in
  sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of August, 2003.
  Copyright (c) 1994-2003 by Todd M. Austin, Ph.D. and SimpleScalar, LLC.
  All Rights Reserved. This version of SimpleScalar is licensed for academic
  non-commercial use. No portion of this work may be used by any commercial
  entity, or for any commercial purpose, without the prior written permission
  of SimpleScalar, LLC (info@simplescalar.com).
  sim: command line: ./sim-outorder -cache:dl1 dl1:64:32:4:l -fastfwd 300000000 -m
  ax:inst 200000000 quake.ss
  sim: simulation started @ Mon Nov  7 17:57:25 2016, options follow:
  sim-outorder: This simulator implements a very detailed out-of-order issue
  superscalar processor with a two-level memory system and speculative
  execution support. This simulator is a performance simulator, tracking the
  latency of all pipeline operations.
  # -config                      # load configuration from a file
  # -dumpconfig                   # dump configuration to a file
  # -h                           false # print help message
  # -v                           false # verbose operation
  # -d                           false # enable debug message
  # -i                           false # start in Dlite debugger
  -seed                          1 # random number generator seed (0 for timer seed)
  # -q                           false # initialize and terminate immediately
  # -chkpt                       <null> # restore EIO trace execution from <fname>
  # -redir:sim                    <null> # redirect simulator output to file (non-interacti

```

```
ramkiran@ramkiran-virtual-machine: ~/Desktop/Simplescalar/rk/simplesim-3.0  En 6:04 PM 
sim: ** simulation statistics **
sim_num_insn          200000001 # total number of instructions committed
sim_num_refs           65174179 # total number of loads and stores committed
sim_num_loads          45457319 # total number of loads committed
sim_num_stores          19716860.0000 # total number of stores committed
sim_num_branches        52662105 # total number of branches committed
sim_elapsed_time         134 # total simulation time in seconds
sim_inst_rate           1492537.3209 # simulation speed (in insts/sec)
sim_total_insn          211147964 # total number of instructions executed
sim_total_refs           68852990 # total number of loads and stores executed
sim_total_loads          48245602 # total number of loads executed
sim_total_stores          20607388.0000 # total number of stores executed
sim_total_branches        55093698 # total number of branches executed
sim_cycle                119114726 # total simulation time in cycles
sim_IPC                  1.6791 # instructions per cycle
sim_CPI                  0.5956 # cycles per instruction
sim_exec_BW               1.7726 # total instructions (mis-spec + committed)
per_cycle
sim_IPB                  3.7978 # instruction per branch
IFQ_count                303258094 # cumulative IFQ occupancy
IFQ_fcount                63861726 # cumulative IFQ full count
ifq_occupancy             2.5459 # avg IFQ occupancy (insn's)
ifq_rate                   1.7726 # avg IFQ dispatch rate (insn/cycle)
ifq_latency                 1.4362 # avg IFQ occupant latency (cycle's)
ifq_full                    0.5361 # fraction of time (cycle's) IFQ was full
RUU_count                1213954138 # cumulative RUU occupancy
RUU_fcount                32891965 # cumulative RUU full count
ruu_occupancy              10.1915 # avg RUU occupancy (insn's)
ruu_rate                     1.7726 # avg RUU dispatch rate (insn/cycle)
ruu_latency                  5.7493 # avg RUU occupant latency (cycle's)
ruu_full                     0.2761 # fraction of time (cycle's) RUU was full
LSQ_count                391446024 # cumulative LSQ occupancy
LSQ_fcount                12123286 # cumulative LSQ full count
```
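
The derived figures in the statistics above follow from the raw counters: `sim_IPC` is `sim_num_insn / sim_cycle`, and `sim_CPI` is its reciprocal. As a quick sanity check, the sketch below recomputes both from the numbers in the listing. The helper name `ratio` is only illustrative, and `awk` is assumed to be available.

```bash
#!/usr/bin/env bash
# Recompute IPC and CPI from the raw counters reported by sim-outorder.
# ratio is a hypothetical helper: it prints numerator/denominator to 4 decimals.
ratio() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.4f\n", n / d }'
}

insn=200000001    # sim_num_insn from the listing above
cycles=119114726  # sim_cycle from the listing above

echo "IPC = $(ratio "$insn" "$cycles")"
echo "CPI = $(ratio "$cycles" "$insn")"
```

Both results should agree with the `sim_IPC` and `sim_CPI` lines that the simulator prints.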

```
et-addr stack
bpred_bimod.used_ras.PP      2075812 # total number of RAS predictions used
bpred_bimod.ras_hits.PP       2075807 # total number of RAS hits
bpred_bimod.ras_rate.PP       1.0000 # RAS prediction rate (i.e., RAS hits/used RAS
)
il1.accesses                 219748324 # total number of accesses
il1.hits                      214706932 # total number of hits
il1.misses                    5041392 # total number of misses
il1.replacements              5041177 # total number of replacements
il1.writebacks                 0 # total number of writebacks
il1.invalidations              0 # total number of invalidations
il1.miss_rate                  0.0229 # miss rate (i.e., misses/ref)
il1.repl_rate                  0.0229 # replacement rate (i.e., repls/ref)
il1.wb_rate                     0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate                     0.0000 # invalidation rate (i.e., invs/ref)
dl1.accesses                  65828598 # total number of accesses
dl1.hits                      65764554 # total number of hits
dl1.misses                    64044 # total number of misses
dl1.replacements              63788 # total number of replacements
dl1.writebacks                 38035 # total number of writebacks
dl1.invalidations              0 # total number of invalidations
dl1.miss_rate                  0.0010 # miss rate (i.e., misses/ref)
dl1.repl_rate                  0.0010 # replacement rate (i.e., repls/ref)
dl1.wb_rate                     0.0006 # writeback rate (i.e., wrbks/ref)
dl1.inv_rate                     0.0000 # invalidation rate (i.e., invs/ref)
ul2.accesses                  5143471 # total number of accesses
ul2.hits                      5121016 # total number of hits
ul2.misses                    22455 # total number of misses
ul2.replacements              18359 # total number of replacements
ul2.writebacks                 15031 # total number of writebacks
ul2.invalidations              0 # total number of invalidations
ul2.miss_rate                  0.0044 # miss rate (i.e., misses/ref)
ul2.repl_rate                  0.0036 # replacement rate (i.e., repls/ref)
```

```
mem.page_count          1175 # total number of pages allocated
mem.page_mem             4700k # total size of memory pages allocated
mem.ptab_misses         6279 # total first level page table misses
mem.ptab_accesses       3277280132 # total page table accesses
mem.ptab_miss_rate      0.0000 # first level page table miss rate

ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ ./sim
-outorder -cache:dl1 dl1:128:32:4:l -fastfwd 300000000 -max:inst 200000000 equak
  sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of August, 2003.
  Copyright (c) 1994-2003 by Todd M. Austin, Ph.D. and SimpleScalar, LLC.
  All Rights Reserved. This version of SimpleScalar is licensed for academic
  non-commercial use. No portion of this work may be used by any commercial
  entity, or for any commercial purpose, without the prior written permission
  of SimpleScalar, LLC (info@simplescalar.com).

sim: command line: ./sim-outorder -cache:dl1 dl1:128:32:4:l -fastfwd 300000000 -
max:inst 200000000 equake.ss

sim: simulation started @ Mon Nov  7 18:12:27 2016, options follow:

sim-outorder: This simulator implements a very detailed out-of-order issue
superscalar processor with a two-level memory system and speculative
execution support. This simulator is a performance simulator, tracking the
latency of all pipeline operations.

# -config                      # load configuration from a file
# -dumpconfig                   # dump configuration to a file
# -h                            false # print help message
# -v                            false # verbose operation
# -d                            false # enable debug message
# -i                            false # start in Dlite debugger
-seed                           1 # random number generator seed (0 for timer seed)
```

```
sim: ** starting performance simulation **

sim: ** simulation statistics **
sim_num_insn            200000001 # total number of instructions committed
sim_num_refs             65174179 # total number of loads and stores committed
sim_num_loads            45457319 # total number of loads committed
sim_num_stores           19716860.0000 # total number of stores committed
sim_num_branches         52662105 # total number of branches committed
sim_elapsed_time         135 # total simulation time in seconds
sim_inst_rate            1481481.4889 # simulation speed (in insts/sec)
sim_total_insn           211150190 # total number of instructions executed
sim_total_refs           68852990 # total number of loads and stores executed
sim_total_loads          48245602 # total number of loads executed
sim_total_stores          20607388.0000 # total number of stores executed
sim_total_branches        55093698 # total number of branches executed
sim_cycle                119080810 # total simulation time in cycles
sim_IPC                 1.6795 # instructions per cycle
sim_CPI                 0.5954 # cycles per instruction
sim_exec_BW              1.7732 # total instructions (mis-spec + committed)

per_cycle
sim_IPB                 3.7978 # instruction per branch
IFQ_count               303118298 # cumulative IFQ occupancy
IFQ_fcount              63827683 # cumulative IFQ full count
ifq_occupancy            2.5455 # avg IFQ occupancy (insn's)
ifq_rate                 1.7732 # avg IFQ dispatch rate (insn/cycle)
ifq_latency              1.4356 # avg IFQ occupant latency (cycle's)
ifq_full                 0.5360 # fraction of time (cycle's) IFQ was full
RUU_count               1213239777 # cumulative RUU occupancy
RUU_fcount              32840016 # cumulative RUU full count
ruu_occupancy            10.1884 # avg RUU occupancy (insn's)
ruu_rate                  1.7732 # avg RUU dispatch rate (insn/cycle)
ruu_latency              5.7459 # avg RUU occupant latency (cycle's)
ruu_full                 0.2758 # fraction of time (cycle's) RUU was full
```

```
et-addr stack
bpred_bimod.used_ras.PP      2075812 # total number of RAS predictions used
bpred_bimod.ras_hits.PP       2075807 # total number of RAS hits
bpred_bimod.ras_rate.PP       1.0000 # RAS prediction rate (i.e., RAS hits/used RAS
)
il1.accesses                  219750550 # total number of accesses
il1.hits                      214709158 # total number of hits
il1.misses                    5041392 # total number of misses
il1.replacements               5041177 # total number of replacements
il1.writebacks                 0 # total number of writebacks
il1.invalidations              0 # total number of invalidations
il1.miss_rate                 0.0229 # miss rate (i.e., misses/ref)
il1.repl_rate                  0.0229 # replacement rate (i.e., repls/ref)
il1.wb_rate                    0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate                   0.0000 # invalidation rate (i.e., invs/ref)
dl1.accesses                  65826898 # total number of accesses
dl1.hits                      65781118 # total number of hits
dl1.misses                    45780 # total number of misses
dl1.replacements               45268 # total number of replacements
dl1.writebacks                 36913 # total number of writebacks
dl1.invalidations              0 # total number of invalidations
dl1.miss_rate                 0.0007 # miss rate (i.e., misses/ref)
dl1.repl_rate                  0.0007 # replacement rate (i.e., repls/ref)
dl1.wb_rate                    0.0006 # writeback rate (i.e., wrbks/ref)
dl1.inv_rate                   0.0000 # invalidation rate (i.e., invs/ref)
ul2.accesses                  5124085 # total number of accesses
ul2.hits                      5101610 # total number of hits
ul2.misses                    22475 # total number of misses
ul2.replacements               18379 # total number of replacements
ul2.writebacks                 14987 # total number of writebacks
ul2.invalidations              0 # total number of invalidations
ul2.miss_rate                 0.0044 # miss rate (i.e., misses/ref)
ul2.repl_rate                  0.0036 # replacement rate (i.e., repls/ref)
```

```
sim: ** simulation statistics **
sim_num_insn          200000001 # total number of instructions committed
sim_num_refs           65174179 # total number of loads and stores committed
sim_num_loads          45457319 # total number of loads committed
sim_num_stores          19716860.0000 # total number of stores committed
sim_num_branches        52662105 # total number of branches committed
sim_elapsed_time         132 # total simulation time in seconds
sim_inst_rate           1515151.5227 # simulation speed (in insts/sec)
sim_total_insn          211150288 # total number of instructions executed
sim_total_refs           68852990 # total number of loads and stores executed
sim_total_loads          48245602 # total number of loads executed
sim_total_stores          20607388.0000 # total number of stores executed
sim_total_branches        55093698 # total number of branches executed
sim_cycle               119079211 # total simulation time in cycles
sim_IPC                 1.6796 # instructions per cycle
sim_CPI                 0.5954 # cycles per instruction
sim_exec_BW              1.7732 # total instructions (mis-spec + committed)
per_cycle
sim_IPB                 3.7978 # instruction per branch
IFQ_count               303111775 # cumulative IFQ occupancy
IFQ_fcount              63826065 # cumulative IFQ full count
ifq_occupancy            2.5455 # avg IFQ occupancy (insn's)
ifq_rate                  1.7732 # avg IFQ dispatch rate (insn/cycle)
ifq_latency                1.4355 # avg IFQ occupant latency (cycle's)
ifq_full                  0.5360 # fraction of time (cycle's) IFQ was full
RUU_count               1213210191 # cumulative RUU occupancy
RUU_fcount              32838128 # cumulative RUU full count
ruu_occupancy             10.1883 # avg RUU occupancy (insn's)
ruu_rate                  1.7732 # avg RUU dispatch rate (insn/cycle)
ruu_latency                5.7457 # avg RUU occupant latency (cycle's)
ruu_full                  0.2758 # fraction of time (cycle's) RUU was full
LSQ_count               391308703 # cumulative LSQ occupancy
```

```
bpred_bimod.retstack_pushes    2135974 # total number of address pushed onto r
et-addr stack
bpred_bimod.retstack_pops     2075848 # total number of address popped off of r
et-addr stack
bpred_bimod.used_ras.PP      2075812 # total number of RAS predictions used
bpred_bimod.ras_hits.PP       2075807 # total number of RAS hits
bpred_bimod.ras_rate.PP       1.0000 # RAS prediction rate (i.e., RAS hits/used RAS
)
il1.accesses                 219750648 # total number of accesses
il1.hits                      214709256 # total number of hits
il1.misses                    5041392 # total number of misses
il1.replacements              5041177 # total number of replacements
il1.writebacks                0 # total number of writebacks
il1.invalidations             0 # total number of invalidations
il1.miss_rate                 0.0229 # miss rate (i.e., misses/ref)
il1.repl_rate                 0.0229 # replacement rate (i.e., repls/ref)
il1.wb_rate                   0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate                  0.0000 # invalidation rate (i.e., invs/ref)
dl1.accesses                 65826907 # total number of accesses
dl1.hits                      65781837 # total number of hits
dl1.misses                    45070 # total number of misses
dl1.replacements              44046 # total number of replacements
dl1.writebacks                36426 # total number of writebacks
dl1.invalidations             0 # total number of invalidations
dl1.miss_rate                 0.0007 # miss rate (i.e., misses/ref)
dl1.repl_rate                 0.0007 # replacement rate (i.e., repls/ref)
dl1.wb_rate                   0.0006 # writeback rate (i.e., wrbks/ref)
dl1.inv_rate                  0.0000 # invalidation rate (i.e., invs/ref)
ul2.accesses                 5122888 # total number of accesses
ul2.hits                      5100421 # total number of hits
ul2.misses                    22467 # total number of misses
ul2.replacements              18371 # total number of replacements
ul2.writebacks                14939 # total number of writebacks
```

```
big_endian
mem.page_count          1175 # total number of pages allocated
mem.page_mem              4700k # total size of memory pages allocated
mem.ptab_misses           6279 # total first level page table misses
mem.ptab_accesses        3277289428 # total page table accesses
mem.ptab_miss_rate        0.0000 # first level page table miss rate

ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ ./sim
-outorder -cache:dl1 dl1:512:32:4:l -fastfwd 300000000 -max:inst 200000000 equake.in
sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of August, 2003.
Copyright (c) 1994-2003 by Todd M. Austin, Ph.D. and SimpleScalar, LLC.
All Rights Reserved. This version of SimpleScalar is licensed for academic
non-commercial use. No portion of this work may be used by any commercial
entity, or for any commercial purpose, without the prior written permission
of SimpleScalar, LLC (info@simplescalar.com).

sim: command line: ./sim-outorder -cache:dl1 dl1:512:32:4:l -fastfwd 300000000 -
max:inst 200000000 equake.ss

sim: simulation started @ Mon Nov  7 20:55:54 2016, options follow:

sim-outorder: This simulator implements a very detailed out-of-order issue
superscalar processor with a two-level memory system and speculative
execution support. This simulator is a performance simulator, tracking the
latency of all pipeline operations.

# -config                      # load configuration from a file
# -dumpconfig                   # dump configuration to a file
# -h                           # print help message
# -v                           # verbose operation
# -d                           # enable debug message
# -i                           # start in Dlite debugger
```

```
sim_num_loads            45457319 # total number of loads committed
sim_num_stores             19716860.0000 # total number of stores committed
sim_num_branches           52662105 # total number of branches committed
sim_elapsed_time            130 # total simulation time in seconds
sim_inst_rate              1538461.5462 # simulation speed (in insts/sec)
sim_total_insn             211150332 # total number of instructions executed
sim_total_refs              68852990 # total number of loads and stores executed
sim_total_loads             48245602 # total number of loads executed
sim_total_stores             20607388.0000 # total number of stores executed
sim_total_branches           55093698 # total number of branches executed
sim_cycle                  119078473 # total simulation time in cycles
sim_IPC                     1.6796 # instructions per cycle
sim_CPI                     0.5954 # cycles per instruction
sim_exec_BW                  1.7732 # total instructions (mis-spec + committed)

per_cycle
sim_IPB                     3.7978 # instruction per branch
IFQ_count                    303108720 # cumulative IFQ occupancy
IFQ_fcount                   63825315 # cumulative IFQ full count
ifq_occupancy                 2.5455 # avg IFQ occupancy (insn's)
ifq_rate                      1.7732 # avg IFQ dispatch rate (insn/cycle)
ifq_latency                   1.4355 # avg IFQ occupant latency (cycle's)
ifq_full                      0.5360 # fraction of time (cycle's) IFQ was full
RUU_count                     1213194358 # cumulative RUU occupancy
RUU_fcount                   32836975 # cumulative RUU full count
ruu_occupancy                  10.1882 # avg RUU occupancy (insn's)
ruu_rate                       1.7732 # avg RUU dispatch rate (insn/cycle)
ruu_latency                   5.7456 # avg RUU occupant latency (cycle's)
ruu_full                      0.2758 # fraction of time (cycle's) RUU was full
LSQ_count                     391305477 # cumulative LSQ occupancy
LSQ_fcount                   12121663 # cumulative LSQ full count
lsq_occupancy                  3.2861 # avg LSQ occupancy (insn's)
lsq_rate                        1.7732 # avg LSQ dispatch rate (insn/cycle)
lsq_latency                   1.8532 # avg LSQ occupant latency (cycle's)
```

```
et-addr stack
bpred_bimod.used_ras.PP      2075812 # total number of RAS predictions used
bpred_bimod.ras_hits.PP       2075807 # total number of RAS hits
bpred_bimod.ras_rate.PP       1.0000 # RAS prediction rate (i.e., RAS hits/used RAS
)
il1.accesses           219750692 # total number of accesses
il1.hits                214709300 # total number of hits
il1.misses              5041392 # total number of misses
il1.replacements        5041177 # total number of replacements
il1.writebacks           0 # total number of writebacks
il1.invalidations       0 # total number of invalidations
il1.miss_rate            0.0229 # miss rate (i.e., misses/ref)
il1.repl_rate             0.0229 # replacement rate (i.e., repls/ref)
il1.wb_rate               0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate               0.0000 # invalidation rate (i.e., invs/ref)
dl1.accesses           65826872 # total number of accesses
dl1.hits                65782244 # total number of hits
dl1.misses              44628 # total number of misses
dl1.replacements         42580 # total number of replacements
dl1.writebacks            35553 # total number of writebacks
dl1.invalidations        0 # total number of invalidations
dl1.miss_rate            0.0007 # miss rate (i.e., misses/ref)
dl1.repl_rate             0.0006 # replacement rate (i.e., repls/ref)
dl1.wb_rate               0.0005 # writeback rate (i.e., wrbks/ref)
dl1.inv_rate               0.0000 # invalidation rate (i.e., invs/ref)
ul2.accesses           5121573 # total number of accesses
ul2.hits                5099107 # total number of hits
ul2.misses              22466 # total number of misses
ul2.replacements         18370 # total number of replacements
ul2.writebacks            14870 # total number of writebacks
ul2.invalidations        0 # total number of invalidations
ul2.miss_rate            0.0044 # miss rate (i.e., misses/ref)
ul2.repl_rate             0.0036 # replacement rate (i.e., repls/ref)
```

```
ld_environ_base          0x7fff8000 # program environment base address address
ld_target_big_endian      0 # target executable endian-ness, non-zero if
big endian
mem.page_count            1175 # total number of pages allocated
mem.page_mem               4700k # total size of memory pages allocated
mem.ptab_misses            6279 # total first level page table misses
mem.ptab_accesses          3277289604 # total page table accesses
mem.ptab_miss_rate         0.0000 # first level page table miss rate
ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ ./sim
-outorder -cache:dl1 dl1:1024:32:4:l -fastfwd 300000000 -max:inst 200000000 equa
ke.ss<quake.in
sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of August, 2003.
Copyright (c) 1994-2003 by Todd M. Austin, Ph.D. and SimpleScalar, LLC.
All Rights Reserved. This version of SimpleScalar is licensed for academic
non-commercial use. No portion of this work may be used by any commercial
entity, or for any commercial purpose, without the prior written permission
of SimpleScalar, LLC (info@simplescalar.com).
sim: command line: ./sim-outorder -cache:dl1 dl1:1024:32:4:l -fastfwd 300000000
-max:inst 200000000 quake.ss
sim: simulation started @ Mon Nov  7 21:01:21 2016, options follow:
sim-outorder: This simulator implements a very detailed out-of-order issue
superscalar processor with a two-level memory system and speculative
execution support. This simulator is a performance simulator, tracking the
latency of all pipeline operations.
# -config                      # load configuration from a file
# -dumpconfig                   # dump configuration to a file
# -h                           false # print help message
# -v                           false # verbose operation
```

```
sim_num_stores      19716860.0000 # total number of stores committed
sim_num_branches    52662105 # total number of branches committed
sim_elapsed_time     129 # total simulation time in seconds
sim_inst_rate       1550387.6047 # simulation speed (in insts/sec)
sim_total_insn      211150332 # total number of instructions executed
sim_total_refs       68852990 # total number of loads and stores executed
sim_total_loads     48245602 # total number of loads executed
sim_total_stores    20607388.0000 # total number of stores executed
sim_total_branches   55093698 # total number of branches executed
sim_cycle           119078256 # total simulation time in cycles
sim_IPC              1.6796 # instructions per cycle
sim_CPI              0.5954 # cycles per instruction
sim_exec_BW          1.7732 # total instructions (mis-spec + committed)
per_cycle
sim_IPB              3.7978 # instruction per branch
IFQ_count            303107839 # cumulative IFQ occupancy
IFQ_fcount           63825090 # cumulative IFQ full count
ifq_occupancy        2.5455 # avg IFQ occupancy (insn's)
ifq_rate              1.7732 # avg IFQ dispatch rate (insn/cycle)
ifq_latency           1.4355 # avg IFQ occupant latency (cycle's)
ifq_full              0.5360 # fraction of time (cycle's) IFQ was full
RUU_count             1213190728 # cumulative RUU occupancy
RUU_fcount            32836729 # cumulative RUU full count
ruu_occupancy         10.1882 # avg RUU occupancy (insn's)
ruu_rate              1.7732 # avg RUU dispatch rate (insn/cycle)
ruu_latency           5.7456 # avg RUU occupant latency (cycle's)
ruu_full              0.2758 # fraction of time (cycle's) RUU was full
LSQ_count             391304591 # cumulative LSQ occupancy
LSQ_fcount            12121615 # cumulative LSQ full count
lsq_occupancy         3.2861 # avg LSQ occupancy (insn's)
lsq_rate              1.7732 # avg LSQ dispatch rate (insn/cycle)
lsq_latency           1.8532 # avg LSQ occupant latency (cycle's)
lsq_full              0.1018 # fraction of time (cycle's) LSQ was full
```

```
il1.invalidations    0 # total number of invalidations
il1.miss_rate        0.0229 # miss rate (i.e., misses/ref)
il1.repl_rate        0.0229 # replacement rate (i.e., repls/ref)
il1.wb_rate           0.0000 # writeback rate (i.e., wrbks/ref)
il1.inv_rate          0.0000 # invalidation rate (i.e., invs/ref)
dl1.accesses          65826875 # total number of accesses
dl1.hits              65782278 # total number of hits
dl1.misses            44597 # total number of misses
dl1.replacements      40501 # total number of replacements
dl1.writebacks         33836 # total number of writebacks
dl1.invalidations     0 # total number of invalidations
dl1.miss_rate         0.0007 # miss rate (i.e., misses/ref)
dl1.repl_rate         0.0006 # replacement rate (i.e., repls/ref)
dl1.wb_rate            0.0005 # writeback rate (i.e., wrbks/ref)
dl1.inv_rate          0.0000 # invalidation rate (i.e., invs/ref)
ul2.accesses          5119825 # total number of accesses
ul2.hits              5097199 # total number of hits
ul2.misses            22626 # total number of misses
ul2.replacements      18530 # total number of replacements
ul2.writebacks         14855 # total number of writebacks
ul2.invalidations     0 # total number of invalidations
ul2.miss_rate         0.0044 # miss rate (i.e., misses/ref)
ul2.repl_rate         0.0036 # replacement rate (i.e., repls/ref)
ul2.wb_rate            0.0029 # writeback rate (i.e., wrbks/ref)
ul2.inv_rate          0.0000 # invalidation rate (i.e., invs/ref)
itlb.accesses          219750692 # total number of accesses
itlb.hits              219750680 # total number of hits
itlb.misses            12 # total number of misses
itlb.replacements      0 # total number of replacements
itlb.writebacks         0 # total number of writebacks
itlb.invalidations     0 # total number of invalidations
itlb.miss_rate         0.0000 # miss rate (i.e., misses/ref)
itlb.repl_rate         0.0000 # replacement rate (i.e., repls/ref)
```
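
The `-cache:dl1` arguments in the runs above take the form `<name>:<nsets>:<bsize>:<assoc>:<repl>`, so the cache capacity is `nsets × bsize × assoc` bytes; for example, `dl1:128:32:4:l` is 128 × 32 × 4 = 16 KB. The sketch below derives the configuration string for a given capacity. It assumes every run keeps the 32-byte blocks, 4-way associativity and LRU replacement seen in the commands above, and the function name `dl1_config` is hypothetical.

```bash
#!/usr/bin/env bash
# Build a sim-outorder L1 data cache configuration string for a given capacity.
# dl1_config is a hypothetical helper; block size and associativity default to
# the values used in the runs above (32-byte blocks, 4-way, LRU).
dl1_config() {
    local size_kb=$1 bsize=${2:-32} assoc=${3:-4}
    local nsets=$(( size_kb * 1024 / (bsize * assoc) ))
    echo "dl1:${nsets}:${bsize}:${assoc}:l"
}

# Print the option for each capacity from 1KB to 128KB.
for kb in 1 2 4 8 16 32 64 128; do
    echo "${kb}KB: -cache:dl1 $(dl1_config "$kb")"
done
```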

### Performance evaluation and miss rate parameters:

The cache size, miss\_rate (dl1.miss\_rate), IPC and CPI values below are taken from the simulation output shown in the screenshots above.

| Cache size | miss_rate | IPC    | CPI    |
|------------|-----------|--------|--------|
| 1KB        | 0.0186    | 1.632  | 0.6128 |
| 2KB        | 0.0026    | 1.6757 | 0.5968 |
| 4KB        | 0.0016    | 1.6779 | 0.596  |
| 8KB        | 0.001     | 1.6791 | 0.5956 |
| 16KB       | 0.0007    | 1.6795 | 0.5954 |
| 32KB       | 0.0007    | 1.6796 | 0.5954 |
| 64KB       | 0.0007    | 1.6796 | 0.5954 |
| 128KB      | 0.0007    | 1.6796 | 0.5954 |
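
Reading these values off the screen is error-prone, so one option is to filter them out of the simulator output. This is a minimal sketch, assuming the statistics have been saved to a text file or piped in (for instance via the simulator's `-redir:sim` option). The function name `extract_stats` is illustrative only.

```bash
#!/usr/bin/env bash
# Print the dl1 miss rate, IPC and CPI from sim-outorder statistics on stdin.
# extract_stats is a hypothetical helper; it matches the stat name in column 1
# and prints the name together with the value in column 2.
extract_stats() {
    awk '$1 == "dl1.miss_rate" || $1 == "sim_IPC" || $1 == "sim_CPI" { print $1, $2 }'
}

# Sample lines copied from the first statistics listing shown earlier.
extract_stats <<'EOF'
sim_cycle                119114726 # total simulation time in cycles
sim_IPC                  1.6791 # instructions per cycle
sim_CPI                  0.5956 # cycles per instruction
dl1.misses                    64044 # total number of misses
dl1.miss_rate                  0.0010 # miss rate (i.e., misses/ref)
EOF
```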





From the table and the graphs, the performance of the processor is nearly constant from a cache size of 16 KB upward: the miss rate settles at 0.0007 and the IPC at about 1.6795. The IPC at 32 KB (1.6796) is marginally higher than at 16 KB (1.6795), and no larger cache improves on it. Hence **32 KB is the optimal L1 data cache size.**
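
The diminishing return can be quantified as the relative IPC gain from each doubling of the cache. The sketch below uses the IPC column of the table above; `pct_gain` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Relative IPC improvement (in percent) between consecutive cache sizes.
# pct_gain is a hypothetical helper: it prints (cur - prev) / prev * 100.
pct_gain() {
    awk -v prev="$1" -v cur="$2" 'BEGIN { printf "%.2f\n", (cur - prev) / prev * 100 }'
}

sizes=(1KB 2KB 4KB 8KB 16KB 32KB 64KB 128KB)
ipc=(1.632 1.6757 1.6779 1.6791 1.6795 1.6796 1.6796 1.6796)  # IPC column of the table

for (( i = 1; i < ${#ipc[@]}; i++ )); do
    echo "${sizes[i-1]} -> ${sizes[i]}: $(pct_gain "${ipc[i-1]}" "${ipc[i]}")%"
done
```

With these numbers the gain drops from about 2.68% for the first doubling to roughly 0.01% or less from 16 KB onward.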

4. The performance variation of the program must be measured for every 10 million instructions during the execution of the first 500 million instructions.

A **for loop** is used to run the simulations. Each loop covers a range of 100 million instructions, with the fast-forward count incremented by 10 million instructions on each iteration. Five such loops cover the first 500 million instructions. Each simulation fast-forwards to its starting point and then executes 10 million instructions.

The commands used are as follows:

```
for ((x=0; x<100000000; x=x+10000000)); do ./sim-outorder -fastfwd $x -max:inst 10000000 quake.ss < quake.in; done
```

```
for ((x=100000001; x<200000000; x=x+10000000)); do ./sim-outorder -fastfwd $x -max:inst 10000000 quake.ss < quake.in; done
```

```
for ((x=200000001; x<300000000; x=x+10000000)); do ./sim-outorder -fastfwd $x -max:inst 10000000 quake.ss < quake.in; done
```

```
for ((x=300000001; x<400000000; x=x+10000000)); do ./sim-outorder -fastfwd $x -max:inst 10000000 quake.ss < quake.in; done
```

```
for ((x=400000001; x<500000000; x=x+10000000)); do ./sim-outorder -fastfwd $x -max:inst 10000000 quake.ss < quake.in; done
```
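
The five loops above can also be collapsed into one loop that walks the first 500 million instructions in windows of 10 million. The sketch below only prints the commands (a dry run) so they can be checked before starting the long simulation; removing the `echo` would execute them. It uses round offsets starting at 0, unlike the loops above. The function name `window_offsets` and the `ipc_<offset>.txt` file names are assumptions for illustration, and `-redir:sim` is the simulator option that redirects its output to a file.

```bash
#!/usr/bin/env bash
# Dry run: list one sim-outorder command per 10M-instruction window.
# window_offsets is a hypothetical helper printing the fast-forward offsets
# 0, step, 2*step, ... that are below total.
window_offsets() {
    local total=$1 step=$2 x
    for (( x = 0; x < total; x += step )); do
        echo "$x"
    done
}

for x in $(window_offsets 500000000 10000000); do
    # echo makes this a dry run; drop it to actually launch the simulator.
    echo "./sim-outorder -redir:sim ipc_${x}.txt -fastfwd $x -max:inst 10000000 quake.ss < quake.in"
done
```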

The sample screenshots for the execution are as shown below:

```
dtlb.accesses          3767983 # total number of accesses
dtlb.hits              3767955 # total number of hits
dtlb.misses            28 # total number of misses
dtlb.replacements      0 # total number of replacements
dtlb.writebacks         0 # total number of writebacks
dtlb.invalidations    0 # total number of invalidations
dtlb.miss_rate         0.0000 # miss rate (i.e., misses/ref)
dtlb.repl_rate          0.0000 # replacement rate (i.e., repls/ref)
dtlb.wb_rate             0.0000 # writeback rate (i.e., wrbks/ref)
dtlb.inv_rate            0.0000 # invalidation rate (i.e., invs/ref)
sim_invalid_addrs     0 # total non-speculative bogus addresses seen
(debug var)
ld_text_base           0x00400000 # program text (code) segment base
ld_text_size            132784 # program text (code) size in bytes
ld_data_base            0x10000000 # program initialized data segment base
ld_data_size             16384 # program init'ed '.data' and uninit'ed '.bss' size in bytes
ld_stack_base           0x7fffc000 # program stack segment base (highest address in stack)
ld_stack_size            16384 # program initial stack size
ld_prog_entry           0x00400140 # program entry point (initial PC)
ld_environ_base          0x7fff8000 # program environment base address address
ld_target_big_endian    0 # target executable endian-ness, non-zero if
big endian
mem.page_count          328 # total number of pages allocated
mem.page_mem              1312k # total size of memory pages allocated
mem.ptab_misses          328 # total first level page table misses
mem.ptab_accesses        678336654 # total page table accesses
mem.ptab_miss_rate       0.0000 # first level page table miss rate
ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ for((x=0; x<100000000; x=x+10000000)); do ./sim-outorder -fastfwd $x -max:inst 100000000 quake.ss < quake.in; done
```

```
mem.page_count          1117 # total number of pages allocated
mem.page_mem              4468k # total size of memory pages allocated
mem.ptab_misses          6221 # total first level page table misses
mem.ptab_accesses        1966987028 # total page table accesses
mem.ptab_miss_rate       0.0000 # first level page table miss rate
ramkiran@ramkiran-virtual-machine:~/Desktop/Simplescalar/rk/simplesim-3.0$ for((x=100000001; x<200000000; x=x+10000000)); do ./sim-outorder -fastfwd $x -max:inst 100000000 quake.ss < quake.in; done
sim-outorder: SimpleScalar/PISA Tool Set version 3.0 of August, 2003.
Copyright (c) 1994-2003 by Todd M. Austin, Ph.D. and SimpleScalar, LLC.
All Rights Reserved. This version of SimpleScalar is licensed for academic
non-commercial use. No portion of this work may be used by any commercial
entity, or for any commercial purpose, without the prior written permission
of SimpleScalar, LLC (info@simplescalar.com).
sim: command line: ./sim-outorder -fastfwd 100000001 -max:inst 100000000 quake.ss
sim: simulation started @ Tue Nov  8 01:28:30 2016, options follow:
sim-outorder: This simulator implements a very detailed out-of-order issue
superscalar processor with a two-level memory system and speculative
execution support. This simulator is a performance simulator, tracking the
latency of all pipeline operations.
# -config                      # load configuration from a file
# -dumpconfig                   # dump configuration to a file
# -h                           false # print help message
# -v                           false # verbose operation
# -d                           false # enable debug message
# -i                           false # start in Dlite debugger
-seed                          1 # random number generator seed (0 for timer seed)
```

```
sim: ** fast forwarding 290000001 insts **
quake00: Reading nodes.
quake00: Reading elements.
sim: ** starting performance simulation **

sim: ** simulation statistics **
sim_num_insn          10000000 # total number of instructions committed
sim_num_refs           3286724 # total number of loads and stores committed
sim_num_loads          2287948 # total number of loads committed
sim_num_stores          998776.0000 # total number of stores committed
sim_num_branches        2625356 # total number of branches committed
sim_elapsed_time         21 # total simulation time in seconds
sim_inst_rate          476190.4762 # simulation speed (in insts/sec)
sim_total_insn          10577039 # total number of instructions executed
sim_total_refs          3477052 # total number of loads and stores executed
sim_total_loads          2432207 # total number of loads executed
sim_total_stores          1044845.0000 # total number of stores executed
sim_total_branches        2751156 # total number of branches executed
sim_cycle                6010839 # total simulation time in cycles
sim_IPC                  1.6637 # instructions per cycle
sim_CPI                  0.6011 # cycles per instruction
sim_exec_BW              1.7597 # total instructions (mis-spec + committed) per cycle
sim_IPB                  3.8090 # instruction per branch
IFQ_count                15135085 # cumulative IFQ occupancy
IFQ_fcount                3183125 # cumulative IFQ full count
ifq_occupancy             2.5180 # avg IFQ occupancy (insn's)
ifq_rate                   1.7597 # avg IFQ dispatch rate (insn/cycle)
ifq_latency                 1.4309 # avg IFQ occupant latency (cycle's)
ifq_full                   0.5296 # fraction of time (cycle's) IFQ was full
RUU_count                60603618 # cumulative RUU occupancy
RUU_fcount                1608421 # cumulative RUU full count
ruu_occupancy              10.0824 # avg RUU occupancy (insn's)
```

```
A unified l2 cache (il2 is pointed at dl2):
-cache:il1 il1:128:64:1:l -cache:il2 dl2
-cache:dl1 dl1:256:32:1:l -cache:dl2 ul2:1024:64:2:l

Or, a fully unified cache hierarchy (il1 pointed at dl1):
-cache:il1 dl1
-cache:dl1 ul1:256:32:1:l -cache:dl2 ul2:1024:64:2:l

sim: ** fast forwarding 310000001 insts **
quake00: Reading nodes.
quake00: Reading elements.
sim: ** starting performance simulation **

sim: ** simulation statistics **
sim_num_insn          10000000 # total number of instructions committed
sim_num_refs           3292287 # total number of loads and stores committed
sim_num_loads          2290954 # total number of loads committed
sim_num_stores          1001333.0000 # total number of stores committed
sim_num_branches        2623825 # total number of branches committed
sim_elapsed_time         20 # total simulation time in seconds
sim_inst_rate          500000.0000 # simulation speed (in insts/sec)
sim_total_insn          10581462 # total number of instructions executed
sim_total_refs          3484005 # total number of loads and stores executed
sim_total_loads          2436230 # total number of loads executed
sim_total_stores          1047775.0000 # total number of stores executed
sim_total_branches        2750744 # total number of branches executed
sim_cycle                6021591 # total simulation time in cycles
sim_IPC                  1.6607 # instructions per cycle
sim_CPI                  0.6022 # cycles per instruction
sim_exec_BW              1.7573 # total instructions (mis-spec + committed)
```

```
sim: ** fast forwarding 420000001 insts **
quake00: Reading nodes.
quake00: Reading elements.
sim: ** starting performance simulation **

sim: ** simulation statistics **
sim_num_insn          10000000 # total number of instructions committed
sim_num_refs           3246782 # total number of loads and stores committed
sim_num_loads          2266440 # total number of loads committed
sim_num_stores          980342.0000 # total number of stores committed
sim_num_branches        2636406 # total number of branches committed
sim_elapsed_time         25 # total simulation time in seconds
sim_inst_rate          400000.0000 # simulation speed (in insts/sec)
sim_total_insn          10549497 # total number of instructions executed
sim_total_refs          3428027 # total number of loads and stores executed
sim_total_loads          2403812 # total number of loads executed
sim_total_stores          1024215.0000 # total number of stores executed
sim_total_branches        2756203 # total number of branches executed
sim_cycle               5938925 # total simulation time in cycles
sim_IPC                 1.6838 # instructions per cycle
sim_CPI                 0.5939 # cycles per instruction
sim_exec_BW              1.7763 # total instructions (mis-spec + committed) per cycle
sim_IPB                 3.7930 # instruction per branch
IFQ_count                15174410 # cumulative IFQ occupancy
IFQ_fcount                3197270 # cumulative IFQ full count
ifq_occupancy              2.5551 # avg IFQ occupancy (insn's)
ifq_rate                  1.7763 # avg IFQ dispatch rate (insn/cycle)
ifq_latency                1.4384 # avg IFQ occupant latency (cycle's)
ifq_full                   0.5384 # fraction of time (cycle's) IFQ was full
RUU_count                60729001 # cumulative RUU occupancy
```

```
sim: ** fast forwarding 130000001 insts **
quake00: Reading nodes.
sim: ** starting performance simulation **

sim: ** simulation statistics **
sim_num_insn          10000000 # total number of instructions committed
sim_num_refs           3699831 # total number of loads and stores committed
sim_num_loads          2460691 # total number of loads committed
sim_num_stores          1239140.0000 # total number of stores committed
sim_num_branches        1935886 # total number of branches committed
sim_elapsed_time         13 # total simulation time in seconds
sim_inst_rate          769230.7692 # simulation speed (in insts/sec)
sim_total_insn          10595314 # total number of instructions executed
sim_total_refs          3862307 # total number of loads and stores executed
sim_total_loads          2573959 # total number of loads executed
sim_total_stores          1288348.0000 # total number of stores executed
sim_total_branches        2119052 # total number of branches executed
sim_cycle                8324443 # total simulation time in cycles
sim_IPC                 1.2013 # instructions per cycle
sim_CPI                 0.8324 # cycles per instruction
sim_exec_BW              1.2728 # total instructions (mis-spec + committed) per cycle
sim_IPB                 5.1656 # instruction per branch
IFQ_count                18307873 # cumulative IFQ occupancy
IFQ_fcount                3959143 # cumulative IFQ full count
ifq_occupancy              2.1993 # avg IFQ occupancy (insn's)
ifq_rate                  1.2728 # avg IFQ dispatch rate (insn/cycle)
ifq_latency                1.7279 # avg IFQ occupant latency (cycle's)
ifq_full                   0.4756 # fraction of time (cycle's) IFQ was full
RUU_count                72397837 # cumulative RUU occupancy
RUU_fcount                2091510 # cumulative RUU full count
ruu_occupancy              8.6970 # avg RUU occupancy (insn's)
```




```

Or, a fully unified cache hierarchy (il1 pointed at dl1):
-cache:il1 dl1
-cache:dl1 ul1:256:32:1:l -cache:dl2 ul2:1024:64:2:l

sim: ** fast forwarding 50000000 insts **
equake00: Reading nodes.
sim: ** starting performance simulation **

sim: ** simulation statistics **
sim_num_insn          10000002 # total number of instructions committed
sim_num_refs           3715260 # total number of loads and stores committed
sim_num_loads          2472446 # total number of loads committed
sim_num_stores          1242814.0000 # total number of stores committed
sim_num_branches        1965167 # total number of branches committed
sim_elapsed_time         9 # total simulation time in seconds
sim_inst_rate          1111111.3333 # simulation speed (in insts/sec)
sim_total_insn          10645204 # total number of instructions executed
sim_total_refs          3881982 # total number of loads and stores executed
sim_total_loads          2589081 # total number of loads executed
sim_total_stores         1292901.0000 # total number of stores executed
sim_total_branches       2159425 # total number of branches executed
sim_cycle                8232381 # total simulation time in cycles
sim_IPC                  1.2147 # instructions per cycle
sim_CPI                  0.8232 # cycles per instruction
sim_exec_BW               1.2931 # total instructions (mis-spec + committed) per cycle
sim_IPB                  5.0886 # instruction per branch
IFQ_count                18220711 # cumulative IFQ occupancy
IFQ_fcount                3935098 # cumulative IFQ full count
ifq_occupancy              2.2133 # avg IFQ occupancy (insn's)

```
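Collecting the IPC and CPI of each run by hand is error-prone, so the values could instead be pulled out of the simulator output with a short script. The following is only a sketch in Python, not the method used for this report: the function name `parse_run` is hypothetical, and the row label assumes each run measures a 10-million-instruction window that starts right after the fast-forwarded instructions (which matches the `sim_num_insn` values shown above).

```python
import re

# "sim: ** fast forwarding 290000001 insts **"
FASTFWD_RE = re.compile(r"fast forwarding (\d+) insts")
# "sim_IPC   1.6637 # instructions per cycle" (same layout for sim_CPI)
STAT_RE = re.compile(r"^(sim_IPC|sim_CPI)\s+([0-9.]+)", re.MULTILINE)


def parse_run(output):
    """Return (row label in millions, IPC, CPI) for one sim-outorder run."""
    skipped = int(FASTFWD_RE.search(output).group(1))
    stats = {name: float(value) for name, value in STAT_RE.findall(output)}
    # Assumption: a 10M-instruction window follows the skipped instructions,
    # so the row label is the skipped count in millions plus 10.
    row = skipped // 1_000_000 + 10
    return row, stats["sim_IPC"], stats["sim_CPI"]


# Excerpt of one run's output, copied from the report.
sample = """\
sim: ** fast forwarding 290000001 insts **
sim_cycle                6010839 # total simulation time in cycles
sim_IPC                  1.6637 # instructions per cycle
sim_CPI                  0.6011 # cycles per instruction
"""

result = parse_run(sample)
print(result)
```

Applied to the saved output of every run, this would give one (row, IPC, CPI) triple per 10-million-instruction window.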

From the execution of the five commands, we obtain 50 sets of IPC and CPI values, which are recorded in the table below.

| Instruction # (in millions) | IPC    | CPI    |
|-----------------------------|--------|--------|
| 10                          | 1.3036 | 0.7671 |
| 20                          | 1.209  | 0.8271 |
| 30                          | 1.2153 | 0.8228 |
| 40                          | 1.2121 | 0.825  |
| 50                          | 1.2175 | 0.8214 |
| 60                          | 1.2147 | 0.8232 |
| 70                          | 1.2167 | 0.8219 |
| 80                          | 1.2174 | 0.8214 |
| 90                          | 1.2154 | 0.8228 |
| 100                         | 1.2105 | 0.8261 |
| 110                         | 1.1972 | 0.8353 |
| 120                         | 1.2001 | 0.8333 |
| 130                         | 1.1992 | 0.8339 |
| 140                         | 1.2013 | 0.8324 |
| 150                         | 1.1976 | 0.835  |
| 160                         | 1.1998 | 0.8334 |
| 170                         | 1.1883 | 0.8415 |
| 180                         | 0.9855 | 1.0147 |
| 190                         | 1.1958 | 0.8363 |
| 200                         | 1.6368 | 0.611  |
| 210 | 1.6544 | 0.6045 |
| 220 | 1.6587 | 0.6029 |
| 230 | 1.6646 | 0.6008 |
| 240 | 1.6657 | 0.6004 |
| 250 | 1.6653 | 0.6005 |
| 260 | 1.6657 | 0.6003 |
| 270 | 1.6655 | 0.6004 |
| 280 | 1.666  | 0.6002 |
| 290 | 1.6646 | 0.6008 |
| 300 | 1.6637 | 0.6011 |
| 310 | 1.6643 | 0.6009 |
| 320 | 1.6607 | 0.6022 |
| 330 | 1.6639 | 0.601  |
| 340 | 1.6656 | 0.6004 |
| 350 | 1.6685 | 0.5993 |
| 360 | 1.6815 | 0.5947 |
| 370 | 1.6812 | 0.5948 |
| 380 | 1.6819 | 0.5946 |
| 390 | 1.6832 | 0.5941 |
| 400 | 1.684  | 0.5938 |
| 410 | 1.6817 | 0.5946 |
| 420 | 1.6822 | 0.5944 |
| 430 | 1.6838 | 0.5939 |
| 440 | 1.6834 | 0.594  |
| 450 | 1.6816 | 0.5947 |
| 460 | 1.6832 | 0.5941 |
| 470 | 1.6836 | 0.594  |
| 480 | 1.6805 | 0.5951 |
| 490 | 1.6798 | 0.5953 |
| 500 | 1.6812 | 0.5948 |
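Since CPI is the reciprocal of IPC, the product of the two columns should be close to 1 in every row, which gives a quick way to catch transcription mistakes. The sketch below, in Python, checks a handful of rows copied from the table; the function name `max_reciprocal_error` and the tolerance of 0.001 are illustrative choices, not part of the project.

```python
# (row label in millions, IPC, CPI) for a few rows copied from the table.
rows = [
    (10, 1.3036, 0.7671),
    (180, 0.9855, 1.0147),
    (190, 1.1958, 0.8363),
    (200, 1.6368, 0.6110),
    (500, 1.6812, 0.5948),
]


def max_reciprocal_error(samples):
    """Largest deviation of IPC * CPI from 1 over the given rows."""
    return max(abs(ipc * cpi - 1.0) for _, ipc, cpi in samples)


err = max_reciprocal_error(rows)
# Both columns are rounded to four decimals, so a small error is expected.
print("consistent" if err < 1e-3 else "check the table", err)
```

A row whose product is far from 1 would point to a value that was misread from the terminal output.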

The IPC and CPI values are plotted against the instruction count, as shown in the chart below.

