Trace analyzer space usage #29

Open · fedorova opened this issue Oct 5, 2023 · 7 comments
@fedorova commented Oct 5, 2023

Hi folks, I am using the trace analyzer to compute reuse distance on a 371MB trace. The analyzer keeps running but never finishes, because it runs out of disk space. For example, the last time I tried, my .reuseWindow_w300_rt file grew to 91GB, filling up the remaining space in my file system. At that point the trace analyzer hung, neither quitting nor producing any info messages.

Is it normal to generate a 91GB+ .reuseWindow_w300_rt file for a 371MB trace? Is there a way to run the trace analyzer differently so that it actually completes?

@1a1a11a (Owner) commented Oct 6, 2023

No, that is not normal. Can you show the command you used?

@1a1a11a (Owner) commented Oct 6, 2023

A few other comments:

  1. The traceAnalyzer was recently merged from my other work and does not have any tests yet, so it may have bugs, sorry...
  2. If you just need reuse distance, you can disable the window-based calculation by using --common instead of --reuse (see the sketch after this list).
  3. The reuse distance calculation in traceAnalyzer is not "stack distance"; it is the number of requests / seconds between two accesses of an object.
  4. If you need a stack distance calculation, we have a tool called distUtil; you can try ./bin/distUtil ../data/trace.vscsi vscsi stack_dist txt trace. More usage can be found using --help.
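A minimal sketch of both invocations, run from the build directory; the CSV trace path is a placeholder, and the column options are the ones used later in this thread:

    # Reuse-distance statistics without the per-window output (suggestion 2):
    ./bin/traceAnalyzer /path/to/trace.csv csv \
        -t "time-col=1, obj-id-col=2, obj-size-col=3" --common

    # Stack-distance computation with distUtil (suggestion 4):
    ./bin/distUtil ../data/trace.vscsi vscsi stack_dist txt trace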

I hope this helps.

@fedorova (Author) commented Oct 6, 2023

I ran the following command:

traceAnalyzer <trace_path> csv -t "time-col=1, obj-id-col=2, obj-size-col=3" --all

I saw the same pattern if I used --reuse instead of --all.

The command completed on a 308MB trace after generating a 132GB <>.csv.reuseWindow_w300_rt file, but the <>.csv.reuse file has zero bytes in it, as do .csv.accessRtime, .csv.accessVtime, .csv.popularity, and .csv.size.

@1a1a11a (Owner) commented Oct 8, 2023

I am not able to reproduce this problem; do you mind sharing a few lines of the input file?
A few other suggestions:

  1. The CSV reader is not robust and may run into problems if the trace is not well-formatted (e.g., a missing delimiter), so try printing the trace using bin/tracePrint to check whether the CSV trace is parsed correctly (see the sketch after this list).
  2. I was wrong about --reuse: it does generate the reuseWindow file, so just use --common or take a look at the distUtil tool.

:)
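A minimal sketch of the tracePrint check. It assumes tracePrint accepts the same trace-path, trace-type, and -t reader options as traceAnalyzer; check ./bin/tracePrint --help for the exact interface. Piping through head keeps the output short:

    # Print parsed requests to verify the CSV columns are read correctly.
    # Assumption: the positional arguments and -t options mirror the
    # traceAnalyzer command from this thread; the trace path is a placeholder.
    ./bin/tracePrint /path/to/trace.csv csv \
        -t "time-col=1, obj-id-col=2, obj-size-col=3" | head -20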

@fedorova (Author) commented

> I am not able to reproduce this problem; do you mind sharing a few lines of the input file?

Here are a few lines of my CSV file:

1695337878208738,4096,4096
1695337878208830,4096,4096
1695337878208853,4096,4096
1695337878208927,4096,4096
1695337878208942,4096,4096
1695337878208990,4096,4096
1695337878209258,4096,4096
1695337878209452,4096,4096
1695337878209471,4096,4096
1695337878209482,4096,4096
1695337878209580,4096,4096
1695337878209650,4096,4096
1695337878209774,4096,4096
1695337878209851,4096,4096
1695337878209866,4096,4096
1695337878209882,4096,4096
1695337878209928,4096,4096
1695337878209939,4096,4096
1695337878209949,4096,4096
1695337878209959,4096,4096
1695337878209970,4096,4096
1695337878210009,4096,4096
1695337878222642,4096,4096
1695337878275331,1237467136,4096
1695337878275356,1237467136,4096
1695337878275384,1236574208,4096

@fedorova (Author) commented

Here is the entire gzipped trace: https://people.ece.ubc.ca/~sasha/TMP/evict-btree.csv.gz

@1a1a11a (Owner) commented Oct 15, 2023

Thank you for sharing the trace!

  1. Large output
    The large output is caused by the long time range.
    The default time unit is seconds and the time window is 300 seconds.
    The trace spans 120592826 "seconds" (I guess the time unit is not seconds? the timestamps look like epoch microseconds), so the analyzer creates roughly 120592826 / 300 ≈ 400,000 windows, which is why the output is so large.
    I would suggest converting the timestamps to seconds, changing the time_window to a larger value, or just skipping this computation.

  2. Incorrect results
    In debug mode (CMAKE_BUILD_TYPE=Debug), the binary crashes at

    DEBUG_ASSERT(curr_time_window_idx == time_to_window_idx(req->clock_time));

    which suggests that the trace is not time-ordered; for example, the following lines are out of order (note the third timestamp):

1695337878371012,1236004864,4096
1695337878371038,1236008960,4096
1695337878371030,671416320,28672
1695337878371045,112283648,28672
1695337878371071,112398336,28672

After sorting the data, the analysis can finish without any issue.
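A minimal sketch of the sort using coreutils, assuming the decompressed name of the trace file linked above and a numeric sort on the first (timestamp) column:

    # Sort numerically on field 1 (the timestamp); -t, sets the comma
    # delimiter and -k1,1n restricts the numeric sort key to column 1.
    sort -t, -k1,1n evict-btree.csv > evict-btree.sorted.csv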

One minor suggestion: when using CSV traces, if the object id (e.g., a block address) is numeric, adding obj-id-is-num=1 to the trace-type options will reduce memory usage and run time. An example command follows.
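A sketch combining the options suggested in this thread; the sorted trace file name is hypothetical (it matches the sort sketch above):

    # --common avoids the huge per-window output; obj-id-is-num=1 treats
    # the block addresses as numbers instead of strings.
    ./bin/traceAnalyzer evict-btree.sorted.csv csv \
        -t "time-col=1, obj-id-col=2, obj-size-col=3, obj-id-is-num=1" --common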

I hope this helps. Thank you for reporting the issue!
