Trace analyzer space usage #29

Open · fedorova opened this issue Oct 5, 2023 · 7 comments
@fedorova commented Oct 5, 2023

Hi folks, I am using the trace analyzer to compute reuse distance on a 371MB trace. The analyzer keeps running but never finishes, because it runs out of disk space. For example, the last time I tried, my .reuseWindow_w300_rt file grew to 91GB, filling up the remaining space in my file system. At that point the trace analyzer hung, neither quitting nor producing any info messages.

Is it normal to generate a 91GB+ .reuseWindow_w300_rt file for a 371MB trace? Is there a way to run the trace analyzer differently so that it actually completes?

@1a1a11a (Owner) commented Oct 6, 2023

No, that is not normal. Can you show the command you used?

@1a1a11a (Owner) commented Oct 6, 2023

A few other comments:

  1. The traceAnalyzer was recently merged from my other work and does not have any tests yet, so it may have bugs, sorry...
  2. If you just need reuse distance, you can disable the window-based calculation by using --common instead of --reuse (see the sketch after this list).
  3. The reuse distance calculation in traceAnalyzer is not "stack distance"; it is the number of requests / seconds between two accesses of an object.
  4. If you need a stack distance calculation, we have a tool called distUtil; you can try ./bin/distUtil ../data/trace.vscsi vscsi stack_dist txt trace. More usage can be found using --help.
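A minimal sketch of both invocations, run from the build directory; the CSV trace path is a placeholder, and the column options are the ones used later in this thread:

    # Reuse-distance statistics without the per-window output (suggestion 2):
    ./bin/traceAnalyzer /path/to/trace.csv csv \
        -t "time-col=1, obj-id-col=2, obj-size-col=3" --common

    # Stack-distance computation with distUtil (suggestion 4):
    ./bin/distUtil ../data/trace.vscsi vscsi stack_dist txt trace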

I hope this helps.

@fedorova (Author) commented Oct 6, 2023

I ran the following command:

traceAnalyzer <trace_path> csv -t "time-col=1, obj-id-col=2, obj-size-col=3" --all

I saw the same pattern if I used --reuse instead of --all.

The command completed on a 308MB trace after generating a 132GB <>.csv.reuseWindow_w300_rt file, but the <>.csv.reuse file has zero bytes in it, as do .csv.accessRtime, .csv.accessVtime, .csv.popularity, and .csv.size.

@1a1a11a (Owner) commented Oct 8, 2023

I am not able to reproduce this problem; do you mind sharing a few lines of the input file?
A few other suggestions:

  1. The CSV reader is not robust and may run into problems if the trace is not well-formatted (e.g., a missing delimiter), so try printing the trace using bin/tracePrint to check whether the CSV trace is parsed correctly (see the sketch after this list).
  2. I was wrong about --reuse: it does generate the reuseWindow file, so just use --common or take a look at the distUtil tool.

:)
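A minimal sketch of the tracePrint check. It assumes tracePrint accepts the same trace-path, trace-type, and -t reader options as traceAnalyzer; check ./bin/tracePrint --help for the exact interface. Piping through head keeps the output short:

    # Print parsed requests to verify the CSV columns are read correctly.
    # Assumption: the positional arguments and -t options mirror the
    # traceAnalyzer command from this thread; the trace path is a placeholder.
    ./bin/tracePrint /path/to/trace.csv csv \
        -t "time-col=1, obj-id-col=2, obj-size-col=3" | head -20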

@fedorova (Author) commented

> I am not able to reproduce this problem; do you mind sharing a few lines of the input file?

Here are a few lines of my CSV file:

1695337878208738,4096,4096
1695337878208830,4096,4096
1695337878208853,4096,4096
1695337878208927,4096,4096
1695337878208942,4096,4096
1695337878208990,4096,4096
1695337878209258,4096,4096
1695337878209452,4096,4096
1695337878209471,4096,4096
1695337878209482,4096,4096
1695337878209580,4096,4096
1695337878209650,4096,4096
1695337878209774,4096,4096
1695337878209851,4096,4096
1695337878209866,4096,4096
1695337878209882,4096,4096
1695337878209928,4096,4096
1695337878209939,4096,4096
1695337878209949,4096,4096
1695337878209959,4096,4096
1695337878209970,4096,4096
1695337878210009,4096,4096
1695337878222642,4096,4096
1695337878275331,1237467136,4096
1695337878275356,1237467136,4096
1695337878275384,1236574208,4096

@fedorova (Author) commented

Here is the entire gzipped trace: https://people.ece.ubc.ca/~sasha/TMP/evict-btree.csv.gz

@1a1a11a (Owner) commented Oct 15, 2023

Thank you for sharing the trace!

  1. Large output
    The large output is caused by the long time range.
    The default time unit is seconds and the time window is 300 seconds.
    The trace spans 120592826 "seconds" (I guess the time unit is not seconds? the timestamps look like epoch microseconds), so the analyzer creates roughly 120592826 / 300 ≈ 400,000 windows, which is why the output is so large.
    I would suggest converting the timestamps to seconds, changing the time_window to a larger value, or just skipping this computation.

  2. Incorrect results
    In debug mode (CMAKE_BUILD_TYPE=Debug), the binary crashes at

    DEBUG_ASSERT(curr_time_window_idx == time_to_window_idx(req->clock_time));

    which suggests that the trace is not time-ordered; for example, the following lines are out of order (note the third timestamp):

1695337878371012,1236004864,4096
1695337878371038,1236008960,4096
1695337878371030,671416320,28672
1695337878371045,112283648,28672
1695337878371071,112398336,28672

After sorting the data, the analysis can finish without any issue.
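A minimal sketch of the sort using coreutils, assuming the decompressed name of the trace file linked above and a numeric sort on the first (timestamp) column:

    # Sort numerically on field 1 (the timestamp); -t, sets the comma
    # delimiter and -k1,1n restricts the numeric sort key to column 1.
    sort -t, -k1,1n evict-btree.csv > evict-btree.sorted.csv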

One minor suggestion: when using CSV traces, if the object id (e.g., a block address) is numeric, adding obj-id-is-num=1 to the trace-type options will reduce memory usage and run time. An example command follows.
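A sketch combining the options suggested in this thread; the sorted trace file name is hypothetical (it matches the sort sketch above):

    # --common avoids the huge per-window output; obj-id-is-num=1 treats
    # the block addresses as numbers instead of strings.
    ./bin/traceAnalyzer evict-btree.sorted.csv csv \
        -t "time-col=1, obj-id-col=2, obj-size-col=3, obj-id-is-num=1" --common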

I hope this helps. Thank you for reporting the issue!
