qDepth Support for NVM Cache #303

Open
Alphacode18 opened this issue Apr 5, 2024 · 6 comments

@Alphacode18

Alphacode18 commented Apr 5, 2024

Hi,

I have been exploring cachelib for an experiment in which a single cachebench thread sends requests to an underlying SSD (using nvmCache) at a given queue depth (qDepth). However, despite setting navyQDepth in my configuration file, I observe that cachebench never goes beyond 1 in-flight request.

Here's my configuration file:

{
  "cache_config" : {
    "cacheSizeMB" : 100,
    "numPools" : 1,
    "nvmCacheSizeMB": 40960,
    "nvmCachePaths": ["/mnt/nvme0n1/cachelib/testfile"],
    "navyQDepth": 8,
    "navyEnableIoUring": false,
    "navyBlockSize": 4096
  },
  "test_config" : {
    "enableLookaside": true,
    "numThreads" : 1,
    "numKeys" : 1000000,
    "numOps" : 5000000,
    "distribution" : "range",
    "generator": "workload",
    "keySizeRange" : [16, 16],
    "keySizeRangeProbability" : [1],
    "valSizeRange" : [128, 128],
    "valSizeRangeProbability" : [1],
    "setRatio" : 0.0,
    "delRatio" : 0.0,
    "loneGetRatio" : 0.0,
    "getRatio" : 1
  }
}

I have also added logging to the submitIo function in cachelib/navy/common/Device.cpp to see the number of in-flight requests. Adding the output here for clarity:

I0405 00:07:43.816907 305415 Device.cpp:839] [ctx_0] Submit I/O Pre Submission queue depth: 8 outstanding requests: 0; submitted requests: 162
I0405 00:07:43.816933 305415 Device.cpp:863] [ctx_0] Submit I/O Post Submission: queue depth: 8 outstanding requests: 1; submitted requests: 163
I0405 00:07:43.817930 305415 Device.cpp:799] [ctx_0] Handle Completion: outstanding requests: 1; completed requests: 164
I0405 00:07:43.817938 305415 Device.cpp:839] [ctx_0] Submit I/O Pre Submission queue depth: 8 outstanding requests: 0; submitted requests: 165
I0405 00:07:43.817968 305415 Device.cpp:863] [ctx_0] Submit I/O Post Submission: queue depth: 8 outstanding requests: 1; submitted requests: 166
I0405 00:07:43.818650 305415 Device.cpp:799] [ctx_0] Handle Completion: outstanding requests: 0; completed requests: 167
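For a long run, it may be easier to summarize these lines than to read them one by one. The following is a minimal sketch, assuming the custom log format shown above (the `outstanding requests: N` field comes from the logging added for this issue, not from stock cachelib); the helper name `max_outstanding` is hypothetical.

```python
import re

# Matches the "outstanding requests: N" field in the custom log lines above.
OUTSTANDING_RE = re.compile(r"outstanding requests: (\d+)")


def max_outstanding(log_lines):
    """Return the largest 'outstanding requests' value seen, or None if absent."""
    values = [
        int(m.group(1))
        for line in log_lines
        if (m := OUTSTANDING_RE.search(line))
    ]
    return max(values) if values else None


# Two of the lines from the log excerpt above.
sample = [
    "I0405 00:07:43.816907 305415 Device.cpp:839] [ctx_0] Submit I/O Pre Submission "
    "queue depth: 8 outstanding requests: 0; submitted requests: 162",
    "I0405 00:07:43.816933 305415 Device.cpp:863] [ctx_0] Submit I/O Post Submission: "
    "queue depth: 8 outstanding requests: 1; submitted requests: 163",
]
print(max_outstanding(sample))
```

If the reported maximum stays at 1 while the configured queue depth is 8, the device is never seeing more than one request at a time.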

Any help would be greatly appreciated!

@jaesoo-fb
Contributor

jaesoo-fb commented Apr 5, 2024

Hi @Alphacode18

To use concurrent IOs, you need to enable async multitasking at the Navy layer (navy-async) by enabling NavyRequestScheduler; refer to this and this for details. Note that qdepth will be set automatically if you enable navy-async (ref).

If you override the qdepth with some value >1 without enabling navy-async, you would hit this assertion in a debug build.

@Alphacode18
Author

Hi @jaesoo-fb

Thank you for your prompt response. I see. I am now able to set qDepth automatically by varying the navyMaxNumReads and navyMaxNumWrites, with navyReaderThreads and navyWriterThreads set to 1.

Would you know if NavyRequestScheduler is a parameter I can set using the config file? Maybe I am missing something, but I can't seem to set it correctly.

Thank you for all your help!

@jaesoo-fb
Contributor

@Alphacode18 NavyRequestScheduler (async), as opposed to OrderedThreadPoolScheduler, is activated if you provide non-zero values for navyMaxNumReads and navyMaxNumWrites. See this.
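The rule above, together with the earlier note about the debug-build assertion, can be turned into a quick sanity check on a cachebench config before running it. This is only a sketch: `check_navy_qdepth` is a hypothetical helper, it treats missing keys as 0 (the real defaults are not stated in this thread), and it assumes that both navyMaxNumReads and navyMaxNumWrites must be non-zero for navy-async to be active.

```python
def check_navy_qdepth(cache_config):
    """Return a list of warnings for a cachebench "cache_config" dict."""
    warnings = []
    # Assumption: a missing key behaves like 0 (real defaults may differ).
    qdepth = cache_config.get("navyQDepth", 0)
    max_reads = cache_config.get("navyMaxNumReads", 0)
    max_writes = cache_config.get("navyMaxNumWrites", 0)

    # Assumption: navy-async needs BOTH limits to be non-zero.
    async_enabled = max_reads > 0 and max_writes > 0

    if qdepth > 1 and not async_enabled:
        warnings.append(
            "navyQDepth > 1 without navyMaxNumReads/navyMaxNumWrites set: "
            "navy-async is not enabled, so IOs will not be issued concurrently"
        )
    return warnings


# The first config from this issue: qdepth overridden, async not enabled.
print(check_navy_qdepth({"navyQDepth": 8, "navyEnableIoUring": False}))
# The updated config: async limits set, no explicit qdepth.
print(check_navy_qdepth({"navyMaxNumReads": 16, "navyMaxNumWrites": 16}))
```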

@Alphacode18
Author

Oh I see! Thanks. Unfortunately, in submitIo, I still see numOutstanding_ oscillate between 0 and 1 (similar to the log above). Is that the correct place to log? If not, could you point me to the right place? I just want to verify that I am indeed seeing numOutstanding_ roughly similar to qDepth.

Here's my updated config:

{
  "cache_config" : {
    "cacheSizeMB" : 100,
    "numPools" : 1,
    "nvmCacheSizeMB": 40960,
    "nvmCachePaths": ["/mnt/nvme0n1/cachelib/testfile"],
    "navyEnableIoUring": false,
    "navyBlockSize": 4096,
    "navyMaxNumReads": 16,
    "navyMaxNumWrites": 16,
    "navyReaderThreads": 1,
    "navyWriterThreads": 1
  },
  "test_config" : {
    "enableLookaside": true,
    "numThreads" : 1,
    "numKeys" : 1000000,
    "numOps" : 5000000,
    "distribution" : "range",
    "generator": "workload",
    "keySizeRange" : [16, 16],
    "keySizeRangeProbability" : [1],
    "valSizeRange" : [4096, 4096],
    "valSizeRangeProbability" : [1],
    "setRatio" : 0.0,
    "delRatio" : 0.0,
    "loneGetRatio" : 0.0,
    "getRatio" : 1
  }
}

@jaesoo-fb
Contributor

@Alphacode18 The Navy configuration looks correct, but the stressor configuration does not: you are using only 1 thread (numThreads), so there will be at most one outstanding cachelib request at a time.
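The bound can be illustrated with a toy model. The sketch below is not cachebench code: it assumes each stressor thread issues one blocking request at a time, uses `time.sleep` as a stand-in for a device read, and the function name `peak_in_flight` is hypothetical. Under that assumption, the number of requests in flight can never exceed the number of threads.

```python
import threading
import time


def peak_in_flight(num_threads, ops_per_thread=20, io_seconds=0.001):
    """Run blocking 'reads' from num_threads workers; return the peak number in flight."""
    lock = threading.Lock()
    state = {"in_flight": 0, "peak": 0}

    def worker():
        for _ in range(ops_per_thread):
            with lock:
                state["in_flight"] += 1
                state["peak"] = max(state["peak"], state["in_flight"])
            time.sleep(io_seconds)  # stand-in for a blocking device read
            with lock:
                state["in_flight"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


print(peak_in_flight(1))  # a single blocking thread never has more than one request in flight
print(peak_in_flight(8))  # at most 8; the exact value depends on scheduling
```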

@Alphacode18
Author

Alphacode18 commented Apr 5, 2024

Oh, I see. In this case, may I ask three questions:

  • How do I set qDepth independently of the number of threads (and have cachelib enforce that, i.e., maintain that many requests in flight per thread)?
  • I now observe that qDepth is always upper-bounded by numThreads (i.e., I can't have I/O depth 8 if numThreads = 1). How can I get rid of this upper bound?
  • Also, if I launch 8 threads with my configuration, how can I explicitly assign them to particular CPU cores?
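On the last question, this thread does not say whether cachebench exposes a core-pinning option. As a generic, Linux-only illustration of the mechanism (not cachelib code), the sketch below pins each worker thread to one CPU with `os.sched_setaffinity`; the helper name `run_pinned` and the round-robin assignment are assumptions made for the example. A C++ program would typically use `pthread_setaffinity_np` for the same purpose, and `taskset` can restrict a whole process from the outside.

```python
import os
import threading


def run_pinned(num_threads):
    """Start num_threads workers, pin each to one CPU, return the affinity each saw."""
    allowed = sorted(os.sched_getaffinity(0))  # CPUs this process may run on
    seen = [None] * num_threads

    def worker(idx):
        cpu = allowed[idx % len(allowed)]  # round-robin over the allowed CPUs
        os.sched_setaffinity(0, {cpu})     # on Linux, 0 refers to the calling thread
        seen[idx] = os.sched_getaffinity(0)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


for idx, cpus in enumerate(run_pinned(4)):
    print(f"worker {idx} -> CPUs {sorted(cpus)}")
```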

Thank you for all your help!
