Enable FDP for CacheBench #302

Open
jmhands opened this issue Apr 2, 2024 · 21 comments

@jmhands

jmhands commented Apr 2, 2024

Reading the commit notes from
009e89b

I tried to enable FDP by adding

    "enableFDP": true,
    "navyEnableIoUring": true,
    "navyQDepth": 1,

to the config below. The Samsung docs I was following say to add `"devicePlacement": true` instead. Which one is correct?

but when I run CacheBench I see

I0402 22:15:30.163209 35634 Cache-inl.h:240]   "navyConfig::enableFDP": "0", 

and it fails at

F0402 22:15:30.316447 35634 Device.cpp:761] Check failed: !useIoUring_ && !(fdpNvmeVec_.size() > 0)
*** Aborted at 1712096130 (Unix time, try 'date -d @1712096130') ***
*** Signal 6 (SIGABRT) (0x8b32) received by PID 35634 (pthread TID 0x7aef680424c0) (linux TID 35634) (maybe from PID 35634, UID 0) (code: -6), stack trace: ***
    @ 0000000000d3bd3e folly::symbolizer::(anonymous namespace)::innerSignalHandler(int, siginfo_t*, void*)
                       /home/jm/CacheLib/cachelib/external/folly/folly/experimental/symbolizer/SignalHandler.cpp:449
    @ 0000000000d3be24 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
                       /home/jm/CacheLib/cachelib/external/folly/folly/experimental/symbolizer/SignalHandler.cpp:470
    @ 000000000004251f (unknown)
    @ 00000000000969fc pthread_kill
    @ 0000000000042475 raise
    @ 00000000000287f2 abort
{
    "cache_config": {
      "cacheSizeMB": 20000,
      "cacheDir": "/tmp/cachelib_metadata",
      "allocFactor": 1.08,
      "maxAllocSize": 524288,
      "minAllocSize": 64,
      "navyReaderThreads": 24,
      "navyWriterThreads": 12,
      "nvmCachePaths": ["/dev/ng0n1"],
      "nvmCacheSizeMB": 2666496,
      "writeAmpDeviceList": ["nvme0n1"],
      "navyBigHashBucketSize": 4096,
      "navyBigHashSizePct": 0,
      "navySmallItemMaxSize": 640,
      "navySegmentedFifoSegmentRatio": [1.0],
      "navyHitsReinsertionThreshold": 1,
      "navyBlockSize": 4096,
      "nvmAdmissionRetentionTimeThreshold": 7200,
      "navyParcelMemoryMB": 6048,
      "enableChainedItem": true,
      "htBucketPower": 29,
      "moveOnSlabRelease": false,
      "poolRebalanceIntervalSec": 2,
      "poolResizeIntervalSec": 2,
      "rebalanceStrategy": "hits"
    },
    "test_config": {
      "opRatePerSec": 1000000,
      "opRateBurstSize": 200,
      "enableLookaside": false,
      "generator": "replay",
      "replayGeneratorConfig": {
        "ampFactor": 200
      },
      "repeatTraceReplay": true,
      "repeatOpCount": true,
      "onlySetIfMiss": false,
      "numOps": 100000000000,
      "numThreads": 10,
      "prepopulateCache": true,
      "traceFileNames": [
        "kvcache_traces_1.csv",
        "kvcache_traces_2.csv",
        "kvcache_traces_3.csv",
        "kvcache_traces_4.csv",
        "kvcache_traces_5.csv"
      ]
    }
  }
@jaesoo-fb
Contributor

Hi @jmhands

For FDP, you still need to provide the nvme blkdev path for nvmCachePaths in the standard format like /dev/nvmeXnY[pZ]. The char device name is derived from the blkdev name as can be seen here

If this still does not fix the issue, please upload the full log files. Thanks.
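A rough illustration of that name derivation (assuming the usual /dev/nvmeXnY block device and /dev/ngXnY generic char device naming; this helper is hypothetical, for illustration only, not CacheLib's actual code):

```shell
# Maps an NVMe block device path to its generic char device, e.g.
# /dev/nvme0n1 -> /dev/ng0n1. Any partition suffix is dropped, since
# the char device is per-namespace.
to_chardev() {
  echo "$1" | sed -E 's#^/dev/nvme([0-9]+)n([0-9]+).*#/dev/ng\1n\2#'
}

to_chardev /dev/nvme0n1    # prints: /dev/ng0n1
to_chardev /dev/nvme1n2p3  # prints: /dev/ng1n2
```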

@jmhands
Author

jmhands commented Apr 3, 2024

> Hi @jmhands
>
> For FDP, you still need to provide the nvme blkdev path for nvmCachePaths in the standard format like /dev/nvmeXnY[pZ]. The char device name is derived from the blkdev name as can be seen here
>
> If this still does not fix the issue, please upload the full log files. Thanks.

Yes, that was just the example path; I edited it to /dev/nvme0n1, which is the drive with FDP enabled, as I can verify with nvme id-ctrl /dev/nvme0n1.
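(For reference, a minimal sketch of that capability check, assuming nvme-cli is installed and that FDP support is advertised in bit 19 of the CTRATT field of Identify Controller; the helper below is hypothetical:)

```shell
# fdp_supported takes a ctratt value as printed by `nvme id-ctrl`
# (e.g. 0x80000) and reports whether the FDP bit (bit 19) is set.
fdp_supported() {
  local ctratt=$(( $1 ))
  if (( ctratt & (1 << 19) )); then
    echo "FDP supported"
  else
    echo "no FDP"
  fi
}

# On real hardware you would feed it the reported value, e.g.:
#   nvme id-ctrl /dev/nvme0n1 | awk -F: '/^ctratt/ {print $2}'
fdp_supported 0x80000  # prints: FDP supported
fdp_supported 0x0      # prints: no FDP
```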

here is the log

I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::enableFDP": "0",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::fileName": "/dev/nvme0n1",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::fileSize": "998579896320",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::ioEngine": "io_uring",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::maxConcurrentInserts": "1000000",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::maxNumReads": "0",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::maxNumWrites": "0",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::maxParcelMemoryMB": "6048",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::maxWriteRate": "0",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::navyReqOrderingShards": "21",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::raidPaths": "",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::readerThreads": "72",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::stackSize": "16384",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::truncateFile": "false",
I0403 18:25:40.640346 39787 Cache-inl.h:240]   "navyConfig::writerThreads": "36"
I0403 18:25:40.640346 39787 Cache-inl.h:240] }
I0403 18:25:40.640600 39787 Cache-inl.h:279] Failed to attach for reason: Unable to find any segment with name shm_info
E0403 18:25:41.550093 39787 NvmCacheState.cpp:135] unable to deserialize nvm metadata file: no content in file: /root/cachelib_metadata/NvmCacheState
I0403 18:25:41.554360 39787 Device.cpp:1080] Cache file: /dev/nvme0n1 size: 998579896320 truncate: 0
I0403 18:25:41.554422 39787 Device.cpp:965] Created device with num_devices 1 size 998579896320 block_size 4096,stripe_size 0 max_write_size 1048576 max_io_size 1048576 io_engine io_uring qdepth 1,num_fdp_devices 0
I0403 18:25:41.617358 39787 NavySetup.cpp:243] metadataSize: 4992897024
I0403 18:25:41.617374 39787 NavySetup.cpp:245] Setting up engine pair 0
I0403 18:25:41.617384 39787 NavySetup.cpp:111] bighashStartingLimit: 4992897024 bigHashCacheOffset: 958636703744 bigHashCacheSize: 39943192576
I0403 18:25:41.617389 39787 NavySetup.cpp:259] blockCacheSize 953643806720

and I should have FDP enabled here

    "enableFDP": true,
    "navyEnableIoUring": true,
    "navyQDepth": 1,

@jaesoo-fb
Contributor

@jmhands Actually, the name of the config is deviceEnableFDP

@jmhands
Author

jmhands commented Apr 3, 2024

> @jmhands Actually, the name of the config is deviceEnableFDP

Getting closer! That worked for enabling FDP, but I'm getting an abort after the 30 seconds of runtime I specified.

===JSON Config===
{
  "cache_config":
  {
    "cacheSizeMB": 43000,
    "cacheDir": "/root/cachelib_metadata",
    "allocFactor": 1.08,
    "maxAllocSize": 524288,
    "minAllocSize": 64,
    "navyReaderThreads": 72,
    "navyWriterThreads": 36,
    "nvmCachePaths": ["/dev/nvme0n1"],
    "nvmCacheSizeMB" : 952320,
    "writeAmpDeviceList": ["nvme0n1"],
    "navyBigHashBucketSize": 4096,
    "navyBigHashSizePct": 4,
    "navySmallItemMaxSize": 640,
    "navySegmentedFifoSegmentRatio": [1.0],
    "navyHitsReinsertionThreshold": 1,
    "navyBlockSize": 4096,
    "nvmAdmissionRetentionTimeThreshold": 7200,
    "navyParcelMemoryMB": 6048,
    "enableChainedItem": true,
    "deviceEnableFDP": true,
    "navyEnableIoUring": true,
    "navyQDepth": 1,
    "htBucketPower": 29,
    "moveOnSlabRelease": false,
    "poolRebalanceIntervalSec": 2,
    "poolResizeIntervalSec": 2,
    "rebalanceStrategy": "hits"
  },
  "test_config":
  {
    "opRatePerSec": 550000,
    "opRateBurstSize": 200,
    "enableLookaside": false,
    "generator": "replay",
    "replayGeneratorConfig":
    {
        "ampFactor": 100
    },
    "repeatTraceReplay": true,
    "repeatOpCount" : true,
    "onlySetIfMiss" : false,
    "numOps": 100000000000,
    "numThreads": 10,
    "prepopulateCache": true,
    "traceFileNames": [
	    "kvcache_traces_1.csv",
	    "kvcache_traces_2.csv",
	    "kvcache_traces_3.csv",
	    "kvcache_traces_4.csv",
	    "kvcache_traces_5.csv"
    ]
  }
}

Welcome to OSS version of cachebench
I0403 20:17:21.918377 41022 KVReplayGenerator.h:106] Started KVReplayGenerator (amp factor 100, # of stressor threads 10)
I0403 20:17:21.918377 41023 ReplayGeneratorBase.h:218] [0] Opened trace file kvcache_traces_1.csv
I0403 20:17:21.918590 41023 ReplayGeneratorBase.h:179] New header detected: header "op_time,key,key_size,op,op_count,size,cache_hits,ttl" field map key -> 1, op -> 3, size -> 5, op_count -> 4, key_size -> 2, ttl -> 7, op_time -> 0, cache_hits -> 6
E0403 20:17:22.163842 41022 Cache-inl.h:27] Exception fetching nand writes for nvme0n1. Msg: Vendor not recogized in device model number fadu echo e1.s 3.84tb
I0403 20:17:22.164012 41022 Cache-inl.h:151] Configuring NVM cache: simple file /dev/nvme0n1 size 952320 MB
I0403 20:17:22.164229 41022 Cache-inl.h:240] Using the following nvm config{
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::QDepth": "1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionPolicy": "",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionProbBaseSize": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionProbFactorLowerBound": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionProbFactorUpperBound": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionProbability": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionSuffixLen": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::admissionWriteRate": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::bigHashBucketBfSize": "8",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::bigHashBucketSize": "4096",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::bigHashSizePct": "4",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::bigHashSmallItemMaxSize": "640",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheCleanRegionThreads": "1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheCleanRegions": "1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheDataChecksum": "true",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheLru": "false",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheNumInMemBuffers": "2",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheRegionSize": "16777216",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheReinsertionHitsThreshold": "1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheReinsertionPctThreshold": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockCacheSegmentedFifoSegmentRatio": "",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::blockSize": "4096",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::deviceMaxWriteSize": "1048576",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::deviceMetadataSize": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::enableFDP": "1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::fileName": "/dev/nvme0n1",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::fileSize": "998579896320",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::ioEngine": "io_uring",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::maxConcurrentInserts": "1000000",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::maxNumReads": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::maxNumWrites": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::maxParcelMemoryMB": "6048",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::maxWriteRate": "0",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::navyReqOrderingShards": "21",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::raidPaths": "",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::readerThreads": "72",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::stackSize": "16384",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::truncateFile": "false",
I0403 20:17:22.164229 41022 Cache-inl.h:240]   "navyConfig::writerThreads": "36"
I0403 20:17:22.164229 41022 Cache-inl.h:240] }
I0403 20:17:22.164481 41022 Cache-inl.h:279] Failed to attach for reason: Unable to find any segment with name shm_info
E0403 20:17:23.076967 41022 NvmCacheState.cpp:135] unable to deserialize nvm metadata file: no content in file: /root/cachelib_metadata/NvmCacheState
I0403 20:17:23.081306 41022 Device.cpp:1080] Cache file: /dev/nvme0n1 size: 998579896320 truncate: 0
I0403 20:17:23.081369 41022 Device.cpp:965] Created device with num_devices 1 size 998579896320 block_size 4096,stripe_size 0 max_write_size 1048576 max_io_size 1048576 io_engine io_uring qdepth 1,num_fdp_devices 0
I0403 20:17:23.144655 41022 NavySetup.cpp:243] metadataSize: 4992897024
I0403 20:17:23.144669 41022 NavySetup.cpp:245] Setting up engine pair 0
I0403 20:17:23.144679 41022 NavySetup.cpp:111] bighashStartingLimit: 4992897024 bigHashCacheOffset: 958636703744 bigHashCacheSize: 39943192576
I0403 20:17:23.144685 41022 NavySetup.cpp:259] blockCacheSize 953643806720
I0403 20:17:23.144690 41022 NavySetup.cpp:156] blockcache: starting offset: 4992897024, block cache size: 953633734656
I0403 20:17:23.144709 41022 FifoPolicy.cpp:37] FIFO policy
I0403 20:17:23.170003 41022 BigHash.cpp:93] BigHash created: buckets: 9751756, bucket size: 4096, base offset: 958636703744
I0403 20:17:23.170014 41022 BigHash.cpp:102] Reset BigHash
I0403 20:17:23.178391 41022 RegionManager.cpp:50] 56841 regions, 16777216 bytes each
I0403 20:17:23.190929 41138 RegionManager.cpp:68] region_manager_0 started
I0403 20:17:23.210001 41022 Allocator.cpp:39] Enable priority-based allocation for Allocator. Number of priorities: 1
I0403 20:17:23.210046 41022 BlockCache.cpp:145] Block cache created
I0403 20:17:23.210182 41022 Driver.cpp:70] Max concurrent inserts: 1000000
I0403 20:17:23.210189 41022 Driver.cpp:71] Max parcel memory: 6341787648
I0403 20:17:23.210196 41022 Driver.cpp:72] Use Write Estimated Size: false
I0403 20:17:23.210205 41022 Driver.cpp:209] Reset Navy
I0403 20:17:23.210222 41022 BigHash.cpp:102] Reset BigHash
I0403 20:17:23.214526 41022 BlockCache.cpp:705] Reset block cache
Total 1000000.00M ops to be run
E0403 20:17:23.504747 41143 Cache-inl.h:27] Exception fetching nand writes for nvme0n1. Msg: Vendor not recogized in device model number fadu echo e1.s 3.84tb
20:17:23       0.00M ops completed. Hit Ratio   0.00% (RAM   0.00%, NVM   0.00%)
I0403 20:17:53.281225 41141 main.cpp:92] Stopping due to timeout 30 seconds
E0403 20:17:53.534372 41022 Cache-inl.h:27] Exception fetching nand writes for nvme0n1. Msg: Vendor not recogized in device model number fadu echo e1.s 3.84tb
== Test Results ==
== Allocator Stats ==
Items in RAM  : 1,111,979
Items in NVM  : 0
Alloc Attempts: 1,529,370 Success: 100.00%
Evict Attempts: 0 Success: 0.00%
RAM Evictions : 0
Fraction of pool 0 used : 0.04
Cache Gets    : 14,895,979
Hit Ratio     :  12.06%
RAM Hit Ratio :  12.04%
NVM Hit Ratio :   0.02%
RAM eviction rejects expiry : 0
RAM eviction rejects clean : 0
NVM Read  Latency    p50      :       0.00 us
NVM Read  Latency    p90      :       0.00 us
NVM Read  Latency    p99      :       0.00 us
NVM Read  Latency    p999     :       0.00 us
NVM Read  Latency    p9999    :       0.00 us
NVM Read  Latency    p99999   :       0.00 us
NVM Read  Latency    p999999  :       0.00 us
NVM Read  Latency    p100     :       0.00 us
NVM Write Latency    p50      :       0.00 us
NVM Write Latency    p90      :       0.00 us
NVM Write Latency    p99      :       0.00 us
NVM Write Latency    p999     :       0.00 us
NVM Write Latency    p9999    :       0.00 us
NVM Write Latency    p99999   :       0.00 us
NVM Write Latency    p999999  :       0.00 us
NVM Write Latency    p100     :       0.00 us
NVM bytes written (physical)  :   0.00 GB
NVM bytes written (logical)   :   0.00 GB
NVM bytes written (nand)      :   0.00 GB
NVM app write amplification   :   0.00
NVM dev write amplification   :   0.00
NVM Gets      :      13,102,692, Coalesced :   0.00%
NVM Puts      :               0, Success   : 100.00%, Clean   :   0.00%, AbortsFromDel   :        0, AbortsFromGet   :        0
NVM Evicts    :               0, Clean     :   0.00%, Unclean :       0, Double          :        0
NVM Deletes   :       1,253,922 Skipped Deletes: 100.00%

== Throughput for  ==
Total Ops : 16.52 million
Total sets: 1,529,370
get       :   496,024/s, success   :  12.04%
couldExist:         0/s, success   :   0.00%
set       :    50,926/s, success   : 100.00%
del       :     3,204/s, found     :   2.70%

== KVReplayGenerator Stats ==
Total Processed Samples: 0.08 million (parse error: 0)

I0403 20:17:53.534984 41022 BigHash.cpp:514] Flush big hash
I0403 20:17:53.535011 41022 BlockCache.cpp:699] Flush block cache
I0403 20:17:53.535034 41022 BlockCache.cpp:793] Starting block cache persist
F0403 20:17:53.654948 41022 Device.cpp:761] Check failed: !useIoUring_ && !(fdpNvmeVec_.size() > 0)
*** Aborted at 1712175473 (Unix time, try 'date -d @1712175473') ***
*** Signal 6 (SIGABRT) (0xa03e) received by PID 41022 (pthread TID 0x7fc82f0424c0) (linux TID 41022) (maybe from PID 41022, UID 0) (code: -6), stack trace: ***
    @ 0000000000d3bd3e folly::symbolizer::(anonymous namespace)::innerSignalHandler(int, siginfo_t*, void*)
                       /home/jm/CacheLib/cachelib/external/folly/folly/experimental/symbolizer/SignalHandler.cpp:449
    @ 0000000000d3be24 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
                       /home/jm/CacheLib/cachelib/external/folly/folly/experimental/symbolizer/SignalHandler.cpp:470
    @ 000000000004251f (unknown)
    @ 00000000000969fc pthread_kill
    @ 0000000000042475 raise
    @ 00000000000287f2 abort
    @ 0000000000f3c23f folly::LogCategory::admitMessage(folly::LogMessage const&) const
                       /home/jm/CacheLib/cachelib/external/folly/folly/logging/LogCategory.cpp:71
    @ 0000000000f5ea4c folly::LogStreamProcessor::logNow()
                       /home/jm/CacheLib/cachelib/external/folly/folly/logging/LogStreamProcessor.cpp:190
    @ 0000000000f5ebbd folly::LogStreamVoidify<true>::operator&(std::ostream&)
                       /home/jm/CacheLib/cachelib/external/folly/folly/logging/LogStreamProcessor.cpp:222
    @ 00000000008aed77 facebook::cachelib::navy::(anonymous namespace)::AsyncIoContext::AsyncIoContext(std::unique_ptr<folly::AsyncBase, std::default_delete<folly::AsyncBase> >&&, unsigned long, folly::EventBase*, unsigned long, bool, std::vector<std::shared_ptr<facebook::cachelib::navy::FdpNvme>, std::allocator<std::shared_ptr<facebook::cachelib::navy::FdpNvme> > >)
                       /home/jm/CacheLib/cachelib/navy/common/Device.cpp:761
    @ 00000000008b3c08 facebook::cachelib::navy::(anonymous namespace)::FileDevice::getIoContext()
                       /home/jm/CacheLib/cachelib/navy/common/Device.cpp:1049
    @ 00000000008b30f2 facebook::cachelib::navy::(anonymous namespace)::FileDevice::writeImpl(unsigned long, unsigned int, void const*, int)
                       /home/jm/CacheLib/cachelib/navy/common/Device.cpp:986
    @ 00000000008aa3cd facebook::cachelib::navy::Device::writeInternal(unsigned long, unsigned char const*, unsigned long, int)
                       /home/jm/CacheLib/cachelib/navy/common/Device.cpp:424
    @ 00000000008a9c6c facebook::cachelib::navy::Device::write(unsigned long, facebook::cachelib::navy::Buffer, int)
                       /home/jm/CacheLib/cachelib/navy/common/Device.cpp:408
    @ 00000000009d478a facebook::cachelib::navy::(anonymous namespace)::DeviceMetaDataWriter::writeRecord(std::unique_ptr<folly::IOBuf, std::default_delete<folly::IOBuf> >)::{lambda()#1}::operator()()
                       /home/jm/CacheLib/cachelib/navy/serialization/RecordIO.cpp:114
    @ 00000000009d49e2 facebook::cachelib::navy::(anonymous namespace)::DeviceMetaDataWriter::writeRecord(std::unique_ptr<folly::IOBuf, std::default_delete<folly::IOBuf> >)
                       /home/jm/CacheLib/cachelib/navy/serialization/RecordIO.cpp:126
    @ 00000000009ad7a0 void facebook::cachelib::serializeProto<facebook::cachelib::navy::serialization::RegionData, apache::thrift::Serializer<apache::thrift::BinaryProtocolReader, apache::thrift::BinaryProtocolWriter> >(facebook::cachelib::navy::serialization::RegionData const&, facebook::cachelib::RecordWriter&)
                       /home/jm/CacheLib/cachelib/../cachelib/common/Serialization.h:191
                       -> /home/jm/CacheLib/cachelib/navy/block_cache/RegionManager.cpp
    @ 00000000009ab896 void facebook::cachelib::navy::serializeProto<facebook::cachelib::navy::serialization::RegionData>(facebook::cachelib::navy::serialization::RegionData const&, facebook::cachelib::RecordWriter&)
                       /home/jm/CacheLib/cachelib/../cachelib/navy/serialization/Serialization.h:32
                       -> /home/jm/CacheLib/cachelib/navy/block_cache/RegionManager.cpp
    @ 00000000009a3dc0 facebook::cachelib::navy::RegionManager::persist(facebook::cachelib::RecordWriter&) const
                       /home/jm/CacheLib/cachelib/navy/block_cache/RegionManager.cpp:466
    @ 000000000097f3f2 facebook::cachelib::navy::BlockCache::persist(facebook::cachelib::RecordWriter&)
                       /home/jm/CacheLib/cachelib/navy/block_cache/BlockCache.cpp:801
    @ 00000000009ceda3 facebook::cachelib::navy::EnginePair::persist(facebook::cachelib::RecordWriter&) const
                       /home/jm/CacheLib/cachelib/navy/engine/EnginePair.cpp:255
    @ 00000000009c9741 facebook::cachelib::navy::Driver::persist() const
                       /home/jm/CacheLib/cachelib/navy/driver/Driver.cpp:223
    @ 00000000003cc5f8 facebook::cachelib::NvmCache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >::shutDown()
                       /home/jm/CacheLib/cachelib/../cachelib/allocator/nvmcache/NvmCache-inl.h:883
                       -> /home/jm/CacheLib/cachelib/allocator/CacheAllocator.cpp
    @ 00000000003161a9 facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait>::saveNvmCache()
                       /home/jm/CacheLib/cachelib/../cachelib/allocator/CacheAllocator-inl.h:3046
                       -> /home/jm/CacheLib/cachelib/allocator/CacheAllocator.cpp
    @ 00000000002fc2b3 facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait>::shutDown()
                       /home/jm/CacheLib/cachelib/../cachelib/allocator/CacheAllocator-inl.h:3007
                       -> /home/jm/CacheLib/cachelib/allocator/CacheAllocator.cpp
    @ 00000000001d7e1e facebook::cachelib::cachebench::Cache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >::~Cache()
                       /home/jm/CacheLib/cachelib/../cachelib/cachebench/cache/Cache-inl.h:319
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Stressor.cpp
    @ 00000000001c039d std::default_delete<facebook::cachelib::cachebench::Cache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> > >::operator()(facebook::cachelib::cachebench::Cache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >*) const
                       /usr/include/c++/11/bits/unique_ptr.h:85
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Stressor.cpp
    @ 00000000001aff8d std::unique_ptr<facebook::cachelib::cachebench::Cache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >, std::default_delete<facebook::cachelib::cachebench::Cache<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> > > >::~unique_ptr()
                       /usr/include/c++/11/bits/unique_ptr.h:361
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Stressor.cpp
    @ 00000000001b3a41 facebook::cachelib::cachebench::CacheStressor<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >::~CacheStressor()
                       /home/jm/CacheLib/cachelib/../cachelib/cachebench/runner/CacheStressor.h:134
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Stressor.cpp
    @ 00000000001b3ac5 facebook::cachelib::cachebench::CacheStressor<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >::~CacheStressor()
                       /home/jm/CacheLib/cachelib/../cachelib/cachebench/runner/CacheStressor.h:134
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Stressor.cpp
    @ 0000000000158ff9 std::default_delete<facebook::cachelib::cachebench::Stressor>::operator()(facebook::cachelib::cachebench::Stressor*) const
                       /usr/include/c++/11/bits/unique_ptr.h:85
                       -> /home/jm/CacheLib/cachelib/cachebench/main.cpp
    @ 00000000001638ad std::__uniq_ptr_impl<facebook::cachelib::cachebench::Stressor, std::default_delete<facebook::cachelib::cachebench::Stressor> >::reset(facebook::cachelib::cachebench::Stressor*)
                       /usr/include/c++/11/bits/unique_ptr.h:182
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Runner.cpp
    @ 00000000001619ee std::unique_ptr<facebook::cachelib::cachebench::Stressor, std::default_delete<facebook::cachelib::cachebench::Stressor> >::reset(facebook::cachelib::cachebench::Stressor*)
                       /usr/include/c++/11/bits/unique_ptr.h:456
                       -> /home/jm/CacheLib/cachelib/cachebench/runner/Runner.cpp
    @ 000000000015a902 facebook::cachelib::cachebench::Runner::run(std::chrono::duration<long, std::ratio<1l, 1l> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
                       /home/jm/CacheLib/cachelib/cachebench/runner/Runner.cpp:54
    @ 000000000014f5ea main
                       /home/jm/CacheLib/cachelib/cachebench/main.cpp:159
    @ 0000000000029d8f (unknown)
    @ 0000000000029e3f __libc_start_main
    @ 000000000014e764 _start
Aborted

@arungeorge83

This is a case where the liburing library (which the FDP support currently depends on) is not available on the system, and the cachelib build system chose not to enable it for that reason.
Try installing liburing via `yum install liburing` or by building it from https://github.com/axboe/liburing.
Btw, which kernel version are you using? The io_uring passthru support needs at least kernel 6.1.x.
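A quick sketch of both checks; the 6.1 minimum comes from the comment above, while the install prefix and build steps are assumptions based on liburing's standard autotools-style layout:

```shell
# kernel_ok: true if version string $1 sorts at or above 6.1.
# sort -V handles distro suffixes like "-26-generic".
kernel_ok() {
  printf '%s\n6.1\n' "$1" | sort -V | head -n1 | grep -qx '6.1'
}

kernel_ok "$(uname -r)" && echo "kernel new enough" || echo "kernel too old"

# Building liburing from source when no suitable package exists
# (assumed prefix /usr; needs git, make, and a C compiler):
#   git clone https://github.com/axboe/liburing.git
#   cd liburing && ./configure --prefix=/usr
#   make -j"$(nproc)" && sudo make install
```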

@jmhands
Author

jmhands commented Apr 5, 2024

I'm using Ubuntu 22.04.4 LTS with the HWE kernel, 6.5.0-26-generic. io_uring works fine in fio, etc.
Installing liburing via sudo apt install liburing-dev or sudo apt install liburing2 doesn't help.

@jaesoo-fb
Contributor

@jmhands sudo apt-get install liburing-dev should have enabled io_uring, and CACHELIB_IOURING_DISABLE should not be defined here. FYI, this is the cmake rule that finds the required io_uring libraries and headers.

Please take a look at the cmake output for cachelib for any issues.

@jmhands
Author

jmhands commented Apr 5, 2024

After installing liburing-dev and running ./contrib/build.sh -d -j -v, the build fails while compiling folly:

      |                      IORING_SETUP_CQSIZE
make[2]: *** [CMakeFiles/folly_base.dir/build.make:1714: CMakeFiles/folly_base.dir/folly/experimental/io/IoUring.cpp.o] Error 1
make[2]: *** [CMakeFiles/folly_base.dir/build.make:1770: CMakeFiles/folly_base.dir/folly/experimental/io/IoUringProvidedBufferRing.cpp.o] Error 1
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp: In member function ‘void folly::IoUringBackend::initSubmissionLinked()’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1103:20: error: ISO C++ forbids declaration of ‘IoUringProvidedBufferRing’ with no type [-fpermissive]
 1103 |     } catch (const IoUringProvidedBufferRing::LibUringCallError& ex) {
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1103:45: error: expected ‘)’ before ‘::’ token
 1103 |     } catch (const IoUringProvidedBufferRing::LibUringCallError& ex) {
      |             ~                               ^~
      |                                             )
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1103:45: error: expected ‘{’ before ‘::’ token
 1103 |     } catch (const IoUringProvidedBufferRing::LibUringCallError& ex) {
      |                                             ^~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1103:47: error: ‘::LibUringCallError’ has not been declared
 1103 |     } catch (const IoUringProvidedBufferRing::LibUringCallError& ex) {
      |                                               ^~~~~~~~~~~~~~~~~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1103:66: error: ‘ex’ was not declared in this scope; did you mean ‘exp’?
 1103 |     } catch (const IoUringProvidedBufferRing::LibUringCallError& ex) {
      |                                                                  ^~
      |                                                                  exp
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp: In member function ‘void folly::IoUringBackend::cancel(folly::IoSqeBase*)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1396:5: error: ‘::io_uring_prep_cancel64’ has not been declared; did you mean ‘io_uring_prep_cancel’?
 1396 |   ::io_uring_prep_cancel64(sqe, (uint64_t)ioSqe, 0);
      |     ^~~~~~~~~~~~~~~~~~~~~~
      |     io_uring_prep_cancel
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp: In member function ‘int folly::IoUringBackend::submitBusyCheck(int, folly::IoUringBackend::WaitForEventsMode)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:1617:19: error: ‘::io_uring_submit_and_wait_timeout’ has not been declared; did you mean ‘io_uring_submit_and_wait’?
 1617 |           res = ::io_uring_submit_and_wait_timeout(
      |                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                   io_uring_submit_and_wait
In file included from /home/jm/CacheLib/cachelib/external/folly/folly/GLog.h:24,
                 from /home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:21:
/home/jm/CacheLib/opt/cachelib/include/glog/logging.h: In instantiation of ‘void google::MakeCheckOpValueString(std::ostream*, const T&) [with T = void(void*); std::ostream = std::basic_ostream<char>]’:
/home/jm/CacheLib/opt/cachelib/include/glog/logging.h:786:25:   required from ‘std::string* google::MakeCheckOpString(const T1&, const T2&, const char*) [with T1 = void (*)(void*); T2 = void(void*); std::string = std::__cxx11::basic_string<char>]’
/home/jm/CacheLib/opt/cachelib/include/glog/logging.h:809:1:   required from ‘std::string* google::Check_EQImpl(const T1&, const T2&, const char*) [with T1 = void (*)(void*); T2 = void(void*); std::string = std::__cxx11::basic_string<char>]’
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:805:5:   required from here
/home/jm/CacheLib/opt/cachelib/include/glog/logging.h:723:9: warning: the compiler can assume that the address of ‘v’ will never be NULL [-Waddress]
  723 |   (*os) << v;
      |   ~~~~~~^~~~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp: In member function ‘virtual void folly::AsyncIoUringSocket::ReadSqe::processSubmit(io_uring_sqe*)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp:714:26: error: ‘IORING_RECV_MULTISHOT’ was not declared in this scope
  714 |           ioprio_flags = IORING_RECV_MULTISHOT;
      |                          ^~~~~~~~~~~~~~~~~~~~~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp: In member function ‘void folly::{anonymous}::SignalRegistry::notify(int)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/IoUringBackend.cpp:112:14: warning: ignoring return value of ‘ssize_t write(int, const void*, size_t)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  112 |       ::write(fd, &sigNum, 1);
      |       ~~~~~~~^~~~~~~~~~~~~~~~
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp: In member function ‘virtual void folly::AsyncIoUringSocket::WriteSqe::processSubmit(io_uring_sqe*)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp:872:7: error: ‘::io_uring_prep_sendmsg_zc’ has not been declared; did you mean ‘io_uring_prep_sendmsg’?
  872 |     ::io_uring_prep_sendmsg_zc(
      |       ^~~~~~~~~~~~~~~~~~~~~~~~
      |       io_uring_prep_sendmsg
make[2]: *** [CMakeFiles/folly_base.dir/build.make:1728: CMakeFiles/folly_base.dir/folly/experimental/io/IoUringBackend.cpp.o] Error 1
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp: In lambda function:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp:1277:17: error: ‘IORING_CQE_F_NOTIF’ was not declared in this scope; did you mean ‘IORING_CQE_F_MORE’?
 1277 |     if (flags & IORING_CQE_F_NOTIF) {
      |                 ^~~~~~~~~~~~~~~~~~
      |                 IORING_CQE_F_MORE
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp: In member function ‘virtual void folly::AsyncIoUringSocket::WriteSqe::callbackCancelled(const io_uring_cqe*)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp:1296:38: error: ‘IORING_CQE_F_NOTIF’ was not declared in this scope; did you mean ‘IORING_CQE_F_MORE’?
 1296 |           << " notif=" << !!(flags & IORING_CQE_F_NOTIF);
      |                                      ^~~~~~~~~~~~~~~~~~
      |                                      IORING_CQE_F_MORE
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp: In member function ‘virtual void folly::AsyncIoUringSocket::WriteSqe::callback(const io_uring_cqe*)’:
/home/jm/CacheLib/cachelib/external/folly/folly/experimental/io/AsyncIoUringSocket.cpp:1313:38: error: ‘IORING_CQE_F_NOTIF’ was not declared in this scope; did you mean ‘IORING_CQE_F_MORE’?
 1313 |           << " notif=" << !!(flags & IORING_CQE_F_NOTIF)
      |                                      ^~~~~~~~~~~~~~~~~~
      |                                      IORING_CQE_F_MORE
make[2]: *** [CMakeFiles/folly_base.dir/build.make:1644: CMakeFiles/folly_base.dir/folly/experimental/io/AsyncIoUringSocket.cpp.o] Error 1
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
make[1]: *** [CMakeFiles/Makefile2:145: CMakeFiles/folly_base.dir/all] Error 2
make[1]: Leaving directory '/home/jm/CacheLib/build-folly'
make: *** [Makefile:136: all] Error 2
build-package.sh: error: make failed
build.sh: error: failed to build dependency 'folly'

@arungeorge83

Building and installing liburing from source appears to work.

The following method works for FDP:

git clone https://github.com/axboe/liburing.git
-- Follow the build process in the README: configure, make, make install

git clone https://github.com/facebook/CacheLib.git
git checkout tags/v20240320_stable
-- Build, e.g.: sudo ./contrib/build.sh -j -v -d

@jmhands
Author

jmhands commented Apr 8, 2024

After building liburing from source, the build still fails in folly. I can get it further by linking with sudo ln -sf /usr/lib/x86_64-linux-gnu/liburing.so.2 /usr/lib/x86_64-linux-gnu/liburing.so, but something is still off:

jm@z690ace:~/liburing$ sudo make install
sed -e "s%@prefix@%/usr%g" \
    -e "s%@libdir@%/usr/lib%g" \
    -e "s%@includedir@%/usr/include%g" \
    -e "s%@NAME@%liburing%g" \
    -e "s%@VERSION@%2.6%g" \
    liburing.pc.in >liburing.pc
sed -e "s%@prefix@%/usr%g" \
    -e "s%@libdir@%/usr/lib%g" \
    -e "s%@includedir@%/usr/include%g" \
    -e "s%@NAME@%liburing%g" \
    -e "s%@VERSION@%2.6%g" \
    liburing-ffi.pc.in >liburing-ffi.pc
make[1]: Entering directory '/home/jm/liburing/src'
install -D -m 644 include/liburing/io_uring.h /usr/include/liburing/io_uring.h
install -D -m 644 include/liburing.h /usr/include/liburing.h
install -D -m 644 include/liburing/compat.h /usr/include/liburing/compat.h
install -D -m 644 include/liburing/barrier.h /usr/include/liburing/barrier.h
install -D -m 644 include/liburing/io_uring_version.h /usr/include/liburing/io_uring_version.h
install -D -m 644 liburing.a /usr/lib/liburing.a
install -D -m 644 liburing-ffi.a /usr/lib/liburing-ffi.a
install -D -m 755 liburing.so.2.6 /usr/lib/liburing.so.2.6
install -D -m 755 liburing-ffi.so.2.6 /usr/lib/liburing-ffi.so.2.6
ln -sf liburing.so.2.6 /usr/lib/liburing.so.2
ln -sf liburing.so.2.6 /usr/lib/liburing.so
ln -sf liburing-ffi.so.2.6 /usr/lib/liburing-ffi.so.2
ln -sf liburing-ffi.so.2.6 /usr/lib/liburing-ffi.so
make[1]: Leaving directory '/home/jm/liburing/src'
install -D -m 644 liburing.pc /usr/lib/pkgconfig/liburing.pc
install -D -m 644 liburing-ffi.pc /usr/lib/pkgconfig/liburing-ffi.pc
install -m 755 -d /usr/man/man2
install -m 644 man/*.2 /usr/man/man2
install -m 755 -d /usr/man/man3
install -m 644 man/*.3 /usr/man/man3
install -m 755 -d /usr/man/man7
install -m 644 man/*.7 /usr/man/man7
jm@z690ace:~/liburing$ locate liburing.so
/home/jm/liburing/src/liburing.so.2.6
/snap/lxd/27037/lib/liburing.so
/snap/lxd/27037/lib/liburing.so.2
/snap/lxd/27037/lib/liburing.so.2.5
/snap/lxd/27948/lib/liburing.so
/snap/lxd/27948/lib/liburing.so.2
/snap/lxd/27948/lib/liburing.so.2.5
/usr/lib/liburing.so
/usr/lib/liburing.so.2
/usr/lib/liburing.so.2.6
/usr/lib/x86_64-linux-gnu/liburing.so.2
/usr/lib/x86_64-linux-gnu/liburing.so.2.1.0



jm@z690ace:~/CacheLib$ sudo ./contrib/build.sh -j -v -d

fails here
...
[ 94%] Built target folly_base
make  -f CMakeFiles/folly.dir/build.make CMakeFiles/folly.dir/depend
make[2]: Entering directory '/home/jm/CacheLib/build-folly'
cd /home/jm/CacheLib/build-folly && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/jm/CacheLib/cachelib/external/folly /home/jm/CacheLib/cachelib/external/folly /home/jm/CacheLib/build-folly /home/jm/CacheLib/build-folly /home/jm/CacheLib/build-folly/CMakeFiles/folly.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
make  -f CMakeFiles/folly.dir/build.make CMakeFiles/folly.dir/build
make[2]: Entering directory '/home/jm/CacheLib/build-folly'
make[2]: *** No rule to make target '/usr/lib/x86_64-linux-gnu/liburing.so', needed by 'libfolly.so.0.58.0-dev'.  Stop.
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
make[1]: *** [CMakeFiles/Makefile2:171: CMakeFiles/folly.dir/all] Error 2
make[1]: Leaving directory '/home/jm/CacheLib/build-folly'
make: *** [Makefile:136: all] Error 2
build-package.sh: error: make failed
build.sh: error: failed to build dependency 'folly'

after linking

[ 97%] Built target follybenchmark
cd /home/jm/CacheLib/build-folly/folly/experimental/exception_tracer && /usr/bin/cmake -E cmake_symlink_library libfolly_exception_tracer.so.0.58.0-dev libfolly_exception_tracer.so.0.58.0-dev libfolly_exception_tracer.so
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_unregister_buf_ring'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_register_buf_ring'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_get_events'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_register_ring_fd'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_setup'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_register'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_submit_and_get_events'
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
/usr/bin/ld: ../../../libfolly.so.0.58.0-dev: undefined reference to `io_uring_submit_and_wait_timeout'
[ 98%] Built target folly_exception_tracer
make  -f folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/build.make folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/depend
make[2]: Entering directory '/home/jm/CacheLib/build-folly'
cd /home/jm/CacheLib/build-folly && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/jm/CacheLib/cachelib/external/folly /home/jm/CacheLib/cachelib/external/folly/folly/experimental/exception_tracer /home/jm/CacheLib/build-folly /home/jm/CacheLib/build-folly/folly/experimental/exception_tracer /home/jm/CacheLib/build-folly/folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/DependInfo.cmake --color=
collect2: error: ld returned 1 exit status
make[2]: *** [folly/logging/example/CMakeFiles/logging_example.dir/build.make:125: folly/logging/example/logging_example] Error 1
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
make[1]: *** [CMakeFiles/Makefile2:331: folly/logging/example/CMakeFiles/logging_example.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
Dependencies file "folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/ExceptionCounterLib.cpp.o.d" is newer than depends file "/home/jm/CacheLib/build-folly/folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/compiler_depend.internal".
Consolidate compiler generated dependencies of target folly_exception_counter
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
make  -f folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/build.make folly/experimental/exception_tracer/CMakeFiles/folly_exception_counter.dir/build
make[2]: Entering directory '/home/jm/CacheLib/build-folly'
[ 98%] Linking CXX shared library libfolly_exception_counter.so
cd /home/jm/CacheLib/build-folly/folly/experimental/exception_tracer && /usr/bin/cmake -E cmake_link_script CMakeFiles/folly_exception_counter.dir/link.txt --verbose=YES
/usr/bin/c++ -fPIC -g -g -Wall -Wextra -shared -Wl,-soname,libfolly_exception_counter.so.0.58.0-dev -o libfolly_exception_counter.so.0.58.0-dev CMakeFiles/folly_exception_counter.dir/ExceptionCounterLib.cpp.o  -Wl,-rpath,/home/jm/CacheLib/build-folly/folly/experimental/exception_tracer:/home/jm/CacheLib/build-folly:/home/jm/CacheLib/opt/cachelib/lib: libfolly_exception_tracer.so.0.58.0-dev libfolly_exception_tracer_base.so.0.58.0-dev ../../../libfolly.so.0.58.0-dev /home/jm/CacheLib/opt/cachelib/lib/libfmtd.so.10.2.1 /usr/lib/x86_64-linux-gnu/libboost_context.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_filesystem.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_program_options.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_regex.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_system.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.74.0 /usr/lib/x86_64-linux-gnu/libboost_atomic.so.1.74.0 -ldouble-conversion /home/jm/CacheLib/opt/cachelib/lib/libgflags_debug.so.2.2.2 /home/jm/CacheLib/opt/cachelib/lib/libglogd.so -levent -lz -lssl -lcrypto -lbz2 -llzma -llz4 /home/jm/CacheLib/opt/cachelib/lib/libzstd.so -lsnappy -ldwarf -Wl,-Bstatic -liberty -Wl,-Bdynamic -laio -luring -lsodium -ldl -lunwind
cd /home/jm/CacheLib/build-folly/folly/experimental/exception_tracer && /usr/bin/cmake -E cmake_symlink_library libfolly_exception_counter.so.0.58.0-dev libfolly_exception_counter.so.0.58.0-dev libfolly_exception_counter.so
make[2]: Leaving directory '/home/jm/CacheLib/build-folly'
[ 99%] Built target folly_exception_counter
make[1]: Leaving directory '/home/jm/CacheLib/build-folly'
make: *** [Makefile:136: all] Error 2
build-package.sh: error: make failed
build.sh: error: failed to build dependency 'folly'

@jaesoo-fb
Contributor

@jmhands Is this error occurring even with a clean build after removing the liburing-dev package?

Yeah, flags like IORING_CQE_F_NOTIF and IORING_RECV_MULTISHOT seem to have been added in liburing 2.3 and kernel v6.0. I'm not sure what version of liburing-dev is installed by default on Ubuntu 22.04.4 LTS, but it would make sense to build/install from source.

Doesn't liburing have a Debian build as well? https://github.com/axboe/liburing/blob/master/make-debs.sh

@jmhands
Author

jmhands commented Apr 17, 2024

I was able to get it working with the following steps:

  1. clean Ubuntu 22.04.4 install with HWE kernel 6.5
  2. sudo apt remove liburing2
  3. sudo apt install build-essential
  4. build and install liburing from source:
     git clone https://github.com/axboe/liburing.git
     cd liburing
     ./configure --cc=gcc --cxx=g++
     make -j$(nproc)
     sudo make install
  5. build CacheLib:
     git clone https://github.com/facebook/CacheLib
     cd CacheLib
     ./contrib/build.sh -d -j -v

this builds correctly but then I get an error when I run cachebench

bin/cachebench: /lib/x86_64-linux-gnu/liburing.so.2: version `LIBURING_2.2' not found (required by /home/jm/CacheLib/opt/cachelib/bin/../lib/libfolly.so.0.58.0-dev)
bin/cachebench: /lib/x86_64-linux-gnu/liburing.so.2: version `LIBURING_2.3' not found (required by /home/jm/CacheLib/opt/cachelib/bin/../lib/libfolly.so.0.58.0-dev)

but was able to resolve with

ls -l /usr/lib/x86_64-linux-gnu/liburing.so*
ls -l /usr/lib/liburing.so*
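For reference, one way such a stale-library mismatch is often resolved (an assumption on my part, mirroring the ln -sf workaround from the earlier comment, not necessarily what fixed it here) is to repoint the multiarch soname at the source-built liburing and refresh the loader cache:

```shell
# Hypothetical fix: make the multiarch path that folly links against
# resolve to the liburing 2.6 installed from source, then refresh the
# dynamic loader cache so the runtime picks it up.
sudo ln -sf /usr/lib/liburing.so.2.6 /usr/lib/x86_64-linux-gnu/liburing.so.2
sudo ln -sf /usr/lib/liburing.so.2.6 /usr/lib/x86_64-linux-gnu/liburing.so
sudo ldconfig
# Confirm which copies the loader now sees:
ldconfig -p | grep liburing
```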

now that cachebench works...

  6. download the trace files:
     sudo apt install awscli
     aws s3 cp --no-sign-request --recursive s3://cachelib-workload-sharing/pub/kvcache/202206/ ./
  7. add these to the config:
      "navyQDepth": 1,
      "navyEnableIoUring": true,
      "deviceEnableFDP": true,
  8. run cachebench with sudo bin/cachebench -json_test_config=test_configs/trace_replay/202206/config_kvcache.json -progress=600 -progress_stats_file=cachebench-progress.log
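To double-check that the flag actually took effect (the original report above shows `"navyConfig::enableFDP": "0"` when it did not), the Navy config that cachebench echoes at startup can be grepped:

```shell
# cachebench logs the resolved Navy config at startup; a value of "1"
# here confirms FDP is engaged, while "0" means the option was ignored.
sudo bin/cachebench --json_test_config=test_configs/trace_replay/202206/config_kvcache.json \
  -progress=600 2>&1 | grep 'navyConfig::enableFDP'
```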

@FletcherAtFADU

Hi, @jaesoo-fb

I have a few questions about testing Cachebench after enabling FDP.

1. WAF expectation for KVCache
When running KVCache with FDP enabled, it seems that the host allocates only one placement handle for BlockCache (no BigHash).
What advantage can FDP provide in this scenario compared to non-FDP?

image

2. [Error] We saw an IO error in the log, but the test didn't fail.
Looking at the code, it seems an IO error shouldn't occur here, but I don't quite understand it.

image

If the code below is working correctly, this IO error should never happen.
Am I missing anything?

image
image

3. [Fatal Error] The test failed due to an out-of-range issue.
It seems the size I issued isn't being counted properly, similar to the IO error issue.

It seems that the write does not increase region.getLastEntryEndOffset() by the size.
Could you please check whether there is an issue with the code in this part as well?
(I encountered this issue when running KVCache on an FDP-enabled device (RUH 1).)

image

@arungeorge83

arungeorge83 commented May 10, 2024

Hi @FletcherAtFADU ,

  1. WAF Expectation for KVCache

You might have selected a kvcache workload without BH enabled. Could you please check the "navyBigHashSizePct" in the config.json file of the workload selected.

  2. [Error] We saw the IO Error Issue in the log, but the test didn't fail.
  3. [Fatal Error] The test failed due to an out of range issue.

Could you check "nvmCacheSizeMB" against your device size? It looks like "nvmCacheSizeMB" might exceed the NVMe NS/partition you have chosen.

Could you attach the config.json and the initialization/run logs of cachebench? That would help analyze this better.

  • Arun

@FletcherAtFADU

Hi, @arungeorge83
Thank you for the information!
Here are the logs and a few queries about your answers.

  • You might have selected a kvcache workload without BH enabled. Could you please check the "navyBigHashSizePct" in the config.json file of the workload selected.

=> I've checked that the default setting of "navyBigHashSizePct" is 0.
In KVCache, to see a WAF difference between FDP and non-FDP, I think 'navyBigHashSizePct' must be non-zero. Is this correct?

  • Could you check the "nvmCacheSizeMB" with your device size. Looks like "nvmCacheSizeMB" might be going above the NVMe NS/partition that you have chosen.

=> nvmCacheSizeMB is set to 932000 (MB), but the device capacity is 1.25 TB (1250602278912 bytes), which is larger than nvmCacheSizeMB.

stats_240508_113152.log
output_240508_113152 (1).log

@arungeorge83

@FletcherAtFADU

I think 'navyBigHashSizePct' must absolutely not be zero! Is this correct?

Yes, it should be non-zero for BigHash enabled cases. (I see that you have used test_configs/ssd_perf/kvcache_l2_wc/ which does not have bighash enabled). Please use the production traces mentioned at https://cachelib.org/docs/Cache_Library_User_Guides/Cachebench_FB_HW_eval#running-cachebench-with-the-trace-workload for FDP experiments. Or you can use test_configs/ssd_perf/flat_kvcache_reg after changing the device from /dev/md0 to /dev/nvme0n1.

The IO errors look interesting.
Does non-FDP mode work fine?
Also, FDP mode uses io_uring passthru. Could you run some of the examples at https://github.com/axboe/liburing/tree/master/examples and see whether the io_uring path is fine on your device and system?

@FletcherAtFADU

@arungeorge83
We've tested the examples you suggested and haven't encountered any issues.
Initially, we set up our environment on CentOS, and we're currently cross-checking to see if there are any issues on Ubuntu.

Additionally, even when an IO Error occurs, the tests continue to run.
We are currently enabling BigHash in the KVCache and comparing FDP versus Non-FDP.
I'll check the results and get back to you if there are any problems.

@gaowayne

@arungeorge83 buddy, could you please share your CacheLib test config for FDP SSD?

@arungeorge83

@gaowayne Please find the sample FDP config used with kvcache production traces.
{
  "cache_config": {
    "cacheSizeMB": 20000,
    "cacheDir": "/root/cachelib_metadata-1",
    "allocFactor": 1.08,
    "maxAllocSize": 524288,
    "minAllocSize": 64,
    "navyReaderThreads": 72,
    "navyWriterThreads": 36,
    "nvmCachePaths": ["/dev/nvme0n1"],
    "nvmCacheSizeMB": 878700,
    "writeAmpDeviceList": ["nvme0n1"],
    "navyBigHashBucketSize": 4096,
    "navyBigHashSizePct": 4,
    "navySmallItemMaxSize": 640,
    "navySegmentedFifoSegmentRatio": [1.0],
    "navyHitsReinsertionThreshold": 1,
    "navyBlockSize": 4096,
    "deviceMaxWriteSize": 262144,
    "nvmAdmissionRetentionTimeThreshold": 7200,
    "navyParcelMemoryMB": 6048,
    "enableChainedItem": true,
    "htBucketPower": 29,
    "navyQDepth": 1,
    "navyEnableIoUring": true,
    "deviceEnableFDP": true,
    "moveOnSlabRelease": false,
    "poolRebalanceIntervalSec": 2,
    "poolResizeIntervalSec": 2,
    "rebalanceStrategy": "hits"
  },
  "test_config": {
    "opRatePerSec": 1000000,
    "opRateBurstSize": 200,
    "enableLookaside": false,
    "generator": "replay",
    "replayGeneratorConfig": {
      "ampFactor": 200
    },
    "repeatTraceReplay": true,
    "repeatOpCount": true,
    "onlySetIfMiss": false,
    "numOps": 100000000000,
    "numThreads": 10,
    "prepopulateCache": true,
    "traceFileNames": [
      "kvcache_traces_1.csv",
      "kvcache_traces_2.csv",
      "kvcache_traces_3.csv",
      "kvcache_traces_4.csv",
      "kvcache_traces_5.csv"
    ]
  }
}
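Since a missing brace or stray comma in a config like this only surfaces after cachebench starts up, it can be worth syntax-checking it first (the filename below is just an example):

```shell
# Validate the JSON before a long run; python's parser flags a missing
# brace or trailing comma immediately. The filename is illustrative.
python3 -m json.tool config_fdp_kvcache.json > /dev/null && echo "config OK"
```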

@FletcherAtFADU

@arungeorge83 @gaowayne
Thanks to your help, we achieved proper evaluation results after enabling FDP.
Using the KV Trace evaluation with arungeorge83's provided config worked perfectly.
image

There is still one thing I do not understand.
We hit a fatal error due to an out-of-range issue, which was resolved by reducing nvmCacheSizeMB.
However, we are using a device with a size of 1.25 TB (1250602278912 bytes)!
The default "nvmCacheSizeMB" of 932000 works well in non-FDP mode but causes errors in FDP mode,
while the new config with "nvmCacheSizeMB" set to 878700 works fine in FDP mode.

One more question:
If the device supports up to 8 RUs per NS, and multiple namespaces as well,
can we adjust the parameters to allocate more RUs per NS, or set things up to effectively demonstrate the benefits of using multiple namespaces?

@arungeorge83

@FletcherAtFADU It is great to know that you are able to reproduce the results.

Can we adjust the parameters to allocate more RUs per NS or set it up to effectively demonstrate the benefits of using multiple namespaces?

The current code does not support that, though a configurable RUH allocation mechanism is under consideration.

We faced a fatal error due to an out-of-range issue, which was resolved by reducing the nvmCacheSizeMB.

Interesting. We were able to test with the full capacity of the device.
Just curious: could this issue be related to the number of RGs and the RU size of the FDP device you are using?
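On the device side, the FDP capability and geometry can be checked from the host (these commands assume a reasonably recent nvme-cli; exact plugin flags may vary by version):

```shell
# CTRATT bit 19 in Identify Controller is the FDP support bit, so a
# quick grep rules out a controller that lacks FDP entirely:
sudo nvme id-ctrl /dev/nvme0 | grep -i ctratt
# Recent nvme-cli also exposes the FDP geometry (RG count, RUH count,
# RU size) via the FDP plugin; flag syntax may differ by version:
sudo nvme fdp configs /dev/nvme0 --endgrp-id=1
```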
