
NSFS: migration timestamp is updated unexpectedly #7856

Open
piste-jp-ibm opened this issue Feb 29, 2024 · 2 comments

@piste-jp-ibm
Contributor

Environment info

  • NooBaa Version: noobaa-core-5.16.0-20240218.el8.x86_64
  • Platform: RPM

Actual behavior

  • The migration timestamp (logs/timestamp.migrate) is updated even when the migration script is not invoked

Expected behavior

  • The migration timestamp (logs/timestamp.migrate) should be updated only when the migration script is invoked

Steps to reproduce

# for i in `seq 5`; do aws s3api put-object --bucket testbucket01 --key 10M${i} --body ~/testdata/dummy.M.10M --storage-class GLACIER; done
# sleep 900
# echo -n "Mig: "; cat timestamp.migrate; echo; echo -n "Rst: "; cat timestamp.restore; echo
Mig: 2024-02-29T05:39:11.318Z
Rst: 2024-02-29T05:39:31.279Z
# /usr/local/noobaa-core/node/bin/node /usr/local/noobaa-core/src/cmd/manage_nsfs.js glacier migrate
# echo -n "Mig: "; cat timestamp.migrate; echo; echo -n "Rst: "; cat timestamp.restore; echo
Mig: 2024-02-29T05:56:05.263Z
Rst: 2024-02-29T05:39:31.279Z
# aws s3api restore-object --bucket testbucket01 --key 10M1 --restore-request '{"Days":1}'
# sleep 900
# /usr/local/noobaa-core/node/bin/node /usr/local/noobaa-core/src/cmd/manage_nsfs.js glacier restore
# echo -n "Mig: "; cat timestamp.migrate; echo; echo -n "Rst: "; cat timestamp.restore; echo
Mig: 2024-02-29T05:56:05.263Z
Rst: 2024-02-29T06:13:11.750Z
# /usr/local/noobaa-core/node/bin/node /usr/local/noobaa-core/src/cmd/manage_nsfs.js glacier migrate
# echo -n "Mig: "; cat timestamp.migrate; echo; echo -n "Rst: "; cat timestamp.restore; echo
Mig: 2024-02-29T06:13:42.856Z
Rst: 2024-02-29T06:13:11.750Z

More information - Screenshots / Logs / Other output

Here is the log of the 2nd manage_nsfs.js glacier migrate run:

# /usr/local/noobaa-core/node/bin/node /usr/local/noobaa-core/src/cmd/manage_nsfs.js glacier migrate; echo $?
load_nsfs_nc_config.setting config.NSFS_NC_CONF_DIR /gpfs/tapecloud/nsfs
nsfs: config_dir_path=/gpfs/tapecloud/nsfs config.json= {
  ENDPOINT_FORKS: 1,
  NOOBAA_LOG_LEVEL: 'nsfs',
  ENDPOINT_PORT: 80,
  ENDPOINT_SSL_PORT: 443,
  EP_METRICS_SERVER_PORT: 7690,
  ALLOW_HTTP: true,
  NSFS_CALCULATE_MD5: false,
  UV_THREADPOOL_SIZE: 256,
  GPFS_DL_PATH: '/usr/lpp/mmfs/lib/libgpfs.so',
  NSFS_BUF_POOL_MEM_LIMIT: 104857600,
  NSFS_BUF_SIZE: 16777216,
  NSFS_OPEN_READ_MODE: 'rd',
  NSFS_CHECK_BUCKET_BOUNDARIES: false,
  NSFS_TRIGGER_FSYNC: true,
  DENY_UPLOAD_TO_STORAGE_CLASS_STANDARD: true,
  NSFS_GLACIER_ENABLED: true,
  NSFS_GLACIER_LOGS_ENABLED: true,
  NSFS_GLACIER_BACKEND: 'TAPECLOUD',
  NSFS_GLACIER_TAPECLOUD_BIN_DIR: '/opt/ibm/tapecloud/bin',
  NSFS_GLACIER_LOGS_DIR: '/gpfs/tapecloud/nsfs/logs',
  NSFS_GLACIER_LOGS_MAX_INTERVAL: 900000,
  NSFS_GLACIER_MIGRATE_INTERVAL: 600000,
  NSFS_GLACIER_RESTORE_INTERVAL: 600000,
  NSFS_GLACIER_EXPIRY_INTERVAL: 43200000
}
2024-02-29 15:13:41.556098 [PID-1985829/TID-1985829] FS::GPFS GPFS_DL_PATH=/usr/lpp/mmfs/lib/libgpfs.so
2024-02-29 15:13:41.556264 [PID-1985829/TID-1985829] FS::GPFS found GPFS lib file GPFS_DL_PATH=/usr/lpp/mmfs/lib/libgpfs.so
2024-02-29 15:13:41.564207 [PID-1985829/TID-1985829] [L1] FS::set_debug_level 5
2024-02-29 15:13:42.710803 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Stat _path=/gpfs/tapecloud/nsfs
2024-02-29 15:13:42.711094 [PID-1985829/TID-1985862] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.711197 [PID-1985829/TID-1985862] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.711504 [PID-1985829/TID-1985862] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs  took: 0.181459 ms
(node:1985829) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023.

Please migrate your code to use AWS SDK for JavaScript (v3).
For more information, check the migration guide at https://a.co/7PzMCcy
(Use `node --trace-warnings ...` to show where the warning was created)
Feb-29 15:13:42.715 [/1985829]   [LOG] CONSOLE:: read_rand_seed: reading 32 bytes from /dev/random ...
2024-02-29 15:13:42.717541 [PID-1985829/TID-1985829] [L1] FS::Stat::OnOK: _path=/gpfs/tapecloud/nsfs _stat_res.st_ino=12546 _stat_res.st_size=4096
2024-02-29 15:13:42.718051 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Stat _path=/gpfs/tapecloud/nsfs/buckets
2024-02-29 15:13:42.718652 [PID-1985829/TID-1985865] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/buckets _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.718778 [PID-1985829/TID-1985865] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/buckets _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.718983 [PID-1985829/TID-1985865] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/buckets  took: 0.083815 ms
Feb-29 15:13:42.719 [/1985829]   [LOG] CONSOLE:: read_rand_seed: got 32 bytes from /dev/random, total 32 ...
Feb-29 15:13:42.719 [/1985829]   [LOG] CONSOLE:: read_rand_seed: closing fd ...
2024-02-29 15:13:42.720681 [PID-1985829/TID-1985829] [L1] FS::Stat::OnOK: _path=/gpfs/tapecloud/nsfs/buckets _stat_res.st_ino=36115 _stat_res.st_size=4096
2024-02-29 15:13:42.720902 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Stat _path=/gpfs/tapecloud/nsfs/accounts
2024-02-29 15:13:42.728656 [PID-1985829/TID-1985866] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/accounts _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.728776 [PID-1985829/TID-1985866] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/accounts _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.728996 [PID-1985829/TID-1985866] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/accounts  took: 0.099093 ms
Feb-29 15:13:42.729 [/1985829]   [LOG] CONSOLE:: init_rand_seed: seeding with 32 bytes
2024-02-29 15:13:42.729916 [PID-1985829/TID-1985829] [L1] FS::Stat::OnOK: _path=/gpfs/tapecloud/nsfs/accounts _stat_res.st_ino=36116 _stat_res.st_size=4096
2024-02-29 15:13:42.730146 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Stat _path=/gpfs/tapecloud/nsfs/access_keys
2024-02-29 15:13:42.739651 [PID-1985829/TID-1985870] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/access_keys _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.739843 [PID-1985829/TID-1985870] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/access_keys _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.740063 [PID-1985829/TID-1985870] [L1] FS::FSWorker::Execute: Stat _path=/gpfs/tapecloud/nsfs/access_keys  took: 0.097466 ms
2024-02-29 15:13:42.740578 [PID-1985829/TID-1985829] [L1] FS::Stat::OnOK: _path=/gpfs/tapecloud/nsfs/access_keys _stat_res.st_ino=36117 _stat_res.st_size=4096
2024-02-29 15:13:42.741196 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: FileOpen _path=/gpfs/tapecloud/nsfs/logs/cluster.lock _flags=577 _mode=438
2024-02-29 15:13:42.744339 [PID-1985829/TID-1985872] [L1] FS::FSWorker::Execute: FileOpen _path=/gpfs/tapecloud/nsfs/logs/cluster.lock _flags=577 _mode=438 _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.744512 [PID-1985829/TID-1985872] [L1] FS::FSWorker::Execute: FileOpen _path=/gpfs/tapecloud/nsfs/logs/cluster.lock _flags=577 _mode=438 _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.745684 [PID-1985829/TID-1985872] [L1] FS::FSWorker::Execute: FileOpen _path=/gpfs/tapecloud/nsfs/logs/cluster.lock _flags=577 _mode=438  took: 0.9709 ms
Feb-29 15:13:42.758 [/1985829]   [LOG] CONSOLE:: generate_entropy: entropy_avail 2048
Feb-29 15:13:42.759 [/1985829]   [LOG] CONSOLE:: init_rand_seed: done
2024-02-29 15:13:42.759490 [PID-1985829/TID-1985829] [L1] FS::FileOpen::OnOK: _path=/gpfs/tapecloud/nsfs/logs/cluster.lock
2024-02-29 15:13:42.759734 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: FileFlock _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock
2024-02-29 15:13:42.766276 [PID-1985829/TID-1985874] [L1] FS::FSWorker::Execute: FileFlock _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.766407 [PID-1985829/TID-1985874] [L1] FS::FSWorker::Execute: FileFlock _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.766540 [PID-1985829/TID-1985874] [L1] FS::FSWorker::Execute: FileFlock _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock  took: 0.012023 ms
2024-02-29 15:13:42.766668 [PID-1985829/TID-1985829] [L1] FS::FSWorker::OnOK: undefined FileFlock _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock
2024-02-29 15:13:42.853525 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Readfile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate
2024-02-29 15:13:42.853715 [PID-1985829/TID-1985875] [L1] FS::FSWorker::Execute: Readfile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.853841 [PID-1985829/TID-1985875] [L1] FS::FSWorker::Execute: Readfile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.854127 [PID-1985829/TID-1985875] [L1] FS::FSWorker::Execute: Readfile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate  took: 0.148095 ms
2024-02-29 15:13:42.854268 [PID-1985829/TID-1985829] [L1] FS::FSWorker::OnOK: Readfile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate
2024-02-29 15:13:42.855549 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Readdir _path=/gpfs/tapecloud/nsfs/logs
2024-02-29 15:13:42.855685 [PID-1985829/TID-1985876] [L1] FS::FSWorker::Execute: Readdir _path=/gpfs/tapecloud/nsfs/logs _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.855831 [PID-1985829/TID-1985876] [L1] FS::FSWorker::Execute: Readdir _path=/gpfs/tapecloud/nsfs/logs _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.856146 [PID-1985829/TID-1985876] [L1] FS::FSWorker::Execute: Readdir _path=/gpfs/tapecloud/nsfs/logs  took: 0.181358 ms
2024-02-29 15:13:42.856316 [PID-1985829/TID-1985829] [L1] FS::FSWorker::OnOK: Readdir _path=/gpfs/tapecloud/nsfs/logs
2024-02-29 15:13:42.857341 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: Writefile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _len=24 _mode=438
2024-02-29 15:13:42.857459 [PID-1985829/TID-1985877] [L1] FS::FSWorker::Execute: Writefile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _len=24 _mode=438 _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.857601 [PID-1985829/TID-1985877] [L1] FS::FSWorker::Execute: Writefile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _len=24 _mode=438 _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.858322 [PID-1985829/TID-1985877] [L1] FS::FSWorker::Execute: Writefile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _len=24 _mode=438  took: 0.479036 ms
2024-02-29 15:13:42.858510 [PID-1985829/TID-1985829] [L1] FS::FSWorker::OnOK: undefined Writefile _path=/gpfs/tapecloud/nsfs/logs/timestamp.migrate _len=24 _mode=438
2024-02-29 15:13:42.858686 [PID-1985829/TID-1985829] [L1] FS::FSWorker::Begin: FileClose _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _wrap->_fd=19
2024-02-29 15:13:42.858786 [PID-1985829/TID-1985878] [L1] FS::FSWorker::Execute: FileClose _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _wrap->_fd=19 _uid=0 _gid=0 _backend=
2024-02-29 15:13:42.858986 [PID-1985829/TID-1985878] [L1] FS::FSWorker::Execute: FileClose _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _wrap->_fd=19 _uid=0 _gid=0 geteuid()=0 getegid()=0 getuid()=0 getgid()=0
2024-02-29 15:13:42.859212 [PID-1985829/TID-1985878] [L1] FS::FSWorker::Execute: FileClose _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _wrap->_fd=19  took: 0.017905 ms
2024-02-29 15:13:42.859308 [PID-1985829/TID-1985829] [L1] FS::FSWorker::OnOK: undefined FileClose _wrap->_path=/gpfs/tapecloud/nsfs/logs/cluster.lock _wrap->_fd=19
@tangledbytes
Member

Hi, I thought about this a little more. Should we really skip updating the timestamp when there was nothing to process? I don't think there is any harm in doing so, but I don't yet see the gain either.

Pros - Sounds "right".
Cons - Migrate gets triggered every minute, so despite trying to batch uploads that happened within the last ...MIGRATE_INTERVAL, we would only batch uploads that happen between the last cron run and the current one.

WDYT @piste-jp-ibm @guymguym

@piste-jp-ibm
Contributor Author

In my view, it would be good to update the timestamp only in the cases below:

  1. when manage_nsfs invokes the migration script
  2. when the first entry arrives in the migration file under the log directory
  • in other words, when noobaa creates the migration log file under the log directory
  • another option is to keep this value in another place and decide go or no-go from the two values, timestamp.migrate and this one

I think this implementation would ensure that all files start migrating within 11 minutes at most.
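The two-value go/no-go idea above could be sketched as follows (again, hypothetical names, not the actual manage_nsfs code): the timestamp advances only when the migration script really runs, and the arrival time of the first pending entry acts as a second trigger so entries are never left waiting longer than the interval.

```javascript
// Hypothetical sketch of the proposed gating: track when the first pending
// entry appeared, and advance the "last run" timestamp only on a real run.
// state = { last_run_ms, first_entry_ms, entries }
function run_migration_if_due(state, interval_ms, now = Date.now()) {
    const { last_run_ms, first_entry_ms, entries } = state;

    // Nothing pending: leave last_run_ms untouched so an empty invocation
    // does not defer the next real batch.
    if (entries.length === 0) return { ran: false, state };

    // Go/no-go decided from two values: time since the last real run and
    // time since the first pending entry arrived.
    const due = (now - last_run_ms >= interval_ms) ||
                (now - first_entry_ms >= interval_ms);
    if (!due) return { ran: false, state };

    // ... invoke the migration script on `entries` here ...
    return {
        ran: true,
        state: { last_run_ms: now, first_entry_ms: null, entries: [] },
    };
}
```

With this shape, timestamp.migrate would only ever reflect an actual invocation of the migration script, matching the expected behavior stated above.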
