be more resilient to loss of backing device #41

Open
rickysarraf opened this issue Mar 13, 2020 · 7 comments

Comments

@rickysarraf

Consider this scenario:

Mar 13 20:51:26 priyasi systemd[1]: Unnecessary job for /dev/fuse was removed.
Mar 13 20:51:26 priyasi systemd[1]: Unmounting /home/rrs/.cache/catfs/chutzpah...
Mar 13 20:51:26 priyasi catfs[4583]: "/home/rrs/.cache/catfs/chutzpah" unmounted
Mar 13 20:51:26 priyasi catfs[4583]: Unmounted /home/rrs/.cache/catfs/chutzpah
Mar 13 20:51:26 priyasi systemd[3356]: home-rrs-.cache-catfs-chutzpah.mount: Succeeded.
Mar 13 20:51:26 priyasi catfs[4583]: Received TERM, attempting to unmount "/home/rrs/.cache/catfs/chutzpah"
Mar 13 20:51:26 priyasi systemd[1]: home-rrs-.cache-catfs-chutzpah.mount: Succeeded.
Mar 13 20:51:26 priyasi systemd[1]: Unmounted /home/rrs/.cache/catfs/chutzpah.
Mar 13 20:51:26 priyasi systemd[1]: Mounting /home/rrs/.cache/catfs/chutzpah...
Mar 13 20:51:26 priyasi systemd[1]: home-rrs-.cache-catfs-chutzpah.mount: Mount process finished, but there is no mount.
Mar 13 20:51:26 priyasi systemd[1]: home-rrs-.cache-catfs-chutzpah.mount: Failed with result 'protocol'.
Mar 13 20:51:26 priyasi systemd[1]: Failed to mount /home/rrs/.cache/catfs/chutzpah.
Mar 13 20:51:26 priyasi systemd[1]: mnt-chutzpah.automount: Got automount request for /mnt/chutzpah, triggered by 14940 (catfs)
Mar 13 20:51:26 priyasi systemd[1]: Mounting /mnt/chutzpah...
Mar 13 20:51:26 priyasi polkitd(authority=local)[788]: Unregistered Authentication Agent for unix-process:14923:643936 (system bus name :1.1946, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_IN.UTF-8) (disconnected from bus)
Mar 13 20:51:31 priyasi systemd[1]: mnt-chutzpah.mount: Mounting timed out. Terminating.
Mar 13 20:51:31 priyasi mount[14948]: read: Interrupted system call
Mar 13 20:51:31 priyasi systemd[3356]: mnt-chutzpah.mount: Succeeded.
Mar 13 20:51:31 priyasi systemd[1]: mnt-chutzpah.mount: Mount process exited, code=killed, status=15/TERM
Mar 13 20:51:31 priyasi systemd[1]: mnt-chutzpah.mount: Failed with result 'timeout'.
Mar 13 20:51:31 priyasi systemd[1]: Failed to mount /mnt/chutzpah.
Mar 13 20:51:31 priyasi catfs[14940]: Cannot mount: No such device (os error 19) stack backtrace:
Mar 13 20:51:31 priyasi catfs[14940]:    0:     0x555c859ecf35 - catfs::catfs::error::RError<E>::from::h895331c5ecbb2e25
Mar 13 20:51:31 priyasi catfs[14940]:                         at src/catfs/error.rs:52
Mar 13 20:51:31 priyasi catfs[14940]:    1:     0x555c85a17e45 - <catfs::catfs::error::RError<std::io::error::Error> as core::convert::From<std::io::error::Error>>::from::hb6598e8d987b0b29
Mar 13 20:51:31 priyasi catfs[14940]:                         at src/catfs/error.rs:115
Mar 13 20:51:31 priyasi catfs[14940]:    2:     0x555c859d7104 - catfs::catfs::CatFS::new::h6333f2023f3b932d
Mar 13 20:51:31 priyasi catfs[14940]:                         at src/catfs/mod.rs:114
Mar 13 20:51:31 priyasi catfs[14940]:    3:     0x555c85a5afec - catfs::main_internal::h301b7b64c624340c
Mar 13 20:51:31 priyasi catfs[14940]:                         at src/main.rs:237
Mar 13 20:51:31 priyasi catfs[14940]:    4:     0x555c85a580a0 - catfs::main::hfa37ddebbee1dd6d
Mar 13 20:51:31 priyasi catfs[14940]:                         at src/main.rs:40
Mar 13 20:51:31 priyasi catfs[14940]:    5:     0x555c85a5d8df - std::rt::lang_start::{{closure}}::hf3ef3448014deb0d
Mar 13 20:51:31 priyasi catfs[14940]:                         at /usr/src/rustc-1.40.0/src/libstd/rt.rs:61
Mar 13 20:51:31 priyasi catfs[14940]:    6:     0x555c85dea5a2 - _ZN3std9panicking3try7do_call17h405fa073712ab5d5E.llvm.17613216718872221368
Mar 13 20:51:31 priyasi catfs[14940]:    7:     0x555c85df8389 - __rust_maybe_catch_panic
Mar 13 20:51:31 priyasi catfs[14940]:    8:     0x555c85df0138 - std::rt::lang_start_internal::he63ceb3eba03dd55
Mar 13 20:51:31 priyasi catfs[14940]:    9:     0x555c85a5d8b8 - std::rt::lang_start::h80110ac54506f8b6
Mar 13 20:51:31 priyasi catfs[14940]:                         at /usr/src/rustc-1.40.0/src/libstd/rt.rs:61
Mar 13 20:51:31 priyasi catfs[14940]:   10:     0x555c85a5baa9 - main
Mar 13 20:51:31 priyasi catfs[14940]:   11:     0x7faacc3b3bba - __libc_start_main
Mar 13 20:51:31 priyasi catfs[14940]:   12:     0x555c859a61e9 - _start
Mar 13 20:51:31 priyasi catfs[14940]:   13:                0x0 - <unknown>
  • The backing device (a remote share accessible over sshfs) is not persistent
  • But the local cache is persistent

When the backing device is lost (network interruption, network change, roaming profile, etc.), the data should still be served transparently from the cache.

@kahing
Owner

kahing commented Mar 17, 2020

Currently catfs cannot work offline, as it always reaches out to the backing store for metadata.

@goncalopp

I also think this is an important feature for a caching filesystem.

@kahing
Is this functionality that catfs could potentially have, or is it against design goals?
Would you be willing to review PRs? How hard do you estimate the task to be?

@gaul
Collaborator

gaul commented May 4, 2020

If you just want to retry and sleep in a loop, like NFS does, this is easy to do.
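
For readers following along, here is a minimal sketch of what a retry-and-sleep loop could look like in Rust. It is not taken from catfs; the helper name `retry_with_sleep` and its parameters `max_attempts` and `delay` are hypothetical. Unlike an NFS hard mount, which keeps retrying indefinitely, this version gives up after a bounded number of attempts.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Hypothetical helper (not part of catfs): run `op` until it succeeds or
/// `max_attempts` tries have failed, sleeping `delay` between tries.
/// `op` always runs at least once, even if `max_attempts` is 0.
fn retry_with_sleep<T, F>(mut op: F, max_attempts: u32, delay: Duration) -> io::Result<T>
where
    F: FnMut() -> io::Result<T>,
{
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: hand the last error back to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            // Otherwise wait and try again; the backing store may return.
            Err(_) => {
                attempt += 1;
                thread::sleep(delay);
            }
        }
    }
}
```

A caller would wrap each backing-store operation in a closure, for example `retry_with_sleep(|| std::fs::metadata(&path), 10, Duration::from_secs(1))`. While the loop sleeps, the calling thread blocks, so any read waiting on it blocks too.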

@goncalopp

@gaul
Thanks gaul! Do you mean sleeping and blocking reads until the backing store is online again?
rickysarraf's idea seems to be that reads should still succeed when the backing store is absent, with directory listings fulfilled from the local cache. This would require catfs to be comfortable serving stale data without calling stat on the backing store.
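
To make the contrast concrete, here is a rough sketch of the "serve stale data" behaviour for a single file read. It only illustrates the idea and does not reflect how catfs is structured; `read_with_cache_fallback`, `Source`, and the two path arguments are hypothetical names. It treats every backing-store error as "offline", which is a simplification: a real implementation would probably need to distinguish a deleted file from an unreachable mount.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Where the returned bytes came from (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Source {
    Backing,
    StaleCache,
}

/// Hypothetical helper: prefer the backing store, fall back to the cache.
fn read_with_cache_fallback(backing: &Path, cache: &Path) -> io::Result<(Vec<u8>, Source)> {
    match fs::read(backing) {
        Ok(data) => {
            // Backing store reachable: refresh the cached copy and serve it.
            fs::write(cache, &data)?;
            Ok((data, Source::Backing))
        }
        // Simplification: any error is treated as "backing store offline".
        Err(backing_err) => match fs::read(cache) {
            // The cached copy may be out of date, so it is tagged as stale.
            Ok(data) => Ok((data, Source::StaleCache)),
            // Nothing cached either: report the original backing error.
            Err(_) => Err(backing_err),
        },
    }
}
```

Returning the `Source` alongside the data lets a caller decide whether stale content is acceptable. Directory listings would need the same treatment, which is where most of the extra work lies.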

@gaul
Collaborator

gaul commented May 5, 2020

Writing a disconnected filesystem is a lot of work! I recommend starting with a retry and sleep loop.

@gaul
Collaborator

gaul commented Jul 23, 2020

Offline-Filesystem exists but looks like it has bit-rotted.

@goncalopp

pcachefs does this brilliantly, FWIW

4 participants