Feature: create an "hollow" filesystem #131

Open
jbd opened this issue Feb 3, 2023 · 2 comments

@jbd

jbd commented Feb 3, 2023

Hello,

I'm wondering how difficult it would be to build a "hollow" filesystem image using mkdwarfs. By "hollow" I mean an image that keeps all the metadata but replaces every file with an empty sparse one. That would make it possible to take a "light" image of a filesystem and explore it with classic tools such as du and ncdu (knowing, of course, that the files are now sparse).

I've glanced through the source code and it looks like this would be possible by leveraging the code's modularity (writing a new file_scanner::impl?). Maybe I'm missing something that would make the implementation more complex than it looks? I would like your advice before I start hacking on it.

Thank you!

@mhx
Owner

mhx commented Apr 1, 2023

Hi @jbd & sorry for the late response.

It unfortunately wouldn't be quite as simple as swapping out the file_scanner. There are a few more components downstream that would try to access the files and then produce actual file system data.

There are many possible approaches, and tbh I'm not entirely sure what the best way would be. I think the simplest way to achieve this would need an additional abstraction around mmap. Currently, mmap instances are created all over the place by path name and then used through the mmif interface. You'd probably need a factory (+ interface) to pass around and create either "real" file-backed mmap instances or "fake" anonymous, zero-filled ones. That way, you'd still create a fully working file system, but the contents of all files would be null bytes. Deduplication and compression would ensure the actual "data" stored is small. This doesn't require any changes to the metadata or to the logic accessing files. Also, apart from the contents of the files, the file system would behave exactly as the "real" one.
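A minimal sketch of the factory idea described above, under stated assumptions: every name here (`byte_view`, `posix_mapping`, `view_factory`, `real_view_factory`, `hollow_view_factory`, `count_nonzero_bytes`) is hypothetical and is not taken from the dwarfs code base, `byte_view` is only a stand-in for the role `mmif` plays, and the code is POSIX-only. The "hollow" factory takes only the size from the real file and hands out an anonymous mapping, which the kernel zero-fills.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <memory>
#include <stdexcept>

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical read-only view of a file's bytes (a stand-in, not dwarfs' mmif).
class byte_view {
 public:
  virtual ~byte_view() = default;
  virtual const char* data() const = 0;
  virtual std::size_t size() const = 0;
};

// Owns one POSIX mapping; used for both file-backed and anonymous views.
class posix_mapping : public byte_view {
 public:
  // fd < 0 requests an anonymous mapping, which the kernel zero-fills.
  posix_mapping(int fd, std::size_t size) : size_(size) {
    if (size_ == 0) return;  // mmap rejects zero-length mappings
    int flags = MAP_PRIVATE | (fd < 0 ? MAP_ANONYMOUS : 0);
    void* p = ::mmap(nullptr, size_, PROT_READ, flags, fd, 0);
    if (p == MAP_FAILED) throw std::runtime_error("mmap failed");
    addr_ = p;
  }
  ~posix_mapping() override {
    if (addr_) ::munmap(addr_, size_);
  }
  posix_mapping(const posix_mapping&) = delete;
  posix_mapping& operator=(const posix_mapping&) = delete;

  const char* data() const override { return static_cast<const char*>(addr_); }
  std::size_t size() const override { return size_; }

 private:
  void* addr_ = nullptr;
  std::size_t size_;
};

// Hypothetical factory interface that would be passed around instead of
// creating mappings directly from path names.
class view_factory {
 public:
  virtual ~view_factory() = default;
  virtual std::unique_ptr<byte_view> open(const std::filesystem::path& p) const = 0;
};

// "Real" variant: maps the file's actual contents.
class real_view_factory : public view_factory {
 public:
  std::unique_ptr<byte_view> open(const std::filesystem::path& p) const override {
    auto size = static_cast<std::size_t>(std::filesystem::file_size(p));
    int fd = ::open(p.c_str(), O_RDONLY);
    if (fd < 0) throw std::runtime_error("open failed");
    std::unique_ptr<byte_view> v;
    try {
      v = std::make_unique<posix_mapping>(fd, size);
    } catch (...) {
      ::close(fd);
      throw;
    }
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    return v;
  }
};

// "Hollow" variant: same size as the real file, but the contents are never read.
class hollow_view_factory : public view_factory {
 public:
  std::unique_ptr<byte_view> open(const std::filesystem::path& p) const override {
    auto size = static_cast<std::size_t>(std::filesystem::file_size(p));
    return std::make_unique<posix_mapping>(-1, size);
  }
};

// Example consumer: it only sees byte_view, so it is identical for both factories.
std::size_t count_nonzero_bytes(const view_factory& f, const std::filesystem::path& p) {
  auto v = f.open(p);
  assert(v->size() == 0 || v->data() != nullptr);
  std::size_t n = 0;
  for (std::size_t i = 0; i < v->size(); ++i) n += (v->data()[i] != 0);
  return n;
}
```

Code that consumes files, like `count_nonzero_bytes` here, does not need to know which factory it was given. With `hollow_view_factory` it sees the right sizes and nothing but null bytes, which is the property the deduplication and compression argument relies on.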

@jbd
Author

jbd commented Apr 2, 2023

Hi @mhx, no worries at all about the late response. Thank you for taking the time to answer.

Your explanation is quite clear. The suggestion of adding an abstraction around mmap and using fake zero-filled mmap instances sounds elegant and quite simple. I may try this first to validate and play with the "hollow" concept I have in mind.

I'll keep you posted if I ever get to that stage with my rusty C++ ;) Feel free to close this issue.

Thank you again.
