Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Display local ignored files feature. #9131

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

p0l0us
Copy link
Contributor

@p0l0us p0l0us commented Sep 25, 2023

Purpose

As a user I wish to see which local files are ignored by Syncthing in my receive-only folder.

A broader view of the matter will be clarified by following this draft PR:
#9130

Testing

Edit folder ignore patterns and make sure items appears in the folder details.
Other folders than receive-only should not be affected.

Screenshots

image
image

Documentation

syncthing/docs#824

@acolomb
Copy link
Member

acolomb commented Sep 26, 2023

Great, very useful addition! Why limit it to receive-only folders though? The information is definitely useful for any folder type except receive-encrypted I suppose.

@tomasz1986
Copy link
Contributor

This looks great! As @acolomb has already mentioned, is there any particular reason for limiting this to a specific folder type? This would be useful in all cases, I think.

I would like to suggest using a different icon though, as the exclamation mark indicates that something is wrong, while this isn't really the case here.

@bt90
Copy link
Contributor

bt90 commented Sep 26, 2023

fa-ban ?

@p0l0us
Copy link
Contributor Author

p0l0us commented Sep 26, 2023

Why limit it to receive-only folders though?

There is no specific reason to limit the feature for receive-only folders apart of more potential bugs and more testing.

Feature opened for send-only, send&receive and receive-only folders now.

@calmh
Copy link
Member

calmh commented Sep 28, 2023

Can we have a discussion about what this is supposed to do and what use case it fulfils? It looks to me like it implements a surprising definition of "locally ignored" and I'm not certain if I understand or misunderstand the purpose.

@p0l0us
Copy link
Contributor Author

p0l0us commented Sep 28, 2023

Can we have a discussion about what this is supposed to do and what use case it fulfils? It looks to me like it implements a surprising definition of "locally ignored" and I'm not certain if I understand or misunderstand the purpose.

In context of this PR it means "Locally existing files which are ignored".

Purpose:
This PR is supposed to expose number and list of files which physically exist on local filesystem and which are ignored by patterns.
Originally I aimed to receive-only folders, but I was asked to extend features for send folders as well if possible.

Why I need this feature ?
This PR is intermediate step to achieve #9139 which extends this branch/PR by "Remove Ignored" button allowing user to delete all ignored files from local filesystem without affecting global state of the file and other devices.

Ultimate target is summarized here: #9130
(just note the code of #9130 is PoC quality)

@calmh
Copy link
Member

calmh commented Sep 29, 2023

In context of this PR it means "Locally existing files which are ignored".

That's not really what you're doing here, though. You're basing this on files in the database with the ignored flag, and there are roughly two ways a file ends up in the database with an ignored flag:

  • The file was previously not ignored, so we've scanned and processed it and it's already in the database and announced to other devices. We can't undo that, so we update it with an ignore flag and send an update to that effect.
  • Some other device announces a file that matches our ignore patterns. We tag it with the ignored bit and send an update that we've seen it.

Notably not present in the above list are all the other, "normal" ignored files, so as designed this will return a small arbitrary subset of actually on-disk ignored files.

I can see the utility in being able to ask for which files are present and ignored, it's come up before. However, I think the proper way to do that is to walk the filesystem, running the ignore matcher, and noting which files match the patterns.

(It's not obvious to me that this needs to be part of Syncthing proper as opposed to a separate small utility, either. With the stated end goal of deleting all the files, which is also something I'm not sure we should do, it makes even more sense to me to have this be a separate utility. stfindignored --delete-all --force ~/some/folder)

@p0l0us
Copy link
Contributor Author

p0l0us commented Sep 29, 2023

That's not really what you're doing here, though. You're basing this on files in the database with the ignored flag,

See protocol.go and follow protocol.FlagLocalIgnoredExisting please. I mean I'm not using ignore flag, but own.

@calmh
Copy link
Member

calmh commented Sep 30, 2023

Yes, you've added a flag to indicate whether the ignored file exists on disk locally. This doesn't change any of my reasoning above, since that is about whether the file exists in the database to begin with, regardless of flags. (It's also incorrect and/or unnecessary since the "exists on disk" part is better figured out by looking on disk.)

@p0l0us
Copy link
Contributor Author

p0l0us commented Oct 6, 2023

Yes, you've added a flag to indicate whether the ignored file exists on disk locally. This doesn't change any of my reasoning above, since that is about whether the file exists in the database to begin with, regardless of flags. (It's also incorrect and/or unnecessary since the "exists on disk" part is better figured out by looking on disk.)

Ok, I understand point is that files added locally matching ignore patterns will be missing in my list. Then it makes sense to extend scan (walkAndHashFiles) related functions to add these file to local tree with appropriate flag(s) and without hashing to reduce performance impact.

The file was previously not ignored, so we've scanned and processed it and it's already in the database and announced to other devices. We can't undo that, so we update it with an ignore flag and send an update to that effect.

If the file is ignored, and added just locally, it should not be announced to other devices of course, but still can be listed in local database without being hashed. If such file would be removed from the filesystem later, it can be simply removed from local database since it is not published anywhere, instead marking it deleted or ignored. Does this make sense ?

I can see the utility in being able to ask for which files are present and ignored, it's come up before. However, I think the proper way to do that is to walk the filesystem, running the ignore matcher, and noting which files match the patterns.

(It's not obvious to me that this needs to be part of Syncthing proper as opposed to a separate small utility, either. With the stated end goal of deleting all the files, which is also something I'm not sure we should do, it makes even more sense to me to have this be a separate utility. stfindignored --delete-all --force ~/some/folder)

I'm trying to avoid external tool since it diverts from my target "selective downloads for receive-only folder". Also doesn't make sense to scan the filesystem once by syncthing and secondly by another tool. As "advanced" user I would like to see which files will be removed before I click it. At least it would be more safe when user enters the ignore patterns by hand and can easily make a typo or other stupid mistake.

My main question before I invest more time in this is:
Is there a real chance that solution described above will be accepted ?

Even ultimate target would be new type of folder "receiveonly-selective" adding automatic file removal according to patterns. I believe such feature, allowing comfortably download music or pictures from cloud to a device, will satisfy many users asking for selective sync.

btw. I'm sorry if I'm slower in understanding. I'm just getting familiar with the codebase and syncthing complexity. :-)

@acolomb
Copy link
Member

acolomb commented Oct 6, 2023

Is there a real chance that solution described above will be accepted ?

Thank you for this summary. I do think the feature would be extremely valuable and useful. Just have to agree with @calmh that a simple "scan the filesystem, build a list of names matching the configured ignores, in memory" would be a more realistic approach. The database can get quite big (1 - space) and busy (2 - I/O bandwith), and looking up items there probably gets slower (3 - speed) with more records stored. That's three rather sensitive areas where a user who explicitly stated that some (possibly large) portion of the folder tree should be ignored (thus, not processed at all).

Whether the list is built and kept in memory during regular scanning, or produced on demand when the API endpoint is requested, is a basic decision to make. Or maybe have both, with a threshold size: Cache the list during regular scans, but discard it again if it exceeds e.g. 10 MB. Then the API relies on the cache when available, otherwise triggers a new scan just for this purpose, without the size limit.

I do like the idea of having it integrated right there in the GUI, however in practice and with a growing number of ignored paths, a CLI tool will actually be more useful for piping to other processes (e.g. for removal). That could very well be a thin wrapper (or curl ... | jq ...) around the API endpoint though.

Even ultimate target would be new type of folder "receiveonly-selective" adding automatic file removal according to patterns. I believe such feature, allowing comfortably download music or pictures from cloud to a device, will satisfy many users asking for selective sync.

I see more problems than possible benefits with adding a new folder type. And in general, automatically deleting anything from within Syncthing, especially when it matches an ignore pattern, is a function I'd rather not see implemented. There's a large potential for catastrophic bugs here (think about our external globbing library for example - BOOM your data is gone after an update). If only made available via an explicit user interaction, it's still a giant footgun, but at least won't eat your data without a directly related, explicit user choice.

We could put a higher hurdle for accidental damage here by providing it as a separate command that is however easily set up with a trigger after a scan-finished event.

@calmh
Copy link
Member

calmh commented Oct 7, 2023

extend scan (walkAndHashFiles) related functions to add these file to local tree with appropriate flag(s) and without hashing to reduce performance impact.

I don't think this is a good idea. There are many reasons for ignoring file trees, including that scanning it may cause errors or be expensive (there might be millions of files there).

automatic file removal according to patterns

This has been suggested several times, and it is rejected based on philosophical reasons. We will not do it.

@p0l0us
Copy link
Contributor Author

p0l0us commented Oct 8, 2023

Whether the list is built and kept in memory during regular scanning, or produced on demand when the API endpoint is requested, is a basic decision to make. Or maybe have both, with a threshold size: Cache the list during regular scans, but discard it again if it exceeds e.g. 10 MB. Then the API relies on the cache when available, otherwise triggers a new scan just for this purpose, without the size limit.

I don't think this is a good idea. There are many reasons for ignoring file trees, including that scanning it may cause errors or be expensive (there might be millions of files there).

I think, user can decide himself whether he wishes to enable the feature accepting the performance impact or not. So another checkbox in folder settings or patterns tab should solve huge file trees performance concern.

Another option might be extending the ignore markup by new prefix telling that the ignored item should be scanned and listed for removal. This would add more flexibility to advanced users who wishes ignore some files but not remove all. Typically config files or thumbnails folder. And I should be still able to achieve my goal without affecting current behavior and performance.

There's a large potential for catastrophic bugs here (think about our external globbing library for example - BOOM your data is gone after an update)

This has been suggested several times, and it is rejected based on philosophical reasons. We will not do it.

Makes sense... Delete confirmation overlay should show all affected files so user will see and confirm files/folders to be removed.

@EmberLightVFX
Copy link

EmberLightVFX commented Mar 16, 2024

I think this PR together with #9139 and #9132 would give a closure to as good as all Selective Sync requests.

As a movie production company that is using Syncthing between all our artists, being able to quickly select what folders to sync and the posibility to remove the non-syncing files/folders that is located in the synced-folder would be a game-changer for us in production speed.

After trying to folow the discussion here with showing all non-syncing local files and the abbility to delete these, this is my thoughts.

What about when viewing all locally ignored files they would be shown in a tree-view like the last image in #9132 but with a delete-button next to each folder and item. That way the user have to manually select what to delete, so no auto-deletion going on, but you can also quickly delete everything within a branch of the tree-view. If you would delete the root branch you would delete all the locally ignored files.

@imsodin
Copy link
Member

imsodin commented Mar 22, 2024

@p0l0us What's your plan here? As pointed out currently you aren't doing what it sounds like you want to do: Track all ignored files that exist in the db. You'd have to scan them to do that. However as pointed out, that's definitely not desirable by default. And I personally doubt it's desirable in general: Besides what calmh said, it also adds complexity to track all this data just for a potential use of showing and deleting ignored files. This isn't information continuously and often used. It could be generated when needed directly from the filesystem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants