Exclude __pycache__ directories from backups using CACHEDIR.TAG #85253

Closed
jstasiak mannequin opened this issue Jun 22, 2020 · 9 comments
Labels
stdlib (Python modules in the Lib dir), type-feature (A feature request or enhancement)

Comments

@jstasiak
Mannequin

jstasiak mannequin commented Jun 22, 2020

BPO 41081
Nosy @ericvsmith, @jstasiak
PRs
  • gh-85253: Exclude __pycache__ directories from backups using CACHEDIR.TAG #21060
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2020-06-22.19:39:42.956>
    labels = ['type-feature', 'library', '3.10']
    title = 'Exclude __pycache__ directories from backups using CACHEDIR.TAG'
    updated_at = <Date 2020-06-22.20:54:25.799>
    user = 'https://github.com/jstasiak'

    bugs.python.org fields:

    activity = <Date 2020-06-22.20:54:25.799>
    actor = 'eric.smith'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2020-06-22.19:39:42.956>
    creator = 'jstasiak'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 41081
    keywords = ['patch']
    message_count = 2.0
    messages = ['372109', '372113']
    nosy_count = 2.0
    nosy_names = ['eric.smith', 'jstasiak']
    pr_nums = ['21060']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue41081'
    versions = ['Python 3.10']

    @jstasiak
    Mannequin Author

    jstasiak mannequin commented Jun 22, 2020

    It'd be nice if __pycache__ directories didn't pollute backups. Granted, one can add the __pycache__ directory to the exclusion list of their backup tool of choice, but those lists are ever growing and maybe it'd be good to help the tools and the users.

    There's a Cache Directory Tagging Specification[1], which some backup tools like Borg, restic, GNU Tar and attic support out of the box (well, with a switch). Other tools (like rsync, Bacula, rdiff-backup and I imagine others) can be made to honor it with a generic exclude-directories-with-this-file-present option, though only partially: just the existence of the tag file is checked, not its content.
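To illustrate the difference between the two kinds of support mentioned above, here is a minimal sketch of how a backup tool might skip tagged directories. It is not taken from any of the named tools; `is_tagged`, `files_to_back_up`, and the `check_content` flag are hypothetical names. With `check_content=True` the first bytes of the tag file are compared against the signature from the specification; with `check_content=False` only the presence of the file matters, as with a generic exclude-if-present option.

```python
import os
import tempfile
from pathlib import Path

# Header required by the Cache Directory Tagging Specification.
SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"
TAG_NAME = "CACHEDIR.TAG"


def is_tagged(directory: Path, check_content: bool = True) -> bool:
    """Return True if `directory` holds a cache directory tag.

    Hypothetical helper: with check_content=False only the presence of
    the file is tested, which is what a generic exclude-if-present
    option does.
    """
    tag = directory / TAG_NAME
    if not tag.is_file():
        return False
    if not check_content:
        return True
    with tag.open("rb") as f:
        return f.read(len(SIGNATURE)) == SIGNATURE


def files_to_back_up(root: Path, check_content: bool = True) -> list[Path]:
    """Walk `root` and collect files, skipping tagged directories entirely."""
    kept = []
    for dirpath, dirnames, filenames in os.walk(root):
        current = Path(dirpath)
        if is_tagged(current, check_content):
            dirnames.clear()  # do not descend into the cache directory
            continue
        kept.extend(current / name for name in filenames)
    return kept
```

A directory that contains a file named CACHEDIR.TAG with unrelated content would be kept by the strict check and dropped by the existence-only check, which is the "partial" behavior described in the paragraph above.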

    I wasn't sure what to select in Components, so I went with Library.
    [1] https://bford.info/cachedir/

    @jstasiak (mannequin) added the 3.10 (only security fixes), stdlib (Python modules in the Lib dir), and type-feature (A feature request or enhancement) labels on Jun 22, 2020
    @ericvsmith
    Member

    To spoil it for other readers: the linked page says to create a file named CACHEDIR.TAG with a specific first line.
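As a rough sketch of what that would involve, the snippet below writes such a tag file into a directory. The signature line is the one given by the specification; the function name `write_cachedir_tag` and the wording of the comment lines are assumptions for illustration, and this is not the code from the linked PR.

```python
import tempfile
from pathlib import Path

# First line required by the Cache Directory Tagging Specification.
SIGNATURE = "Signature: 8a477f597d28d172789f06886806bc55"


def write_cachedir_tag(directory: Path) -> Path:
    """Create CACHEDIR.TAG in `directory` unless it already exists.

    Hypothetical helper; the comment lines after the signature are
    free-form text and their wording here is only an example.
    """
    directory.mkdir(parents=True, exist_ok=True)
    tag = directory / "CACHEDIR.TAG"
    try:
        # Mode "x" fails if the file exists, so an existing tag is left alone.
        with tag.open("x", encoding="ascii", newline="\n") as f:
            f.write(SIGNATURE + "\n")
            f.write("# This file is a cache directory tag.\n")
            f.write("# For details see https://bford.info/cachedir/\n")
    except FileExistsError:
        pass
    return tag
```

Calling `write_cachedir_tag(Path("pkg/__pycache__"))` a second time is a no-op, so the cost after the first write is a single failed exclusive open.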

    @brettcannon
    Member

    I'm not sure what I think about this. I get the desire, but the file isn't a standard either.

    @merwok
    Member

    merwok commented Dec 19, 2023

    What are the downsides of doing this?

    @jstasiak
    Contributor

    • Extra code to run at import-time, including IO -> in principle slower imports
    • Extra code to maintain for the benefit of a set of users that may be considered too small to be worth it (people running backup software that supports this marker)
    • Extra noise in the filesystem

    It's not a standard in an RFC kind of sense but it's somewhat recognizable in the backup software world – I know that Borg, rustic and restic support it.

    @RazerM
    Contributor

    RazerM commented Jan 16, 2024

    Some other tools which follow this already are mypy, pytest, pipx, cargo (Rust).

    I think it's a good idea, things like this are a real pain point when using backup software even if the impact of __pycache__ isn't particularly egregious.

    @AA-Turner
    Member

    It doesn't seem to hurt, beyond the obvious "code to maintain" burden, and seems to have adoption as a de facto standard, so I'd be +1. @jstasiak would you be able to benchmark if the difference of writing an additional file is noticeable for import time? If it has a measurable impact beyond the noise, we may consider an opt-out config setting for those who don't care about writing the file and want imports as fast as possible.

    A
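One crude way to get a first estimate, sketched here under heavy assumptions: time the creation of many cache directories with and without the extra tag file. This measures only raw filesystem cost on a temporary directory, not real import time, and the helper name `time_cache_dir_creation`, the placeholder file size, and the repeat count are all made up for illustration.

```python
import tempfile
import time
from pathlib import Path

TAG = b"Signature: 8a477f597d28d172789f06886806bc55\n"


def time_cache_dir_creation(write_tag: bool, repeats: int = 200) -> float:
    """Return mean seconds to create one cache dir holding one file.

    When write_tag is True a CACHEDIR.TAG file is written as well, so the
    difference between the two modes approximates the cost of the extra file.
    """
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        start = time.perf_counter()
        for i in range(repeats):
            cache = root / f"pkg{i}" / "__pycache__"
            cache.mkdir(parents=True)
            # Stand-in for a compiled module; 1 KiB is an arbitrary size.
            (cache / "mod.pyc").write_bytes(b"\0" * 1024)
            if write_tag:
                (cache / "CACHEDIR.TAG").write_bytes(TAG)
        elapsed = time.perf_counter() - start
    return elapsed / repeats


if __name__ == "__main__":
    without_tag = time_cache_dir_creation(write_tag=False)
    with_tag = time_cache_dir_creation(write_tag=True)
    print(f"without tag: {without_tag * 1e6:.1f} us per directory")
    print(f"with tag:    {with_tag * 1e6:.1f} us per directory")
```

Results will vary a lot with filesystem and caching, so several runs and a comparison against the noise level would be needed before drawing conclusions. The tag would also only be written when the directory is first created, not on every import.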

    @jstasiak
    Contributor

    Yes, I'll try to run some tests.

    @gpshead removed the 3.10 (only security fixes) label on Feb 24, 2024
    @brettcannon
    Member

    So I brought this up on the core dev Discord server and got either objections or lukewarm reactions. The biggest sticking point is that this is in no way a standard, and us doing this would make it a de facto one.

    As such, I'm afraid I need to close this until some standards body picks it up.

    @brettcannon closed this as not planned (won't fix, can't repro, duplicate, stale) on Feb 27, 2024