Enhance bandersnatch mirror to optionally delete packages detected as no longer found #1686

Open
89ao opened this issue Mar 18, 2024 · 2 comments
Labels
bug Something isn't working enhancement New feature or request help wanted Extra attention is needed

Comments

@89ao
Contributor

89ao commented Mar 18, 2024

Take the package tohoku-tus-iot-automation as an example. The logs show that it was synced from the official source on March 6th. By March 7th, bandersnatch had detected that upstream had removed it (the package contained malicious information-collecting backdoors and trojans), but our local mirror did not delete it. On March 18th, our operations team found the leftover package while troubleshooting and removed it manually with "bandersnatch delete tohoku-tus-iot-automation".

2024-03-06 20:24:02,571 bandersnatch.package: INFO Fetching metadata for package: tohoku-tus-iot-automation (serial 22195024)
2024-03-06 20:24:02,932 bandersnatch.mirror: INFO Storing index page(s): tohoku-tus-iot-automation - in /repo/web/simple/tohoku-tus-iot-automation
2024-03-06 21:28:19,422 bandersnatch.package: INFO Fetching metadata for package: tohoku-tus-iot-automation (serial 22196068)
2024-03-06 21:28:20,140 bandersnatch.mirror: INFO Storing index page(s): tohoku-tus-iot-automation - in /repo/web/simple/tohoku-tus-iot-automation
2024-03-06 21:49:18,704 bandersnatch.package: INFO Fetching metadata for package: tohoku-tus-iot-automation (serial 22196135)
2024-03-06 21:49:19,244 bandersnatch.mirror: INFO Storing index page(s): tohoku-tus-iot-automation - in /repo/web/simple/tohoku-tus-iot-automation
2024-03-06 22:10:16,922 bandersnatch.package: INFO Fetching metadata for package: tohoku-tus-iot-automation (serial 22196395)
2024-03-06 22:10:17,324 bandersnatch.mirror: INFO Storing index page(s): tohoku-tus-iot-automation - in /repo/web/simple/tohoku-tus-iot-automation
2024-03-07 00:18:20,878 bandersnatch.package: INFO Fetching metadata for package: tohoku-tus-iot-automation (serial 22198726)
2024-03-07 00:18:21,125 bandersnatch.package: INFO tohoku-tus-iot-automation no longer exists on PyPI
2024-03-18 16:23:14,613 bandersnatch: INFO Deleting path: /repo/web/json/tohoku-tus-iot-automation
2024-03-18 16:23:14,614 bandersnatch: INFO Removing file: /repo/web/json/tohoku-tus-iot-automation
2024-03-18 16:23:14,614 bandersnatch: INFO Deleting path: /repo/web/pypi/tohoku-tus-iot-automation
2024-03-18 16:23:14,614 bandersnatch: INFO Forcing removal of files under /repo/web/pypi/tohoku-tus-iot-automation
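
Until something like automatic deletion exists, one way to spot leftovers is to scan the mirror log for packages reported as gone that have no later deletion entry. The following is a minimal sketch, not part of bandersnatch: the function name `stale_packages` is hypothetical, and the two patterns assume the log format shown in the excerpt above.

```python
import re

# Both patterns are modelled only on the log lines quoted above (an assumption
# about the format, not a documented bandersnatch contract).
GONE = re.compile(r"INFO (?P<name>\S+) no longer exists on PyPI")
DELETED = re.compile(r"INFO Deleting path: \S*/(?P<name>[^/\s]+)$")


def stale_packages(lines):
    """Return packages reported gone upstream with no later deletion entry."""
    pending = set()
    for line in lines:
        line = line.rstrip()
        if m := GONE.search(line):
            # Upstream removal detected: remember the package name.
            pending.add(m.group("name"))
        elif m := DELETED.search(line):
            # A later deletion of that package clears it from the pending set.
            pending.discard(m.group("name"))
    return sorted(pending)
```

Run against the log up to March 7th, this would flag tohoku-tus-iot-automation; once the March 18th deletion lines are included, the result is empty.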

My question is: since bandersnatch can already detect that upstream has removed a package (see https://github.com/pypa/bandersnatch/blob/main/src/bandersnatch/mirror.py#L125), why was automatic deletion, or a switch to enable it, not considered?
Are there any other considerations or scenarios that prevent us from doing so?

@89ao 89ao changed the title +gpt translation: "Even though upstream deleted the package and bandersnatch detected it, it has still not been deleted locally" Even though the upstream has removed the package and bandersnatch has detected it, it has not been removed locally yet. Mar 18, 2024
@cooperlees
Contributor

cooperlees commented Mar 18, 2024

This is a good question, since bandersnatch already detects the removal here, and a nice proposed addition.

I would accept a new config-parameter-driven deletion there (maybe delete_missing_packages) that defaults to false in default.conf and then uses the metadata to delete all of the package blobs and simple API files.
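
As a rough illustration of that proposal (a sketch under stated assumptions, not bandersnatch's actual code): `delete_missing_packages` is only the name floated above and does not exist as an option, `remove_missing_package` is a hypothetical helper, and the `simple`, `json`, and `pypi` paths are taken from the log lines in this issue. Removing the package blobs via the metadata is deliberately left out.

```python
import configparser
import shutil
from pathlib import Path


def remove_missing_package(config, name):
    """Delete a package's index and metadata paths if the opt-in flag is set."""
    # Hypothetical option: off by default, as suggested for default.conf.
    if not config.getboolean("mirror", "delete_missing_packages", fallback=False):
        return []
    web = Path(config.get("mirror", "directory")) / "web"
    removed = []
    # Sub-directories as seen in the logs: /repo/web/{simple,json,pypi}/<name>
    for sub in ("simple", "json", "pypi"):
        target = web / sub / name
        if target.is_symlink() or target.is_file():
            target.unlink()
        elif target.is_dir():
            shutil.rmtree(target)
        else:
            continue  # nothing stored under this sub-directory
        removed.append(target)
    return removed
```

Keeping the flag false by default means existing mirrors would behave exactly as before unless an operator opts in.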

Thanks!

@cooperlees cooperlees added bug Something isn't working enhancement New feature or request help wanted Extra attention is needed labels Mar 18, 2024
@cooperlees cooperlees changed the title Even though the upstream has removed the package and bandersnatch has detected it, it has not been removed locally yet. Enhance bandersnatch mirror to optionally delete packages detected as no longer found Mar 18, 2024
@89ao
Contributor Author

89ao commented Mar 19, 2024

Thanks a lot, @cooperlees! Looking forward to seeing the feature implemented as soon as possible.
